Re: [Haskell-cafe] Haskell XML Parsers

2010-05-06 Thread Malcolm Wallace
> I have been looking at using XML for a little program I have been
> writing. The file I am currently trying to load is about 9MB, and I
> have now tried to use HaXml and HXT. Without any of my own code, just
> a simple call to the basic parsers, they both use a huge amount of
> memory. HXT is the worst, at about 7GB and climbing. HaXml uses 1.3GB.


For the archives, the user took the suggestion of switching to HaXml's  
lazy parser, which solved the memory issue.


Regards,
Malcolm

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell XML Parsers

2010-05-05 Thread Neil Mitchell
Hi,

You might want to take a look at TagSoup
(http://community.haskell.org/~ndm/tagsoup) - it parses XML/HTML
lazily returning a stream of tags. It doesn't do nesting, but it does
have good memory usage.
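
A minimal sketch of that approach, assuming the tagsoup package is
installed. The file name data.xml and the helper countOpenTags are
hypothetical and not from the thread. Because parseTags yields a flat
list of tags rather than a tree, a single pass such as counting can
consume the list as it is produced.

```haskell
import Text.HTML.TagSoup (isTagOpen, parseTags)

-- Count the opening tags in an XML/HTML string.
-- parseTags produces a flat list of tags (no nesting), so this is a
-- single pass over the tag stream.
countOpenTags :: String -> Int
countOpenTags = length . filter isTagOpen . parseTags

main :: IO ()
main = do
  -- "data.xml" is a placeholder file name (assumption).
  xml <- readFile "data.xml"
  print (countOpenTags xml)
```

Any processing that needs the element hierarchy would have to track
open and close tags itself, since TagSoup does not build it.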

Thanks, Neil

On Fri, Apr 30, 2010 at 11:35 AM, R Senington <sc06...@leeds.ac.uk> wrote:
> Dear all,
>
> I have been looking at using XML for a little program I have been
> writing. The file I am currently trying to load is about 9MB, and I
> have now tried to use HaXml and HXT. Without any of my own code, just
> a simple call to the basic parsers, they both use a huge amount of
> memory. HXT is the worst, at about 7GB and climbing. HaXml uses 1.3GB.
>
> The code I am using is
> HXT
> xml <- readFile file_name_here; k <- runX (parseXmlDocument True) xml; print k
>
> and for HaXml
> x <- readFile file_name_here
> let (Document _ _ e _) = xmlParse t x
> let t = myFilter $ CElem e
> print $ length t
>
> I have seen on previous posts to the cafe that other people have run
> into this problem with HXT. Is this a general problem with XML in
> Haskell (I know that XML parsing is a slow and bulky process but this
> seems excessive)? Is there a known solution? Does anyone have any
> advice?
>
> Cheers
>
> RS


Re: [Haskell-cafe] Haskell XML Parsers

2010-05-05 Thread Gregory Collins
R Senington <sc06...@leeds.ac.uk> writes:

> Dear all,
>
> I have been looking at using XML for a little program I have been
> writing. The file I am currently trying to load is about 9MB, and I
> have now tried to use HaXml and HXT. Without any of my own code, just
> a simple call to the basic parsers, they both use a huge amount of
> memory. HXT is the worst, at about 7GB and climbing. HaXml uses 1.3GB.

If your needs are reasonably basic, you could consider trying:

  http://hackage.haskell.org/package/hexpat

which is an FFI binding to expat.
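
A minimal sketch of basic use, assuming a recent hexpat release in
which Text.XML.Expat.SAX exports parse and defaultParseOptions (older
releases may name these differently). The file name data.xml and the
helper countElements are hypothetical and not from the thread. The SAX
interface returns a lazy list of events, so a single-pass fold does not
need the whole document tree in memory.

```haskell
import qualified Data.ByteString.Lazy as L
import Text.XML.Expat.SAX (SAXEvent (..), defaultParseOptions, parse)

-- Count the elements in a document by counting StartElement events.
-- The type annotation picks String for both tag names and text.
countElements :: L.ByteString -> Int
countElements bytes = length [() | StartElement _ _ <- events]
  where
    events :: [SAXEvent String String]
    events = parse defaultParseOptions bytes

main :: IO ()
main = do
  -- "data.xml" is a placeholder file name (assumption).
  bytes <- L.readFile "data.xml"
  print (countElements bytes)
```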

G
-- 
Gregory Collins g...@gregorycollins.net


[Haskell-cafe] Haskell XML Parsers

2010-04-30 Thread R Senington

Dear all,

I have been looking at using XML for a little program I have been
writing. The file I am currently trying to load is about 9MB, and I have
now tried to use HaXml and HXT. Without any of my own code, just a
simple call to the basic parsers, they both use a huge amount of memory.

HXT is the worst, at about 7GB and climbing. HaXml uses 1.3GB.

The code I am using is
HXT
xml <- readFile file_name_here; k <- runX (parseXmlDocument True) xml; print k

and for HaXml
x <- readFile file_name_here
let (Document _ _ e _) = xmlParse t x
let t = myFilter $ CElem e
print $ length t


I have seen on previous posts to the cafe that other people have run
into this problem with HXT. Is this a general problem with XML in
Haskell (I know that XML parsing is a slow and bulky process but this
seems excessive)? Is there a known solution? Does anyone have any advice?


Cheers

RS


Re: [Haskell-cafe] Haskell XML Parsers

2010-04-30 Thread Malcolm Wallace
> I have been looking at using XML for a little program I have been
> writing. The file I am currently trying to load is about 9MB, and I
> have now tried to use HaXml and HXT. Without any of my own code, just
> a simple call to the basic parsers, they both use a huge amount of
> memory.
>
> HXT is the worst, at about 7GB and climbing. HaXml uses 1.3GB.


Are you using Text.XML.HaXml.ParseLazy, or Text.XML.HaXml.Parse?  The  
lazy version should show much better space usage, provided your  
subsequent usage of the document is roughly a single-pass traversal.
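
A minimal sketch of the lazy variant, assuming a HaXml release (1.20 or
later) in which Content carries a position and CElem therefore takes a
second argument. The element name "item", the file name data.xml, and
the helper countItems are hypothetical and not from the thread. The
only change from the strict version is the module that xmlParse is
imported from.

```haskell
import Text.XML.HaXml.Combinators (deep, tag)
import Text.XML.HaXml.ParseLazy (xmlParse)
import Text.XML.HaXml.Posn (noPos)
import Text.XML.HaXml.Types (Content (..), Document (..))

-- Count the topmost <item> elements, traversing the document once.
-- The first argument is only used in parse error messages.
countItems :: FilePath -> String -> Int
countItems name src =
  let Document _ _ root _ = xmlParse name src
   in length (deep (tag "item") (CElem root noPos))

main :: IO ()
main = do
  -- "data.xml" is a placeholder file name (assumption).
  src <- readFile "data.xml"
  print (countItems "data.xml" src)
```

Holding on to the root element for a second traversal would keep the
whole tree alive and is likely to lose the space benefit.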


Regards,
Malcolm
