"Matthew Pocock" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On Thursday 24 January 2008, Albert Y. C. Lai wrote:
>> Matthew Pocock wrote:
>> > I've been using hxt to process xml files. Now that my files are getting
>> > a bit bigger (30m) I'm finding that hxt uses inordinate amounts of
>> > memory. I have 8g on my box, and it's running out. As far as I can
>> > tell, this memory is getting used up while parsing the text, rather
>> > than in any downstream processing by xpickle.
>> >
>> > Is this a known issue?
>>
>> Yes, hxt calls parsec, which is not incremental.
>>
>> haxml offers the choice of non-incremental parsers and incremental
>> parsers. The incremental parsers offer finer control (and therefore also
>> require finer control).
>
> I've got a load of code using xpickle, which taken together is quite an
> investment in hxt. Moving to haxml may not be very practical, as I'd have
> to find some equivalent of xpickle for haxml and port thousands of lines
> of code over. Is there likely to be a low-cost way of convincing hxt to
> be incremental that would get me out of this mess?
>
> Matthew
I don't think so. Even if you replace parsec, HXT is itself not
incremental: it stores the whole XML document in memory as a tree, and
the tree is not memory efficient.

Still, I am a bit surprised that you can't parse 30m with 8 gig of
memory. This was discussed here before, and I think someone benchmarked
HXT as using roughly 50 bytes of memory per byte of input, i.e. HXT
would then be using about 1.5 gig of memory for your 30m file.

Rene.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
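For what it's worth, the 1.5 gig figure follows directly from the rough 50x expansion factor. A back-of-envelope sketch (the 50 bytes/byte number is the informal benchmark mentioned above, not something measured here):

```haskell
-- Sanity check of the memory estimate quoted in the thread.
-- The 50x expansion factor is an assumption taken from the discussion.

inputBytes :: Integer
inputBytes = 30 * 1024 * 1024      -- a 30m input document

bytesPerInputByte :: Integer
bytesPerInputByte = 50             -- rough HXT tree overhead, per the thread

main :: IO ()
main = do
  let residency = inputBytes * bytesPerInputByte
  putStrLn $ "estimated residency: "
          ++ show (residency `div` (1024 * 1024)) ++ " MB"
  -- prints "estimated residency: 1500 MB", i.e. about 1.5 gig
```

So the baseline tree alone should fit in 8 gig with room to spare; running out of memory suggests something beyond the parsed tree (parsec's non-incremental buffering, or retained intermediate structures) is holding on to space.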