On 20/02/14 11:30, Christian Maeder wrote:
> Hi,
>
> I've got some difficulties parsing "large" xml files (> 100MB).
> A plain SAX parser, as provided by hexpat, is fine. However,
> constructing a tree consumes too much memory on a 32-bit machine.
>
> see http://trac.informatik.uni-bremen.de:8080/hets/ticket/1248
>
> I suspect that sharing strings when constructing trees might greatly
> reduce memory requirements. What are suitable libraries for string pools?
>
> Before trying to implement something myself, I'd like to ask who else
> has tried to process large xml files (and met similar memory problems)?
>
> I have not yet investigated xml-conduit and hxt for our purpose. (These
> look scary.)
>
> In fact, I've basically used the content trees from "The (simple) xml
> package" and switching to another tree type is no fun, in particular if
> this gains not much.
>
> Thanks Christian
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

HXT will not work for you: you will run out of memory on files of around 30MB.
I don't know about xml-conduit; I'd love to hear how it goes if you try it.

-- 
Mateusz K.
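
On the string-pool idea raised in the quoted message, here is a minimal sketch of what interning could look like, assuming element and attribute names are plain Strings and that a Data.Map from the containers package is an acceptable pool. The names Pool, intern and internAll are hypothetical and do not come from hexpat, hxt, xml-conduit or the xml package. Whether this saves enough memory on a 100MB document is untested here, and the map itself costs memory too.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Hypothetical pool type: maps each distinct string to the single
-- copy that every later occurrence should share.
type Pool = Map.Map String String

-- Return the pooled copy of a string, adding the string to the pool
-- the first time it is seen.
intern :: Pool -> String -> (Pool, String)
intern pool s = case Map.lookup s pool of
  Just shared -> (pool, shared)            -- reuse the stored copy
  Nothing     -> (Map.insert s s pool, s)  -- first occurrence: store it

-- Intern every string in a list, threading the pool through.
internAll :: Pool -> [String] -> (Pool, [String])
internAll = mapAccumL intern

main :: IO ()
main = do
  let (pool, names) = internAll Map.empty ["item", "name", "item", "item"]
  print (Map.size pool)  -- number of distinct strings kept in the pool
  print names            -- same contents, repeats now point at one copy
```

In a SAX-style handler the pool would be threaded through the tree-building fold, so that each tag name passes through intern before being stored in a node.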