Thank you, everyone, for your advice. I found it useful and think that I am
part-way to a solution using clojure.data.xml/source-seq, as suggested by
dannue.
I'll post what I have done so far in the hope it might help someone else.
Comments on style are welcome.
*Solution*:
Given the following XML
Good question. Every lib that came to mind when I saw
clojure.data.xml/parse's tree of Elements {:tag _, :attrs _, :content _}
only works on zippers which apparently sit in memory.
One option is to use `clojure.data.xml/source-seq` to get back a lazy
sequence of Events {:type _, :name _, :attrs
On general Java principles, you can "stream" a large XML file with either
SAX or StAX and pluck what you like from it without wasting memory on the
rest. If the file is a long series of small sections that could be
examined separately, you might use SAX to partition the file and then
subject e
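A minimal sketch of the StAX idea mentioned in this reply, using only the JDK's `javax.xml.stream` API. The class name `StaxPluck`, the method `eachText`, and the element name used in the usage note are hypothetical and chosen only for illustration. It assumes the elements of interest contain plain text with no child elements. It is not the poster's code, just one way the "pluck what you like" approach could look:

```java
import java.io.Reader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPluck {
    /**
     * Streams through the XML in `in` and passes the text of every element
     * whose local name equals `tag` to `sink`. Only the current parse event
     * is held at a time, so the whole document is never built in memory.
     * Assumption: matching elements are text-only (getElementText throws
     * if it meets a child element).
     */
    static void eachText(Reader in, String tag, Consumer<String> sink) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(tag)) {
                        // Reads up to the matching end tag and returns the text.
                        sink.accept(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers are not forced to handle a checked exception.
            throw new IllegalStateException("could not parse XML stream", e);
        }
    }
}
```

For a large file one would pass a buffered file reader and a `sink` that processes each value immediately, for example `StaxPluck.eachText(reader, "item", System.out::println)`, rather than collecting everything into a list. From Clojure the same classes are reachable through Java interop.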
As far as I know, using zippers like that will need the whole XML data
structure to be in memory. data.xml returns fast because it's lazy (uses
pull parsing). Until you start traversing down the structure, it won't
parse more of it. data.xml should also be fully streaming, so it shouldn't
requir
Hi all,
I'm attempting to parse a large (500MB) XML file. Specifically, I am trying
to extract various parts using XPath. I've been using the examples presented
here: http://clojure-doc.org/articles/tutorials/parsing_xml_with_zippers.html
and all was going well when tested against small files, however no