Thanks for the analysis :)
I just skimmed over the table of contents and this caught my eye.
On 14/7/16 at 01:58, monty wrote:
Thanks for the link.
In-place parsing is a non-starter because it means storing the entire input as
a string in memory, so you could only parse files that fit in Pharo's address
space. The multi-gigabyte OpenStreetMap docs the article mentions would be
unparsable, even with SAX, in a 32-bit VM.
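
For contrast, here is a rough sketch of the streaming alternative, written in
Python only because its standard library ships an incremental SAX reader. It is
not Pharo's XMLParser API; ElementCounter and count_elements are made-up names
for illustration. The parser is fed one chunk at a time, so the whole input
never has to sit in memory as a single string:

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags and keeps no input text."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(stream, chunk_size=64 * 1024):
    """Feed the parser chunk by chunk so memory use stays bounded."""
    handler = ElementCounter()
    parser = xml.sax.make_parser()  # Expat-backed incremental reader
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    parser.close()
    return handler.counts


# Tiny in-memory stand-in for a large file; a real input would be open(path, "rb").
sample = io.BytesIO(b"<osm><node id='1'/><node id='2'/><way/></osm>")
print(count_elements(sample, chunk_size=8))
```

An in-place parser cannot work this way, because its tokens point back into the
original buffer, which therefore has to stay fully resident.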
Using linked lists to store child nodes is common (LibXML2 and Xerces do it) and
provides constant-time insertion and sibling access, but arrays/vectors
(Arrays/OrderedCollections) are more cache-friendly and faster for sequential
access, and in Pharo are almost always the correct choice.
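
To make the trade-off concrete, here is a minimal sketch of the two child-node
layouts, again in Python for illustration only. LinkedNode and ArrayNode are
hypothetical names, not classes from LibXML2, Xerces, or XMLParser, and Python
will not show the cache effects; the point is only the shape of each operation:

```python
class LinkedNode:
    """Children chained through first_child / next_sibling pointers."""

    def __init__(self, name):
        self.name = name
        self.first_child = None
        self.last_child = None
        self.next_sibling = None

    def append(self, child):
        # Constant time: only the tail pointers are touched.
        if self.last_child is None:
            self.first_child = child
        else:
            self.last_child.next_sibling = child
        self.last_child = child

    def insert_after(self, sibling, child):
        # Constant time, given a reference to the sibling.
        child.next_sibling = sibling.next_sibling
        sibling.next_sibling = child
        if self.last_child is sibling:
            self.last_child = child

    def children(self):
        # Sequential access follows one pointer per child.
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


class ArrayNode:
    """Children stored in one contiguous, growable array."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def append(self, child):
        self.children.append(child)  # amortized constant time

    def insert_after(self, sibling, child):
        # Linear time: find the sibling, then shift the tail of the array.
        self.children.insert(self.children.index(sibling) + 1, child)


linked_root = LinkedNode("root")
a, c = LinkedNode("a"), LinkedNode("c")
linked_root.append(a)
linked_root.append(c)
linked_root.insert_after(a, LinkedNode("b"))
print([n.name for n in linked_root.children()])  # ['a', 'b', 'c']

array_root = ArrayNode("root")
x, z = ArrayNode("a"), ArrayNode("c")
array_root.append(x)
array_root.append(z)
array_root.insert_after(x, ArrayNode("b"))
print([n.name for n in array_root.children])  # ['a', 'b', 'c']
```

Since parsing mostly appends children in document order, the linked list's
cheap mid-list insertion is rarely exercised, which is why the array layout
tends to win in practice.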
There is always the option of an FFI-based parser, but it shouldn't be a hybrid
like Python's minidom (FFI Expat with a Python DOM implementation). Something
like that already exists in Smalltalk/X (FFI Expat with a Smalltalk DOM), and it
was slower than a St/X port of XMLParser in my tests (I assume due to the FFI
overhead), so it's probably not worth it. But a non-hybrid parser with
everything (including the DOM) done in C should definitely be faster.
Sent: Wednesday, July 13, 2016 at 10:27 AM
From: stepharo <[email protected]>
To: "Pharo Development List" <[email protected]>
Subject: [Pharo-dev] tricks for XML parsing.
Hi guys
these free books may be interesting for you
http://aosabook.org/
http://aosabook.org/en/posa/parsing-xml-at-the-speed-of-light.html
stef