On 05/05/2015 11:41, "Mario Kröplin" <linkr...@github.com> wrote:
Recently, I compared DOM parsers on an XML file of 100 MB:
15.8 s tango.text.xml (SiegeLord/Tango-D2)
13.4 s ae.utils.xml (CyberShadow/ae)
8.5 s xml.etree (Python)
Either the Tango DOM parser is slow compared to the Tango pull parser,
or the D2 port ruined the performance.

FWIW, I did some tests a couple of years back with
https://launchpad.net/d2-xml on 20-odd-megabyte files and found it
faster than Tango.
Unfortunately that would need some work to test now, as xmlp is
abandoned and wouldn't build last time I tried it :-(
I also had some success with https://github.com/opticron/kxml, though it
had some issues with chuffy entity decoding performance.
Also, profiling showed a lot of time spent in the GC, and the recent
improvements in that area might have changed things by now.
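
For anyone wanting to reproduce the xml.etree figure quoted above, here is a
rough sketch of how such a timing might be taken. It is not the script used for
the original numbers; the helper names time_dom_parse and write_sample and the
synthetic file layout are my own assumptions, and it only measures building the
full tree with the standard-library xml.etree.ElementTree.

```python
import time
import xml.etree.ElementTree as ET


def write_sample(path, items):
    """Write a synthetic XML file with `items` child elements.

    Hypothetical test data; the original 100 MB file is not available.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write("<root>\n")
        for i in range(items):
            # Include an entity so entity decoding is exercised as well.
            f.write(f'  <item id="{i}">value &amp; {i}</item>\n')
        f.write("</root>\n")


def time_dom_parse(path, repeats=3):
    """Return the best wall-clock time in seconds to build a full tree."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        tree = ET.parse(path)  # builds the whole document tree in memory
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del tree  # drop the tree before the next run
    return best
```

Calling write_sample("sample.xml", 2_000_000) and then
time_dom_parse("sample.xml") gives a file of very roughly 100 MB and one
number to set against the D parsers. Taking the best of a few runs reduces
noise from disk caching, but the result still depends heavily on the shape of
the document.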