Charlie Clark wrote on 19.01.24 at 15:00:
On 18 Jan 2024, at 18:10, Charlie Clark wrote:
Apart from the fact that this currently doesn't work, I imagine that both
Elements and their children would happily be passed to the write, which could
lead to an almighty mess. Getting this to work properly, possibly rewritten for
async to avoid the awfully awful (yield) hack could be a nice addition to the
documentation.
Thinking about this again, I think a pull parser is probably the way to go, as I
really don't want or need to create elements. It's probably fine if I just make
the changes to what's coming through and stream the text straight back into
another file. I'll give that a go.
If you want to avoid creating element objects altogether, and maybe don't
even need a full (sub-)tree structure to get all the relevant information, I
suggest you try the low-level, SAX-like target parser interface.
https://lxml.de/parsing.html#the-target-parser-interface
It's quite efficient and usable for locally constrained XML
transformations, e.g. filtering elements or attributes.
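As a rough illustration of that kind of filter, here is a minimal sketch of a
parser target that writes the incoming events straight back out as text and
skips one element name together with its subtree. The names DropElementTarget
and drop_elements are made up for this example. It assumes no namespaces (tags
would otherwise arrive in "{uri}name" form) and ignores comments and
processing instructions. The fallback import is only there so that the sketch
also runs without lxml installed; the standard library parser accepts the same
target protocol.

```python
import io
from xml.sax.saxutils import escape, quoteattr

try:
    from lxml import etree
except ImportError:  # stdlib parser accepts the same target protocol
    import xml.etree.ElementTree as etree


class DropElementTarget:
    """Hypothetical target: re-serialise events, skipping one tag and its subtree."""

    def __init__(self, out, drop_tag):
        self.out = out
        self.drop_tag = drop_tag
        self.skip_depth = 0  # > 0 while inside a dropped subtree

    def start(self, tag, attrib):
        if self.skip_depth or tag == self.drop_tag:
            self.skip_depth += 1
            return
        attrs = "".join(f" {k}={quoteattr(v)}" for k, v in attrib.items())
        self.out.write(f"<{tag}{attrs}>")

    def end(self, tag):
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.write(f"</{tag}>")

    def data(self, text):
        # Text may arrive in several pieces; each piece is escaped and written.
        if not self.skip_depth:
            self.out.write(escape(text))

    def close(self):
        # Whatever close() returns becomes the result of the parse call.
        return self.out.getvalue()


def drop_elements(xml_text, drop_tag):
    """Parse xml_text and return it as text without any drop_tag elements."""
    parser = etree.XMLParser(target=DropElementTarget(io.StringIO(), drop_tag))
    return etree.fromstring(xml_text, parser)


if __name__ == "__main__":
    doc = '<root><a x="1">keep &amp; this</a><secret><b>gone</b></secret><c/>tail</root>'
    print(drop_elements(doc, "secret"))
```

No element objects are created here; the output could just as well be an open
file instead of a StringIO. Note that an empty element such as <c/> comes back
as <c></c>, because the target only sees start and end events.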
And you can still parse input chunk by chunk, if you need that:
https://lxml.de/parsing.html#the-feed-parser-interface
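A small sketch of how the two can be combined: the parser is given a target
and then fed chunk by chunk, so neither the whole input nor a tree has to be
held in memory. TextCollector and collect_text are hypothetical names for this
example, and the target assumes that the element of interest is not nested
inside itself. The fallback import is only there so that the sketch also runs
without lxml installed.

```python
try:
    from lxml import etree
except ImportError:  # stdlib parser offers the same feed()/close() and target protocol
    import xml.etree.ElementTree as etree


class TextCollector:
    """Hypothetical target: collect the text content of every `tag` element."""

    def __init__(self, tag):
        self.tag = tag
        self.inside = False
        self.parts = []
        self.found = []

    def start(self, tag, attrib):
        if tag == self.tag:
            self.inside = True
            self.parts = []

    def end(self, tag):
        if tag == self.tag:
            self.found.append("".join(self.parts))
            self.inside = False

    def data(self, text):
        # Text can be split across calls, especially at chunk boundaries.
        if self.inside:
            self.parts.append(text)

    def close(self):
        return self.found


def collect_text(chunks, tag):
    """Feed an iterable of text chunks to the parser, return the collected texts."""
    parser = etree.XMLParser(target=TextCollector(tag))
    for chunk in chunks:
        parser.feed(chunk)
    return parser.close()  # returns the result of the target's close()


if __name__ == "__main__":
    chunks = ["<lib><book><ti", "tle>One</title></book><book><title>T", "wo</title></book></lib>"]
    print(collect_text(chunks, "title"))
```

In real use the chunks would typically come from reading a file in fixed-size
blocks rather than from a list.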
Stefan