On 1/04/25 03:33, Jens Tröger via lxml - The Python XML Toolkit wrote:
> 1) What actually is that current limitation on the size of nodes imposed by
> libxml2?
https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-parser.html#xmlCtxtSetOptions
By default, libxml2 enforces these limits:
- text nodes, tags, comments, PIs, CDATA sections: 10 MB
- names, system literals, pubid literals: 50 KB
- nesting depth of elements: 256
- nesting depth of entities: 20
If XML_PARSE_HUGE is set, these are raised to 1 GB, 10 MB, 2048, and 40,
respectively.
> 2) Assuming there is some flexibility with libxml2, what options does lxml
> offer to deal with huge text nodes?
https://lxml.de/parsing.html#parser-options
> huge_tree - disable security restrictions and support very deep trees
> and very long text content (only affects libxml2 2.7+)
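
A minimal sketch of how the huge_tree option might be used, assuming lxml is
installed. The helper name parse_huge and the 11 MB sample payload are made up
for illustration; only etree.XMLParser(huge_tree=True) and etree.fromstring
come from lxml itself.

```python
from lxml import etree


def parse_huge(xml_bytes: bytes) -> etree._Element:
    """Parse XML with libxml2's default size and depth limits relaxed.

    parse_huge is a hypothetical helper name, not part of lxml.
    """
    # huge_tree=True is the lxml parser option quoted above.
    parser = etree.XMLParser(huge_tree=True)
    return etree.fromstring(xml_bytes, parser)


# Illustrative input: a single text node of about 11 MB,
# which is larger than the default 10 MB text node limit.
payload = b"x" * (11 * 1024 * 1024)
root = parse_huge(b"<root>" + payload + b"</root>")
print(len(root.text))
```

Since the option disables security restrictions, it is probably best reserved
for input from a trusted source.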