On 1/04/25 03:33, Jens Tröger via lxml - The Python XML Toolkit wrote:
> 1) What actually is that current limitation on the size of nodes imposed by
> libxml2?

https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-parser.html#xmlCtxtSetOptions

- text nodes, tags, comments, PI, CDATA: 10MB
- names, system literals, pubid literals: 50KB
- nesting depth of elements: 256
- nesting depth of entities: 20

If XML_PARSE_HUGE is set, these limits are raised to 1GB, 10MB, 2048, and 40, respectively.

> 2) Assuming there is some flexibility with libxml2, what options does lxml
> offer to deal with huge text nodes?

https://lxml.de/parsing.html#parser-options

> huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)
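
For illustration, here is a minimal sketch of passing that option to a parser. The helper name parse_with_huge_tree and the 11,000,000-byte test payload are made up for this example; only XMLParser(huge_tree=True) comes from the lxml documentation quoted above. Since the option disables security restrictions, it is probably best reserved for input you trust.

```python
from lxml import etree


def parse_with_huge_tree(data: bytes) -> etree._Element:
    """Parse XML with libxml2's default size and depth limits relaxed."""
    # huge_tree=True is the documented lxml parser option; it disables
    # security restrictions, so use it only for input you trust.
    parser = etree.XMLParser(huge_tree=True)
    return etree.fromstring(data, parser)


# Hypothetical test document: a single text node of 11,000,000 bytes,
# just above the default 10MB limit for text nodes.
payload = b"<root>" + b"a" * 11_000_000 + b"</root>"
root = parse_with_huge_tree(payload)
print(len(root.text))
```

Whether the default parser rejects this particular payload may depend on the libxml2 version and on how the input is fed to it, so the sketch only exercises the huge_tree path.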