[lxml] Re: Elements without textual content overlap

2025-09-17 Thread Stefan Behnel via lxml - The Python XML Toolkit
Schimon Jehudah wrote on 09.09.25 at 13:16: On Tue, 9 Sep 2025 08:50:41 +0200 Stefan Behnel via lxml - The Python XML Toolkit wrote: xml_data_bytes = memoryview(newdom) xml_data_str = str(xml_data_bytes, 'UTF-8') Did you mean to write. xml_data_bytes = memoryv

[lxml] Re: Elements without textual content overlap

2025-09-08 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Schimon Jehudah wrote on 27.08.25 at 09:19: Function is at. https://git.xmpp-it.net/sch/Rivista/src/branch/main/rivista/parser/xslt.py Is this parsed as HTML? With which options? Yes. I suppose so. So, this is your Python code running the transformation: def transform(filepa

[lxml] Re: Elements without textual content overlap

2025-08-25 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, sorry for the late response. Schimon Jehudah via lxml - The Python XML Toolkit wrote on 24.08.25 at 12:06: I think that I have found the cause of the issue, or, at least, now I know how to trigger the issue and how to define it. Issue - Elements without text content would overlap

[lxml] Re: lxml 6.0.0: XMLSchemaParseError: Invalid argument

2025-07-06 Thread Stefan Behnel via lxml - The Python XML Toolkit
Stefan Behnel wrote on 06.07.25 at 08:38: Austin Matherne wrote on 01.07.25 at 04:01: I’m upgrading a project from lxml 5.4.0 to the newly released lxml 6.0.0 and encountering an unexpected XMLSchemaParseError. I’ve distilled the problem into a minimal, self-contained example and uploaded i

[lxml] Re: lxml 6.0.0: XMLSchemaParseError: Invalid argument

2025-07-06 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Austin Matherne wrote on 01.07.25 at 04:01: I’m upgrading a project from lxml 5.4.0 to the newly released lxml 6.0.0 and encountering an unexpected XMLSchemaParseError. I’ve distilled the problem into a minimal, self-contained example and uploaded it as a GitHub gist: https://gist.githu

[lxml] Re: pip install of lxml 5.3.2 for Python 3.10 win32 gets bad CRC-32

2025-04-18 Thread Stefan Behnel via lxml - The Python XML Toolkit
Michael Kinney wrote on 08.04.25 at 03:58: Anyone else seeing this? Many other Python versions are able to install lxml 5.3.2 Test>pip install lxml Collecting lxml Downloading lxml-5.3.2-cp310-cp310-win32.whl (3.5 MB) 3.5/3.5 MB 4.0 MB/s eta

[lxml] Re: Return type of text_content()

2025-04-04 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, tomi.belan--- via lxml - The Python XML Toolkit wrote on 15.03.25 at 01:15: I noticed that the text_content() method of lxml.html elements returns a _ElementUnicodeResult, i.e. a 'smart' string. However, its getparent() and attrname are None, and is_tail, is_text, is_attribute are False. T

[lxml] Re: Max length of node content (huge nodes)

2025-04-01 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Jens Tröger via lxml - The Python XML Toolkit wrote on 01.04.25 at 15:22: Oh one more: do these limits apply to serializing, e.g. tostring() (see: https://lxml.de/apidoc/lxml.etree.html#lxml.etree.tostring ) as well? Jens No, the safety limits are only a parser thing. They aim to preven

[lxml] Re: Return type of text_content()

2025-03-21 Thread Stefan Behnel via lxml - The Python XML Toolkit
Tomi Belan wrote on 20.03.25 at 23:53: You could change .xpath() and etree.XPath() itself so that the expression "string(...)" always returns a plain str. 'Smart' strings will only be returned (as elements of a Python list) when the XPath result is a node set containing text/cdata/attribute nod

[lxml] Re: Performance issues when using element.clear() in Python 3.x

2025-02-14 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Noorulamry Daud wrote on 14.02.25 at 09:56: Are you using the same versions of lxml (and libxml2) in both? No, and that's what makes it so frustrating. I cannot tell management that using the latest version of Python and lxml actually causes a significant performance penalty. By rights

[lxml] Re: Performance issues when using element.clear() in Python 3.x

2025-02-13 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Noorulamry Daud wrote on 13.02.25 at 12:28: I've been racking my brain about this performance issue I'm having and I could use some help. At my work we have to parse extremely large XML files - 20GB and even larger. The basic algorithm is as follows: with open(file, "rb") as reader:
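
For readers skimming the archive, here is a minimal sketch of the iterparse-and-clear pattern this thread is about. It is not the poster's code (the preview above is cut off); the record layout, the tag names "rec" and "v", and the in-memory input are assumptions made up for illustration. The import falls back to the standard library's ElementTree, which provides the same iterparse() and clear() calls used here.

```python
import io

try:
    from lxml import etree
except ImportError:  # fallback: the stdlib API is compatible for this sketch
    import xml.etree.ElementTree as etree

# Hypothetical sample document; a real input would be a large file on disk.
SAMPLE = (
    b"<root>"
    b"<rec><v>1</v></rec>"
    b"<rec><v>2</v></rec>"
    b"<rec><v>3</v></rec>"
    b"</root>"
)


def sum_records(source):
    """Stream over <rec> elements, read one value from each, then free it."""
    count = 0
    total = 0
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag == "rec":
            total += int(elem.findtext("v"))
            count += 1
            # Drop the children, text and attributes of the finished record.
            elem.clear()
    return count, total


count, total = sum_records(io.BytesIO(SAMPLE))
print(count, total)
```

Note that clear() only empties the element; the empty element itself stays attached to its parent, so for very large inputs it is common to also remove already-processed siblings from the parent.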

[lxml] Re: Appending XSLT to XML

2025-01-14 Thread Stefan Behnel via lxml - The Python XML Toolkit
Schimon Jehudah via lxml - The Python XML Toolkit wrote on 12.01.25 at 08:53: On Fri, 10 Jan 2025 17:28:00 +0100 jholg--- via lxml - The Python XML Toolkit wrote: >>> from lxml import etree >>> elem = etree.fromstring('') >>> tree = elem.getroottree() >>> tree.getroot().addprevious(etree.Pro

[lxml] Re: XML namespaces are not propagated over from the ancestor elements when using find* methods

2025-01-13 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi, Matthew Ouyang wrote on 10.01.25 at 20:08: It would be really nice if the namespaces in the XML document could be considered. I read this request quite often. It's in the FAQs: https://lxml.de/FAQ.html#how-can-i-find-out-which-namespace-prefixes-are-used-in-a-document Namespace prefix
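
As a rough illustration of the point under discussion, the sketch below passes an explicit prefix-to-URI mapping to find(), so the lookup goes by namespace URI and the prefix written in the document does not matter. The sample document, the URI and the prefix "x" are assumptions invented for this example, not taken from the thread. The import falls back to the standard library's ElementTree, whose find() accepts the same arguments.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib API is compatible for this sketch
    import xml.etree.ElementTree as etree

# Hypothetical document: it declares the prefix "a" for the namespace.
XML = b'<root xmlns:a="http://example.com/ns"><a:item>hello</a:item></root>'
root = etree.fromstring(XML)

# The caller chooses its own prefix ("x"); only the URI has to match.
ns = {"x": "http://example.com/ns"}
item = root.find("x:item", namespaces=ns)

# The same lookup in Clark notation ({uri}localname), with no prefix at all.
same_item = root.find("{http://example.com/ns}item")

print(item.text, same_item.text)
```

Either spelling finds the element even though the document itself uses the prefix "a".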

[lxml] Re: Broken EXSLT link in docs

2024-09-29 Thread Stefan Behnel via lxml - The Python XML Toolkit
Hi Jens, Jens Tröger via lxml - The Python XML Toolkit wrote on 28.09.24 at 09:45: I think the EXSLT link here: https://lxml.de/xpathxslt.html#regular-expressions-in-xpath or source here: https://github.com/lxml/lxml/blob/9818374770aedc96f8f1e77943f45dea8e7fb4a8/doc/xpathxslt.txt#L31