Stefan Behnel wrote on 12.08.21 at 09:21:
Holger.Joukl wrote on 21.07.21 at 20:17:
python2.7 -c 'from lxml import objectify; root = objectify.fromstring("<root><x>1000_000</x></root>"); print(root.x, type(root.x), type(root.x.pyval)); print(root.x.text, type(root.x.text)); print(objectify.dump(root))'
('1000_000', <type 'lxml.objectify.StringElement'>, <type 'str'>)
('1000_000', <type 'str'>)
root = None [ObjectifiedElement]
     x = '1000_000' [StringElement]

python3.6 -c 'from pytaf.objectify.xmsg import *; from lxml import etree, objectify; root = objectify.fromstring("<root><x>1000_000</x></root>"); print(root.x, type(root.x), type(root.x.pyval)); print(root.x.text, type(root.x.text)); print(objectify.dump(root))'
1000000 <class 'lxml.objectify.IntElement'> <class 'int'>
1000_000 <class 'str'>
root = None [ObjectifiedElement]
     x = 1000000 [IntElement]

According to https://www.w3.org/TR/xmlschema-2/#integer, 1000_000 is not a valid integer literal. But it has been valid in Python since 3.6 (PEP 515, underscores in numeric literals), and int() accepts it as well.

The magic lxml.objectify type lookup/annotation simply does int(s) and interprets success as "shall be interpreted as int". One could argue that - when parsing XML data - this is not the right/sane/intuitive choice. Or is it? :-)
<x>1000_000</x> is not an integer in the XML world.

Then we shouldn't make it one. It's unlikely that data gets passed through XML in Python syntax. We have the same for "True" and "False", which come out as str, not bool. And this applies to FloatElement as well, which uses float() as parser and thus also supports "_" in Py3.6+.
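
To illustrate the difference, here is a minimal sketch in plain Python (no lxml involved). The helper names naive_is_int and strict_is_int are hypothetical and are not part of lxml. The regular expression is an assumption based on the xs:integer lexical form (optional sign followed by decimal digits), and this is not meant to show how lxml implements or fixes the check.

```python
import re

# Assumed lexical form of xs:integer: optional sign, then ASCII digits only.
_XSD_INTEGER = re.compile(r"[+-]?[0-9]+")


def naive_is_int(text):
    """Mimic the 'try int(), success means integer' check described above."""
    try:
        int(text)
    except ValueError:
        return False
    return True


def strict_is_int(text):
    """Hypothetical stricter check: accept only the xs:integer lexical form."""
    # Strip XML whitespace first, since xs:integer collapses whitespace.
    return _XSD_INTEGER.fullmatch(text.strip(" \t\r\n")) is not None


if __name__ == "__main__":
    for sample in ("1000000", "1000_000", "+42", "True"):
        print(repr(sample), naive_is_int(sample), strict_is_int(sample))
```

On Python 3.6+, the naive check accepts "1000_000" because int() does, while the strict check rejects it. The same idea would apply to float(), with a pattern that matches the XML Schema float/double lexical form instead.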

I'll see what I can come up with.

https://github.com/lxml/lxml/commit/83e6c031994d553b74991501c6cd85e3517fadd8

Stefan