Stefan Behnel wrote on 12.08.21 at 09:21:
Holger.Joukl wrote on 21.07.21 at 20:17:
python2.7 -c 'from lxml import objectify; root =
objectify.fromstring("<root><x>1000_000</x></root>"); print(root.x,
type(root.x), type(root.x.pyval)); print(root.x.text, type(root.x.text));
print(objectify.dump(root))'
('1000_000', <type 'lxml.objectify.StringElement'>, <type 'str'>)
('1000_000', <type 'str'>)
root = None [ObjectifiedElement]
    x = '1000_000' [StringElement]
python3.6 -c 'from pytaf.objectify.xmsg import *; from lxml import etree,
objectify; root = objectify.fromstring("<root><x>1000_000</x></root>");
print(root.x, type(root.x), type(root.x.pyval)); print(root.x.text,
type(root.x.text)); print(objectify.dump(root))'
1000000 <class 'lxml.objectify.IntElement'> <class 'int'>
1000_000 <class 'str'>
root = None [ObjectifiedElement]
    x = 1000000 [IntElement]
According to https://www.w3.org/TR/xmlschema-2/#integer, 1000_000 is not a
valid integer literal. But it is one in Python since 3.6 (PEP 515).
The magic lxml.objectify type lookup/annotation simply does int(s) and
interprets success as "shall be interpreted as int".
One could argue that - when parsing XML data - this is not the
right/sane/intuitive choice. Or is it? :-)
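
To make the behaviour concrete, here is a minimal sketch of the "try int(),
treat success as integer" idea described above. The helper name
looks_like_int is hypothetical and only mimics the check; it is not
lxml.objectify's actual code.

```python
def looks_like_int(text):
    """Return True if int() accepts the text (hypothetical helper)."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# Plain digits are accepted on every Python version.
print(looks_like_int("1000000"))

# PEP 515 underscores are accepted by int() on Python 3.6+,
# even though this is not a valid xs:integer literal.
print(looks_like_int("1000_000"))

# "True" is not accepted by int(), so it would stay a string.
print(looks_like_int("True"))
```

Run under Python 2.7, the second call would report False, which matches the
StringElement result shown in the first command above.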
<x>1000_000</x> is not an integer in the XML world.
Then we shouldn't make it one. It's unlikely that data gets passed through
XML in Python syntax. We have the same for "True" and "False", which come
out as str, not bool. And this applies to FloatElement as well, which uses
float() as parser and thus also supports "_" in Py3.6+.
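
One possible way to avoid this is to check the text against the XML Schema
lexical form before handing it to int() or float(). The following is only a
sketch under that assumption, not necessarily what lxml ends up doing; the
names strict_int and strict_float and the regular expressions are made up
for illustration, with the patterns written after the xs:integer and
xs:float lexical forms of XML Schema 1.0.

```python
import re

# Assumed lexical forms, following XML Schema Part 2 (1.0):
# optional sign plus ASCII digits for integers; decimal or scientific
# notation, or one of INF, -INF, NaN for floats.
_XSD_INTEGER = re.compile(r"[+-]?[0-9]+")
_XSD_FLOAT = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)
_XSD_FLOAT_SPECIALS = {"INF", "-INF", "NaN"}
_XML_WHITESPACE = " \t\r\n"


def strict_int(text):
    """Parse text as an integer, rejecting Python-only syntax like '_'."""
    text = text.strip(_XML_WHITESPACE)
    if not _XSD_INTEGER.fullmatch(text):
        raise ValueError("not an XSD integer literal: %r" % text)
    return int(text)


def strict_float(text):
    """Parse text as a float, rejecting Python-only syntax like '_'."""
    text = text.strip(_XML_WHITESPACE)
    if text not in _XSD_FLOAT_SPECIALS and not _XSD_FLOAT.fullmatch(text):
        raise ValueError("not an XSD float literal: %r" % text)
    return float(text)


print(strict_int("1000000"))
print(strict_float("1.5e3"))
for bad in ("1000_000", "1_000.5"):
    try:
        strict_float(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

With checks like these, a value such as 1000_000 would fail both parsers and
fall through to the string type, the same way "True" and "False" do today.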
I'll see what I can come up with.
https://github.com/lxml/lxml/commit/83e6c031994d553b74991501c6cd85e3517fadd8
Stefan
_______________________________________________
lxml - The Python XML Toolkit mailing list -- lxml@python.org