Dear lxml users!
I am developing lxml.objectify2 (lxml.o2). lxml.o2 has three objectives:
* making lxml more pythonic
* introducing robust namespaced properties
* making lxml more fun
*Following the old ways*
Imagine the following XML file.
xml_str = '''\
<obj:root xmlns:obj="objectified" xmlns:other="otherNS">
<obj:c1 a1="A1" a2="A2" other:a3="A3">
<obj:c2>0</obj:c2>
<obj:c2>1</obj:c2>
<obj:c2>2</obj:c2>
</obj:c1>
<obj:c1>
<other:c2>3</other:c2>
<other:c2>5</other:c2>
<obj:c2>2</obj:c2>
</obj:c1>
<obj:c1>
<other:c2>42</other:c2>
</obj:c1>
</obj:root>'''
Please notice that the tags obj:c1 and obj:c2/other:c2 each occur several
times as children with the same {ns}name.
Here is a glance at the data processed by lxml.o (standard lxml.objectify)
from the PyCharm IDE perspective.
https://backend.datenadler.de/kram/bildschirmfoto-vom-2022-03-07-23-02-33.png/image_view_fullscreen
You may notice that there is no multiplicity at all. lxml.o is quite
limited and not really pythonic. Therefore any Python IDE will struggle
with a representation of lxml-processed data.
*Following the new ways*
Let's use lxml.objectify2 instead.
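from lxml import etree
# Assumption: the lookup class comes from lxml.objectify, as in standard lxml.
from lxml.objectify import ObjectifyElementClassLookup
from lxml.objectify2 import ObjectifiedElement2

# Use ObjectifiedElement2 as the class for tree elements.
obj2_lookup = ObjectifyElementClassLookup(tree_class=ObjectifiedElement2)
parser = etree.XMLParser()
parser.set_element_class_lookup(obj2_lookup)
node = etree.XML(xml_str, parser=parser)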
A look from the PyCharm debugger into the data structure processed by
lxml.o2:
https://backend.datenadler.de/kram/bildschirmfoto-vom-2022-03-07-22-34-10.png/image_view_fullscreen
As you can see, lxml.o2 handles multiple children with the same qualified
tag by assigning an "[index]" to them.
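The grouping idea can be imitated with the standard library to make it concrete. This is only a rough sketch and not how lxml.o2 is implemented: the names PREFIXES, qualified_name and group_children are made up for illustration, and the prefix table is written out by hand because xml.etree.ElementTree does not keep prefixes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """\
<obj:root xmlns:obj="objectified" xmlns:other="otherNS">
  <obj:c1>
    <other:c2>3</other:c2>
    <other:c2>5</other:c2>
    <obj:c2>2</obj:c2>
  </obj:c1>
</obj:root>"""

# Hand-written namespace URI -> prefix table (an assumption of this sketch).
PREFIXES = {"objectified": "obj", "otherNS": "other"}


def qualified_name(tag):
    """Turn ElementTree's '{uri}name' into 'prefix_name'."""
    uri, _, name = tag[1:].partition("}")
    return f"{PREFIXES[uri]}_{name}"


def group_children(element):
    """Collect the children of element into lists keyed by qualified name."""
    groups = defaultdict(list)
    for child in element:
        groups[qualified_name(child.tag)].append(child)
    return dict(groups)


root = ET.fromstring(XML)
c1 = group_children(root)["obj_c1"][0]

# Every child gets a "name[index]" label, even if it is the only one.
for name, items in group_children(c1).items():
    for index, item in enumerate(items):
        print(f"{name}[{index}] = {item.text}")
```

Each qualified name maps to a list, so a tag that occurs once and a tag that occurs several times are accessed in the same way.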
*<rant>Yeah, that is nice screenwork, but this will never work in
code?</rant>*
>>> node.obj_c1[2].obj_c2
[3]
Here the call to
node.obj_c1
returns a list. Then Python takes over and picks the desired element by index.
*<rant>Ok, but this will not work with getattr</rant>*
>>> getattr(node, 'obj_c1[0]').obj_c2
[0, 1, 2]
Here lxml.o2 does the selection of the element [0] really fast in C space.
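How a string such as 'obj_c1[0]' could be split into a name and an index is sketched below in plain Python. According to the text above the real selection happens in the C layer of lxml.o2, so the regular expression and the helper split_indexed_name are assumptions for illustration only.

```python
import re

# A name made of word characters, optionally followed by "[<integer>]".
_INDEXED = re.compile(r"^(?P<name>\w+)(?:\[(?P<index>-?\d+)\])?$")


def split_indexed_name(attr):
    """Split 'obj_c1[0]' into ('obj_c1', 0); the index is None if absent."""
    match = _INDEXED.match(attr)
    if match is None:
        raise AttributeError(attr)
    index = match.group("index")
    return match.group("name"), (None if index is None else int(index))


print(split_indexed_name("obj_c1[0]"))
print(split_indexed_name("obj_c1"))
```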
*<rant>OK, and where is the catch?</rant>*
To implement this functionality we need to ensure that two rules are
followed by the user.
1) If there are elements without a namespace, a default namespace has to
be defined.
2) Any access to a "tag" has to be done qualified, with the exception of
the default namespace.
node.<namespace>_<name>
with the default namespace:
node.<name>
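The two access forms can be pictured as a mapping from a namespaced tag to an attribute name. This is a minimal sketch: the function attribute_name, its parameters, and the choice of "otherNS" as the default namespace in the usage lines are my assumptions, not taken from lxml.o2.

```python
def attribute_name(tag, prefixes, default_ns):
    """Map '{uri}name' to 'prefix_name', or to 'name' for the default namespace."""
    if not tag.startswith("{"):
        # Rule 1: elements without a namespace need a default namespace.
        raise ValueError("tag has no namespace: " + tag)
    uri, _, name = tag[1:].partition("}")
    if uri == default_ns:
        # Rule 2, exception: the default namespace is accessed unqualified.
        return name
    return f"{prefixes[uri]}_{name}"


# Hypothetical prefix table for the example document.
prefixes = {"objectified": "obj", "otherNS": "other"}

print(attribute_name("{objectified}c1", prefixes, default_ns="otherNS"))
print(attribute_name("{otherNS}c2", prefixes, default_ns="otherNS"))
```

With "otherNS" as the default namespace, the first call yields the qualified form obj_c1 and the second the bare form c2.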
If these rules are too much for you, go back to lxml.objectify and be happy.
*<rant>Ah, go away. Where do you find such nice XML?</rant>*
Hm. In the real world I have never seen XML documents as simple as those
in the lxml.objectify tests.
But I am aware that lxml.o2 will have to be tested thoroughly.
*<rant>You will never convince all the users of lxml to change to
lxml.o2</rant>*
That is true. But I am not even trying. lxml.o2 is an alternative to lxml.o
for certain use cases.
You are welcome to rant at me :-)
You are also welcome to help with the development of lxml.o2. This is a
spare time job for me.
If you do not have the time to help, you may express your liking of
lxml.o2 here.
lxml.o2 lives at
https://github.com/Inqbus/lxml
in the branch
https://github.com/Inqbus/lxml/tree/objectify_prefix
Cheers,
Volker
--
=========================================================
inqbus Scientific Computing Dr. Volker Jaenisch
Hungerbichlweg 3 +49 (8860) 9222 7 92
86977 Burggen                    https://inqbus.de
=========================================================