[lxml] source location or ordering of tag

Robin Becker Thu, 20 Jan 2022 03:58:31 -0800

I am trying to port an XML parser to lxml.etree.


I can get the parser to work fine, validate properly, and produce a parse tree
in tuple form, i.e. (tag, attrib, [contents...], extra).
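
For illustration, a minimal sketch of building that tuple form from an element tree. The helper name to_tuple and the handling of text and tail are assumptions, not the existing parser's behaviour, and extra is simply left as None here. The import falls back to the standard library so the sketch runs without lxml.

```python
try:
    from lxml import etree
except ImportError:  # fallback so the sketch runs without lxml installed
    import xml.etree.ElementTree as etree


def to_tuple(el, extra=None):
    """Convert an element into (tag, attrib, [contents...], extra).

    Hypothetical helper: contents holds text strings and child tuples
    in the order they appear in the source.
    """
    contents = []
    if el.text:
        contents.append(el.text)
    for child in el:
        contents.append(to_tuple(child))
        # Text following a child's end tag belongs to the parent's contents.
        if child.tail:
            contents.append(child.tail)
    return (el.tag, dict(el.attrib), contents, extra)


tree = to_tuple(etree.fromstring('<a x="1">hi<b/>tail</a>'))
```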

The standard XML error locations are fine and provide both line number and
column.

However, the current parser allows for debugging during post-processing of the
parse: in place of extra above, it has a tuple of information that looks like
((srcname, startline, startcolumn), (srcname, endline, endcolumn)). This extra
information allows post-analysis to determine whether one tag starts after
another.


I can find no way to access the column information in the standard parsers.

I believe that the information is present in the XMLReader that libxml2
provides, but I can see no way to get access to it in lxml.

I think I just need to determine a tag ordering, i.e. does tag0 start before or
after tag1 in the source?

Is there an obvious way to do that?

Currently (tag0.startline, tag0.startcolumn) is compared with
(tag1.startline, tag1.startcolumn) and the later tag is returned.
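
A minimal sketch of that comparison, assuming each tag exposes startline and startcolumn. The Tag class and the function name latest are hypothetical stand-ins for the real tuples.

```python
from typing import NamedTuple


class Tag(NamedTuple):
    # Hypothetical stand-in for a parsed tag carrying its start position.
    name: str
    startline: int
    startcolumn: int


def latest(tag0, tag1):
    """Return whichever tag starts later in the source (tag0 on a tie)."""
    pos0 = (tag0.startline, tag0.startcolumn)
    pos1 = (tag1.startline, tag1.startcolumn)
    # Tuples compare element by element: line first, then column.
    return tag1 if pos1 > pos0 else tag0
```

Without the column, two tags that start on the same line cannot be ordered this way, which is why the line number alone is not enough.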


I believe I could just add a tag sequence to determine order, but is there an 
easier way?
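
A minimal sketch of the tag sequence idea, assuming the tree is already parsed. It relies on iter() visiting elements in document order, so an element's index in that walk orders start tags without needing columns. The helper name number_elements is hypothetical, and the import falls back to the standard library so the sketch runs without lxml.

```python
try:
    from lxml import etree
except ImportError:  # fallback so the sketch runs without lxml installed
    import xml.etree.ElementTree as etree


def number_elements(root):
    """Map each element to its sequence number in document order."""
    # iter() walks the tree depth-first, which is the order in which
    # the start tags appear in the source.
    return {el: seq for seq, el in enumerate(root.iter())}


root = etree.fromstring("<doc><a><b/></a><c/></doc>")
order = number_elements(root)
a, c = root[0], root[1]
b = a[0]
# tag0 starts before tag1 exactly when order[tag0] < order[tag1]
starts_before = order[b] < order[c]
```

Note that with lxml, iter() also yields comments and processing instructions, so they take sequence numbers too; that does not change the relative order of the elements.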
--
Robin Becker
