Hi,

Curious, I tried this out, first creating some data in the file child.py:

#!/usr/bin/env python3
import lxml.etree as ET
t = ET.fromstring(
"""
<html>
<body>
<div data-flag="TODO">
        <p>This content is flagged as <code>TODO</code>. It gets a background 
colour (red, in this case) and the label <code>TODO</code> is rendered into the 
document margin.</p>
</div>
</body>
</html>
""")


And then applied your query:

for n, i in enumerate(t.xpath(r'.//div/child::*')):
        print(f"[{n}] {i}")

Which gave me the following result:

❯ ./child.py
[0] <Element p at 0x1023c3d00>

However, when experimenting, I put two forward slashes before the child 
axis, and then I saw the output you describe, i.e.:

for n, i in enumerate(t.xpath(r'.//div//child::*')):
        print(f"[{n}] {i}")

Gives:

❯ ./child.py
[0] <Element p at 0x10f2a3d40>
[1] <Element code at 0x10f2a3e40>
[2] <Element code at 0x10f2b4140>


(Although, as far as I can see, that output is correct when the two forward 
slashes are present.)
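
For what it's worth, here is a minimal sketch of why I think the double-slash 
result is expected. In XPath 1.0, "//" is an abbreviation for 
"/descendant-or-self::node()/", so ".//div//child::*" selects the element 
children of the div and of every node below it, which is the same set as its 
descendant elements. The helper name "tags" and the shortened sample markup 
are my own, for illustration only:

```python
import lxml.etree as ET

# Shortened version of the markup from the original message (illustrative).
doc = ET.fromstring(
    '<html><body><div data-flag="TODO">'
    '<p>Flagged as <code>TODO</code>, labelled <code>TODO</code>.</p>'
    '</div></body></html>'
)


def tags(query):
    """Hypothetical helper: tag names matched by an XPath query, in document order."""
    return [el.tag for el in doc.xpath(query)]


# Single slash: only the direct element children of <div>.
child = tags('.//div/child::*')

# Double slash: children of <div> and of every node beneath it.
double_slash = tags('.//div//child::*')

# The same query with "//" written out in full.
expanded = tags('.//div/descendant-or-self::node()/child::*')

# The descendant axis, for comparison.
descendant = tags('.//div/descendant::*')

print(child)         # ['p']
print(double_slash)  # ['p', 'code', 'code']
print(expanded)      # ['p', 'code', 'code']
print(descendant)    # ['p', 'code', 'code']
```

If your real query is assembled from several pieces, printing the final 
string just before passing it to xpath() might show whether a stray "//" 
has crept in.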

The above was performed on Python 3.11.3 and lxml version 4.9.2.

Could a double slash have sneaked into your query before "child"...?

Kind regards

aid





> On 29 Jun 2023, at 19:17, wayneb--- via lxml - The Python XML Toolkit 
> <lxml@python.org> wrote:
> 
> Here's a bit of code I was trying to parse: 
> 
> <div data-flag="TODO">
>       <p>This content is flagged as <code>TODO</code>. It gets a background 
> colour (red, in this case) and the label <code>TODO</code> is rendered into 
> the document margin.</p>
> </div>
> 
> I was using  .//div/child::* 
> What I got back was a list of three items: <p><code><code> 
> those aren't the children of div! Those are the descendant elements of div. I 
> decided to test this using .//div/descendant::* which gave me the proper: 
> <p><code><code> elements. 
> To further confirm this I used other parsers and they provided the proper 
> child (p in this case). 
> 
> How do we go about getting this fixed in lxml?
> _______________________________________________
> lxml - The Python XML Toolkit mailing list -- lxml@python.org
> To unsubscribe send an email to lxml-le...@python.org
> https://mail.python.org/mailman3/lists/lxml.python.org/
> Member address: a...@logic.org.uk
