Hi Brian & Charlie,

I'm not the OP but, FYI, I can see the same issue (on an Intel Mac):

aid@orac tmp % ./tail.py
Python : sys.version_info(major=3, minor=9, micro=13, releaselevel='final', serial=0)
lxml.etree : (4, 9, 0, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)
b'<form action="action1">\n</form>\n</body>\n</html>\n'

You can see my machine is using libxml2 2.9.14 (with lxml 4.9.0), which is a pity, as in the thread you linked to it looked like the issue would have been resolved in that version...

However, I found that if you update the call to etree.tostring() to use method='html', the trailing body and html closing tags are no longer shown, i.e.:

print(etree.tostring(nodeList[0], method='html'))

With that update made, the script outputs the desired result:

aid@orac tmp % python3 -i tail.py
Python : sys.version_info(major=3, minor=9, micro=13, releaselevel='final', serial=0)
lxml.etree : (4, 9, 0, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)
b'<form action="action1">\n</form>\n'

I've no idea why this behaviour seems to have changed....

Kind regards

aid

> On 7 Jun 2022, at 17:02, Charlie Clark <charlie.cl...@clark-consulting.eu> wrote:
>
> On 7 Jun 2022, at 16:56, brian.b...@trustpayments.com wrote:
>
> In more recent versions of lxml the tostring() method can return extra text
> after the closing tag of the node I've passed to it. So instead of returning
>
> b'\n\n'
>
> it returns
>
> b'\n\n\n\n'
>
> This looks a lot like this
> https://mail.python.org/archives/list/lxml@python.org/thread/LCTOSIIWGGALAMSZAYHRRYUWYDRESCUO/
>
> Can you update your version of libxml2?
>
> Charlie
>
> --
> Charlie Clark
> Managing Director
> Clark Consulting & Research
> German Office
> Sengelsweg 34
> Düsseldorf
> D- 40489
> Tel: +49-203-3925-0390
> Mobile: +49-178-782-6226
>
> _______________________________________________
> lxml - The Python XML Toolkit mailing list -- lxml@python.org
> To unsubscribe send an email to lxml-le...@python.org
> https://mail.python.org/mailman3/lists/lxml.python.org/
> Member address: a...@logic.org.uk