Turns out there's no need to use the pathlib module: the issue with
" " goes away when the HTML is first read into a variable and only
then parsed, even with the standard open():
============
""" OK
from pathlib import Path
with Path(f).open() as tempfile:
    tree = et.parse(tempfile, parser=parser)
"""
#BAD
#tree = et.parse(f,parser)
#OK
with open(f) as reader:
    content = reader.read()
#BAD tree=et.fromstring(content)
tree = et.parse(content, parser)
============
I didn't think about calling parse() with a variable, since the examples
I had read so far used either parse() with a file handle or fromstring().
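
For reference, here is a minimal sketch of two ways to parse HTML that is
already held in a string. The sample markup and the variable names are
made up for illustration. As far as I know, lxml's parse() treats a plain
string argument as a filename or URL, so the sketch wraps the string in
io.StringIO; it also shows fromstring() with the HTML parser passed
explicitly, since without it the default XML parser is used.

```python
import io

from lxml import etree

# Hypothetical sample markup, standing in for the content read from the file.
content = "<html><body><p>Hello world</p></body></html>"

# An HTML-aware parser; the default parser expects well-formed XML.
parser = etree.HTMLParser()

# Option 1: wrap the string in a file-like object so parse() reads it as a stream.
tree = etree.parse(io.StringIO(content), parser)
text_from_parse = tree.getroot().findtext(".//p")

# Option 2: hand the string to fromstring() together with the HTML parser.
root = etree.fromstring(content, parser)
text_from_fromstring = root.findtext(".//p")

print(text_from_parse, "|", text_from_fromstring)
```

Both options should yield the same element tree here; which one is more
convenient probably depends on whether the rest of the code expects an
ElementTree (parse) or a root element (fromstring).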
Thank you.