Without looking at the tidy source, I would expect that it is looking for 
closing tags, i.e.

<meta blah />
<br/>
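
A minimal sketch of that guess, using only Python's standard library. It assumes tidy was checking the input as strict XML, which nothing in the error output confirms. The helper name is_well_formed is made up for illustration. A strict XML parser rejects an unclosed <br> but accepts the self-closed form:

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Unclosed <br>: the parser reaches </li> while <br> is still open.
html_style = "<li>blah:<br>phpinfo();</li>"

# Self-closed <br/>: every element is closed, so the fragment parses.
xml_style = "<li>blah:<br/>phpinfo();</li>"

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

If the file is meant to be plain HTML rather than XHTML, the unclosed form is normal, and the parser mode is the more likely culprit than the markup.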
________________________________
From: Gilles <codecompl...@free.fr>
Sent: Wednesday, May 11, 2022 11:52:31 AM
Cc: lxml@python.org <lxml@python.org>
Subject: [lxml] Re: [newbie] lxml adds &#13; before each end of line

On 11/05/2022 12:19, Charlie Clark wrote:
> It could always be a bug, but really we need a sample file to test. Which 
> version of lxml and Python are you using?

Here it is:

https://we.tl/t-WowFCDBp5A

Python 3.8.8, lxml 4.6.3.0

> But, if all you want is pretty printing then I recommend simply using the 
> command line tool tidy.

I tried it before asking, but tidy fails with a few errors I don't understand. 
Here's the output from a full file (not the sample I uploaded):

line 9 column 1 - Error: unexpected </head> in <meta>
line 68 column 87 - Error: unexpected </li> in <br>
line 73 column 1 - Error: unexpected </ol> in <br>
line 108 column 1 - Error: unexpected </body> in <br>
line 110 column 1 - Error: unexpected </html> in <br>
Tidy found 0 warnings and 5 errors!

========

    <meta name="generator" content="Namo WebEditor v4.0">
</head>

…

<li>blah:<br>&lt;?php<br>phpinfo();<br>?&gt;</li>

…

========

Is it bad practice to include <br> outside plain <p>…</p>?

Regardless, the origin of the problem is the unwanted addition of "&#13;" 
before each end of line.
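
One possible workaround, sketched under an assumption this thread has not confirmed: that the file uses Windows CRLF line endings and the HTML parser keeps each carriage return as text, which is then serialized as "&#13;". The helper name normalize_newlines and the sample bytes are made up for illustration. The idea is to convert line endings to LF before the data reaches the parser:

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so a pair does not turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample with Windows line endings.
sample = b"<li>blah:<br>\r\nphpinfo();</li>\r\n"
cleaned = normalize_newlines(sample)

# No carriage return is left for a serializer to escape.
print(b"\r" in cleaned)
```

The cleaned bytes can then be handed to the parser, for example lxml.html.fromstring(cleaned), in place of the raw file contents.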
_______________________________________________
lxml - The Python XML Toolkit mailing list -- lxml@python.org
To unsubscribe send an email to lxml-le...@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/