Thomas Weißschuh <[email protected]> wrote:
> Hi,
> 
> it seems the rendering of \r\n (Windows-style) line breaks is a bit
> suboptimal on the website.
> 
> The \r characters are rendered literally. Mutt, for example, does not
> show them.
> 
> Example: 
> https://lore.kernel.org/lkml/[email protected]/

Thanks for the example.

> Raw message:
>     ...
>     Content-Type: text/plain; charset="utf-8"
>     Content-Transfer-Encoding: quoted-printable
>     ...
> 
> 
>     Hi,=0D
>     =0D
>     ....
> 
> Rendered:
> 
>     ....
>     Hi,\r
>     \r
>     ...
> 
> 
> The fix is probably obvious for you, if not I can try to come up with one.

Yes, except I remember adding support for CR-LF long ago...
The problem here is some messages are CR-CR-LF for some odd reason.
Oh well, it's a 1 character fix on our end for the HTML.

Not sure if ContentHash (deduplication) and SolverGit (blob
regeneration) ought to strip redundant CR, yet...
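
For illustration, here is a minimal standalone Perl sketch of how the
two substitutions differ on CR-CR-LF input.  The helper names and the
sample string are made up for this example; they are not part of
View.pm, and the sample is not taken from the reported message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; neither exists in View.pm.
sub strip_one_cr { # old behavior: drop a single CR before each LF
	my ($s) = @_;
	$s =~ s/\r\n/\n/sg;
	$s;
}

sub strip_all_cr { # patched behavior: drop the whole run of CR before LF
	my ($s) = @_;
	$s =~ s/\r+\n/\n/sg;
	$s;
}

# Make CR and LF visible when printing.
sub show {
	my ($s) = @_;
	$s =~ s/\r/\\r/g;
	$s =~ s/\n/\\n/g;
	$s;
}

# Made-up sample: two CR-CR-LF line endings, a lone CR mid-line,
# and one ordinary CR-LF at the end.
my $in = "Hi,\r\r\n\r\r\nmid\rline\r\n";

print 'old: ', show(strip_one_cr($in)), "\n"; # old: Hi,\r\n\r\nmid\rline\n
print 'new: ', show(strip_all_cr($in)), "\n"; # new: Hi,\n\nmid\rline\n
```

With the old pattern one CR survives before each LF of a CR-CR-LF
ending, while the new pattern removes the whole run.  The lone CR in
the middle of the line is left alone by both.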

-------8<-------
Subject: [PATCH] view: remove all CR before LF

While we've rendered CR-LF as LF-only in HTML for many years,
some messages end up as CR-CR-LF.  So strip all CR bytes
preceding LF bytes, while preserving odd CR in the middle of
lines.

Reported-by: Thomas Weißschuh <[email protected]>
Link: https://public-inbox.org/meta/[email protected]/
---
 lib/PublicInbox/View.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 2e9cf705..ca02ae05 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -586,7 +586,7 @@ sub add_text_body { # callback for each_part
 
        # makes no difference to browsers, and don't screw up filename
        # link generation in diffs with the extra '%0D'
-       $s =~ s/\r\n/\n/sg;
+       $s =~ s/\r+\n/\n/sg;
 
        # will be escaped to `&#8226;' in HTML
        obfuscate_addrs($ibx, $s, "\x{2022}") if $ibx->{obfuscate};
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/
