Eric Wong <[email protected]> wrote:
> Leah Neukirchen <[email protected]> wrote:
> > 2) Weird From: lines crash the whole import
> > 
> > From: "=?iso-8859-1?Q?Jochen_K=FCpper?= <usenet"@jochen-kuepper.de
> > 
> > This funny line broke import_maildir:
> > 
> > fatal: Missing > in ident string: =?iso-8859-1?Q?Jochen_K=FCpper?= usenet 
> > <"=?iso-8859-1?Q?Jochen_K=FCpper?= <usenet"@jochen-kuepper.de> 1101853296 
> > +0100
> > fast-import: dumping crash report to 
> > /var/lib/public-inbox/repositories/ding.git/fast_import_crash_31402
> > EOF from fast-import:  at 
> > /usr/share/perl5/vendor_perl/PublicInbox/Import.pm line 96, <$r> line 54681.
> > 
> > I fixed it manually.  (But I think it's actually a valid mail address,
> > even in this botched state.)  I'm not sure what added the ">", it's
> > not in the original mail.
> > 
> > (I use public-inbox-1.3.0/git-2.25.0 on Void Linux.)
> 
> Gah, this looks like it's because Email::Address::XS leaves a
> "<" in the name...   Perhaps Import should delete all [<>]
> characters unconditionally? (or swap in appropriate Unicode
> homographs and assume users have the necessary glyphs...)

So we already do `$name =~ tr/<>//d', so I think doing the same
with `$email' is appropriate for fast-import.  The "correct"
address featuring '<' will still be indexed in Xapian, at least.
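
For illustration only (not part of the patch below), here is a
standalone sketch of what that deletion does to the address from
the report.  strip_angle_brackets is a made-up helper name, not
something in PublicInbox::Import:

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; the actual patch applies
# tr/<>//d to $email in place inside extract_cmt_info.
sub strip_angle_brackets {
	my ($addr) = @_;
	$addr =~ tr/<>//d; # delete every '<' and '>' character
	return $addr;
}

# the address part as seen in the fast-import error message
my $odd = '"=?iso-8859-1?Q?Jochen_K=FCpper?= <usenet"@jochen-kuepper.de';
print strip_angle_brackets($odd), "\n";
```

The stray '<' is dropped and the rest of the address is left alone,
so the ident line fast-import sees only has the '<' and '>' that
delimit the address.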

-------------8<-------------
Subject: [PATCH] import: drop '<' and '>' characters in addresses

Some strange "From:" lines will cause Email::Address::XS to
leave '<' (and presumably '>') in the address which
git-fast-import won't accept even if quoted.  Work around this
problem by deleting '<' and '>' the same way we delete them for
the ident name.

Reported-by: Leah Neukirchen <[email protected]>
Link: https://public-inbox.org/meta/[email protected]/
---
 lib/PublicInbox/Import.pm | 4 ++++
 t/import.t                | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm
index d8dc49b8..68dc0c7e 100644
--- a/lib/PublicInbox/Import.pm
+++ b/lib/PublicInbox/Import.pm
@@ -293,6 +293,10 @@ sub extract_cmt_info ($) {
                }
        }
        if (defined $email) {
+               # Email::Address::XS may leave quoted '<' in addresses,
+               # which git-fast-import doesn't like
+               $email =~ tr/<>//d;
+
                # quiet down wide character warnings with utf8::encode
                utf8::encode($email);
        } else {
diff --git a/t/import.t b/t/import.t
index e71dd714..b88d308e 100644
--- a/t/import.t
+++ b/t/import.t
@@ -55,6 +55,8 @@ $im->done;
 my @revs = $git->qx(qw(rev-list HEAD));
 is(scalar @revs, 1, 'one revision created');
 
+my $odd = '"=?iso-8859-1?Q?J_K=FCpper?= <usenet"@example.de';
+$mime->header_set('From', $odd);
 $mime->header_set('Message-ID', '<[email protected]>');
 $mime->header_set('Subject', 'msg2');
 like($im->add($mime, sub { $mime }), qr/\A:\d+\z/, 'added 2nd message');
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/
