On Wed, Sep 14, 2016 at 12:30:06PM -0700, Junio C Hamano wrote:

> Another small thing I am not sure about is if the \ quoting can hide
> an embedded newline in the author name.  Would we end up turning
>       From: "Jeff \
>             King" <p...@peff.net>
> or somesuch into
>       Author: Jeff
>         King
>         Email: p...@peff.net
> ;-)

Heh, yeah. That is another reason to clean up and sanitize as much as
possible before stuffing it into another text format that will be parsed.

> So let's roll the \" -> " into mailinfo.
> I am not sure if we also should remove the surrounding "", i.e. we
> currently do not turn this
>       From: "Jeff King" <p...@peff.net>
> into this:
>       Author: Jeff King
>         Email: p...@peff.net
> I think we probably should, and remove the one that does so from the
> reader.

I think you have to, or else you cannot tell the difference between
surrounding quotes that need to be stripped, and ones that were
backslash-escaped. Like:

  From: "Jeff King" <p...@peff.net>
  From: \"Jeff King\" <p...@peff.net>

which would both become:

  Author: "Jeff King"
  Email: p...@peff.net

though I am not sure the latter one is actually valid; you might need to
be inside syntactic quotes in order to include backslashed quotes. I
haven't read rfc2822 carefully recently enough to know.
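
To make the ambiguity concrete, here is a minimal sketch in Python. The helper `unescape_only` is hypothetical and is not mailinfo's code; it assumes a writer that only turns \" into " and leaves any surrounding quotes in place.

```python
def unescape_only(name: str) -> str:
    """Hypothetical writer step: turn \\" into " but keep syntactic quotes."""
    return name.replace('\\"', '"')


quoted = '"Jeff King"'        # syntactic quotes around the name
escaped = '\\"Jeff King\\"'   # backslash-escaped, literal quotes

# Both inputs collapse to the same text, so a later reader cannot tell
# which quotes were syntactic and which were part of the name.
assert unescape_only(quoted) == unescape_only(escaped)
```

Once both forms look identical, any quote-stripping done later by the reader has to guess.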

Anyway, I think that:

  From: One "Two \"Three\" Four" Five

may also be valid. So the quote-stripping in the reader is not just "at
the outside", but may need to handle interior syntactic quotes, too. So
it really makes sense to me to clean and sanitize as much as possible
in one step, and then make the parser of mailinfo as dumb as possible.
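
As a rough illustration of doing it all in one step, here is a sketch in Python rather than mailinfo's actual C. The function name `unquote_display_name` is made up. It assumes that backslash escapes are honored everywhere (not only inside quotes) and that runs of whitespace, including an escaped newline, should collapse to a single space. Neither assumption is checked against the RFC.

```python
def unquote_display_name(raw: str) -> str:
    """Drop syntactic quotes, resolve backslash escapes, flatten whitespace."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            # Backslash escape: keep the following character literally.
            i += 1
            out.append(raw[i])
        elif ch == '"':
            # Unescaped quote is syntactic, wherever it appears: drop it.
            pass
        else:
            out.append(ch)
        i += 1
    # Assumption: collapse whitespace so an embedded newline cannot leak
    # into the line-oriented output.
    return " ".join("".join(out).split())


print(unquote_display_name('One "Two \\"Three\\" Four" Five'))
print(unquote_display_name('"Jeff \\\nKing"'))
```

With the cleanup done here, the reader on the other side can treat whatever follows "Author: " as opaque text up to the end of the line.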

