R. David Murray <rdmur...@bitdance.com> added the comment:

That header is *completely* non-RFC compliant.  If Gmail generated that header,
there is something very wrong in google-land :(

The RFC compliant formatting for that header looks like this:

Content-Disposition: attachment;
 filename*=utf-8''Schulbesuchsbest%C3%A4ttigung.pdf

You will note that this is nothing like encoded word format.  Encoded words are
not valid inside quoted strings, and neither quoted strings nor encoded words
can be used in MIME header attributes if there are non-ASCII characters
involved.
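
As a rough illustration of the compliant form, here is a minimal sketch using
only the standard library.  It assumes email.utils.encode_rfc2231 is an
acceptable way to build the value; the variable names are mine, not anything
from the issue:

```python
from email import message_from_string, policy
from email.utils import encode_rfc2231

# "\u00e4" is the a-umlaut in the attachment name from the header above.
filename = "Schulbesuchsbest\u00e4ttigung.pdf"

# Percent-encode the UTF-8 bytes and prefix the charset and an empty language.
encoded = encode_rfc2231(filename, charset="utf-8")
print(encoded)

# Round trip: parse a header carrying that value and read the filename back.
raw = (
    "Content-Disposition: attachment;\n"
    f" filename*={encoded}\n"
    "\n"
)
msg = message_from_string(raw, policy=policy.default)
parsed = msg.get_filename()
print(parsed)
```

The encoded value should match the filename*= value shown above, and the
parsed filename should come back as the original string.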

Now, all that said, there is an obvious rule that can be followed to understand
what that header is trying to convey, and the current parser already implements
most of it (you will find comments about it in the parser, and it registers
defects when it applies the rule).  So, a patch to _header_value_parser to fix
the error recovery will be accepted.  I've looked at the code to remind myself,
but not deeply enough to be *sure* where the changes need to be made.  There
are two candidates I see off the bat (and both may need fixing):
get_bare_quoted_string and get_parameter.  Either one or both of those may be
forgetting that whitespace between encoded words should be dropped.
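
To illustrate the whitespace rule itself (not the parser fix), here is a small
sketch using email.header.decode_header, which applies that rule to a bare
string.  The value below is a hypothetical stand-in for what a non-compliant
sender might put inside the quoted string; it is not the header from the
report:

```python
from email.header import decode_header, make_header

# Hypothetical non-compliant parameter value: two encoded words separated
# by a single space, as might appear inside filename="...".
value = "=?UTF-8?Q?Schulbesuchsbest=C3=A4ttigung?= =?UTF-8?Q?.pdf?="

# decode_header drops whitespace that sits between two encoded words,
# so the two pieces are joined without a space in between.
decoded = str(make_header(decode_header(value)))
print(decoded)
```

If the error recovery in get_bare_quoted_string or get_parameter kept that
space, the recovered filename would contain a stray blank before ".pdf".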

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39040>
_______________________________________