Abhilash Raj <raj.abhila...@gmail.com> added the comment:

I tried to take a look at the code to see where the fix needs to be, and I probably need some help.
I looked at the parse tree for the header and it looks something like this:

    ContentDisposition([
        Token([ValueTerminal('attachment')]),
        ValueTerminal(';'),
        MimeParameters([
            Parameter([
                Attribute([
                    CFWSList([WhiteSpaceTerminal(' ')]),
                    ValueTerminal('filename')]),
                ValueTerminal('='),
                Value([
                    QuotedString([
                        BareQuotedString([
                            EncodedWord([ValueTerminal('Schulbesuchsbestättigung.')]),
                            WhiteSpaceTerminal(' '),
                            EncodedWord([ValueTerminal('pdf')])])])])])])])

The offending piece of code, which seems to be working as designed, is get_bare_quoted_string() in email/_header_value_parser.py:

    while value and value[0] != '"':
        if value[0] in WSP:
            token, value = get_fws(value)
        elif value[:2] == '=?':
            try:
                token, value = get_encoded_word(value)
                bare_quoted_string.defects.append(errors.InvalidHeaderDefect(
                    "encoded word inside quoted string"))
            except errors.HeaderParseError:
                token, value = get_qcontent(value)
        else:
            token, value = get_qcontent(value)
        bare_quoted_string.append(token)

It just loops and parses the values one token at a time. We cannot ignore the FWS until we know that the atoms before and after the FWS are both encoded words.

I can't seem to find a clean way to look ahead (which could perhaps be used in get_parameters()) or look back (which could be used after parsing the entire bare_quoted_string?) in the parse tree to delete the offending whitespace. Any example of this kind of parse-tree manipulation in the code base would be awesome!

----------
versions: +Python 3.9 -Python 3.5, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39040>
_______________________________________
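
For illustration only, the look-back idea from the comment could be sketched roughly as below: once all tokens of the quoted string have been collected, drop any whitespace token that sits directly between two encoded words. This is not code from email/_header_value_parser.py. The Tok class, its kind labels, and drop_fws_between_encoded_words() are hypothetical stand-ins for the real token classes, assumed here only to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical stand-in for a parse-tree token (not the real email classes)."""
    kind: str  # assumed labels: 'encoded-word', 'fws', or 'qcontent'
    text: str


def drop_fws_between_encoded_words(tokens):
    """Return a new list without whitespace that sits directly between two encoded words."""
    result = []
    last = len(tokens) - 1
    for i, tok in enumerate(tokens):
        if (tok.kind == 'fws'
                and 0 < i < last
                and tokens[i - 1].kind == 'encoded-word'
                and tokens[i + 1].kind == 'encoded-word'):
            # Both neighbours are encoded words, so this whitespace is skipped.
            continue
        result.append(tok)
    return result


# Token sequence mirroring the BareQuotedString shown in the parse tree.
tokens = [
    Tok('encoded-word', 'Schulbesuchsbestättigung.'),
    Tok('fws', ' '),
    Tok('encoded-word', 'pdf'),
]
filename = ''.join(t.text for t in drop_fws_between_encoded_words(tokens))

# Whitespace next to ordinary quoted content is left alone.
mixed = [Tok('qcontent', 'my'), Tok('fws', ' '), Tok('encoded-word', 'file')]
mixed_text = ''.join(t.text for t in drop_fws_between_encoded_words(mixed))
```

Doing this as a pass over the finished token list avoids any look-ahead inside the parsing loop, at the cost of a second walk over the tokens. Whether the real parser should do it this way is an open question in this thread.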