R. David Murray <[email protected]> added the comment:
Here's a patch that makes the example work correctly. This is not a fix; a
real fix will be more complicated. This just demonstrates the kind of thing
that needs fixing, and where.
The existing parser produces a sub-optimal parse tree as its result: the parse
tree is hard to inspect and manipulate because there are so many special cases.
A good fix here would add a function that takes an existing TokenList and the
new token to add to that list, checks all the special cases, and does the
EWWhiteSpaceTerminal substitution when and as appropriate. This could then be
used in the unstructured parser as well as in Phrase, and some thought should
be given to where else it might be needed. It has been long enough since I've
held the RFCs in my head that I don't remember whether there is anywhere else.
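As a rough illustration of the kind of helper described above, here is a minimal, self-contained sketch. It does not import the real parser: Terminal, EWWhiteSpaceTerminal and TokenList below are simplified stand-ins, and append_token, atom and cfws are hypothetical names invented for this example. It covers only the one case the patch below handles (an atom ending in cfws/fws followed by another atom that has an encoded word) and, like the patch, it replaces the whole trailing cfws.

```python
class Terminal(str):
    """Simplified stand-in for a leaf token: a string with a token_type."""

    def __new__(cls, value, token_type):
        self = super().__new__(cls, value)
        self.token_type = token_type
        return self

    @property
    def value(self):
        return str(self)


class EWWhiteSpaceTerminal(Terminal):
    """Whitespace between encoded words; adds nothing to the decoded value."""

    @property
    def value(self):
        return ''


class TokenList(list):
    """Simplified stand-in for a parse-tree node."""

    token_type = None

    def __init__(self, tokens=(), token_type=None):
        super().__init__(tokens)
        if token_type is not None:
            self.token_type = token_type

    @property
    def has_encoded_word(self):
        return any(getattr(t, 'token_type', None) == 'encoded-word'
                   for t in self)

    @property
    def value(self):
        return ''.join(t.value for t in self)


def append_token(token_list, new_token):
    """Hypothetical helper: append new_token, doing the whitespace
    substitution first if both neighbours contain an encoded word."""
    prev = token_list[-1] if token_list else None
    if (getattr(new_token, 'has_encoded_word', False)
            and getattr(prev, 'has_encoded_word', False)
            and getattr(prev[-1], 'token_type', None) == 'cfws'
            and prev[-1]
            and getattr(prev[-1][-1], 'token_type', None) == 'fws'):
        # Same substitution as the patch, but guarded by checks, not asserts.
        prev[-1] = EWWhiteSpaceTerminal(prev[-1][-1], 'fws')
    token_list.append(new_token)


def atom(*tokens):
    return TokenList(tokens, 'atom')


def cfws(text):
    return TokenList([Terminal(text, 'fws')], 'cfws')


# Two adjacent encoded words: the whitespace between them is dropped.
encoded = TokenList(token_type='phrase')
append_token(encoded, atom(Terminal('foo', 'encoded-word'), cfws(' ')))
append_token(encoded, atom(Terminal('bar', 'encoded-word')))

# Two plain atoms: the whitespace is kept.
plain = TokenList(token_type='phrase')
append_token(plain, atom(Terminal('foo', 'atext'), cfws(' ')))
append_token(plain, atom(Terminal('bar', 'atext')))
```

With these stand-ins, encoded.value joins the two words with nothing between them, while plain.value keeps the space. The real special cases (comments inside the cfws, quoted strings, the unstructured parser) would still need to be worked out against the RFCs.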
I haven't looked at the actual character string, so I don't know whether we
also need to detect and post a defect about a split character, but we
don't *have* to answer that question to fix this.
diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index e805a75..d5d5986 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -199,6 +199,10 @@ class CFWSList(WhiteSpaceTokenList):
 class Atom(TokenList):
+    @property
+    def has_encoded_word(self):
+        return any(t.token_type=='encoded-word' for t in self)
+
     token_type = 'atom'
@@ -1382,6 +1386,12 @@ def get_phrase(value):
                         "comment found without atom"))
                 else:
                     raise
+            if token.has_encoded_word:
+                assert phrase[-1].token_type == 'atom', phrase[-1]
+                assert phrase[-1][-1].token_type == 'cfws'
+                assert phrase[-1][-1][-1].token_type == 'fws'
+                if phrase[-1].has_encoded_word:
+                    phrase[-1][-1] = EWWhiteSpaceTerminal(phrase[-1][-1][-1], 'fws')
             phrase.append(token)
     return phrase, value
----------
______________________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35547>
______________________________________________