New submission from Tim Bell:

According to RFC 5322, an email address like this isn't valid:

u...@example.com <u...@example.com>

(The display-name "u...@example.com" contains "@", which isn't in the set of 
atext characters used to form an atom.)
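
For reference, the atext rule can be sketched as a small check. This is only an illustration of the RFC 5322 character set, not how the email package tokenizes a phrase; the helper name is_atom and the sample strings are hypothetical.

```python
import string

# atext per RFC 5322 section 3.2.3: letters, digits and these specials.
# Note that neither "@" nor "." is included.
ATEXT = frozenset(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_atom(text):
    """Return True if text is non-empty and made only of atext characters.

    Hypothetical helper for illustration; it ignores surrounding
    CFWS (comments and folding whitespace) and quoted-strings.
    """
    return bool(text) and all(ch in ATEXT for ch in text)


print(is_atom("Someone"))              # plain word: only atext characters
print(is_atom("someone@example.com"))  # contains "@" and ".", so not an atom
```

A display-name that fails this check would need to be written as a quoted-string to be valid.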

How it's handled by the email package varies by policy:

>>> import email
>>> from email.policy import default
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>')['to']
'u...@example.com <u...@example.com>'
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>', policy=default)['to']
'u...@example.com'
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com>', policy=default).defects
[]

The difference between the behaviour under the compat32 vs "default" policy may 
or may not be significant.

However, when combined with a second invalid feature, namely a trailing space 
after the ">", accessing the header under the "default" policy raises an 
exception:

>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ')['to']
'u...@example.com <u...@example.com> '
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ', policy=default)['to']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/policy.py", line 162, in header_fetch_parse
    return self.header_factory(name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 586, in __call__
    return self[name](name, value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 337, in parse
    kwds['parse_tree'] = address_list = cls.value_parser(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/headerregistry.py", line 328, in value_parser
    address_list, value = parser.get_address_list(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2368, in get_address_list
    token, value = get_invalid_mailbox(value, ',')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 2166, in get_invalid_mailbox
    token, value = get_phrase(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1770, in get_phrase
    token, value = get_word(value)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/email/_header_value_parser.py", line 1745, in get_word
    if value[0]=='"':
IndexError: string index out of range
>>> email.message_from_bytes(b'To: u...@example.com <u...@example.com> ', policy=default).defects
[]

I believe that the preferred behaviour would be to add a defect to the message 
object during parsing instead of raising an exception when the invalid header 
value is accessed.
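
Until the parser behaves that way, calling code could guard the header access itself. The following is a minimal sketch of such a workaround, not a fix to the email package: the helper name safe_get_header and the sample address are hypothetical, and catching Exception broadly is an assumption made so the sketch also covers errors other than the IndexError shown above.

```python
import email
from email import policy


def safe_get_header(raw, name):
    """Return (value, error) for header `name` in the raw message bytes.

    Hypothetical helper: try the "default" policy first and fall back to
    compat32, which returns the header text without parsing it as an
    address list.
    """
    msg = email.message_from_bytes(raw, policy=policy.default)
    try:
        value = msg[name]  # header is parsed here, on access
        return (None if value is None else str(value)), None
    except Exception as exc:  # assumption: treat any parse failure alike
        legacy = email.message_from_bytes(raw, policy=policy.compat32)
        value = legacy[name]
        return (None if value is None else str(value)), exc


# Sample header with "@" in the display name and a trailing space.
value, error = safe_get_header(
    b'To: someone@example.com <someone@example.com> ', 'to')
print(repr(value), repr(error))
```

Whether the fallback branch is taken depends on the Python version in use, so the caller should check the returned error rather than assume either outcome.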

----------
components: email
messages: 296309
nosy: barry, r.david.murray, timb07
priority: normal
severity: normal
status: open
title: Exception parsing certain invalid email address headers
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30701>
_______________________________________