[Python-checkins] gh-143935: Email preserve parens when folding comments (#143936)

Yhg1s Mon, 19 Jan 2026 04:38:57 -0800

https://github.com/python/cpython/commit/17d1490aa97bd6b98a42b1a9b324ead84e7fd8a2
commit: 17d1490aa97bd6b98a42b1a9b324ead84e7fd8a2
branch: main
author: Seth Michael Larson <[email protected]>
committer: Yhg1s <[email protected]>
date: 2026-01-19T12:38:22Z
summary:


gh-143935: Email preserve parens when folding comments (#143936)

Fix a bug in the folding of comments when flattening an email message
using a modern email policy. Comments consisting of a very long sequence of
non-foldable characters could trigger a forced line wrap that omitted the
required leading space on the continuation line, causing the remainder of
the comment to be interpreted as a new header field. This enabled header
injection with carefully crafted inputs.

Co-authored-by: Denis Ledoux <[email protected]>
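
The failure described above comes down to one invariant: in a folded header, every physical line after the first must begin with a space or tab, otherwise a parser reads it as the start of a new field. The sketch below checks that invariant on a folded string. `unindented_continuations` is a hypothetical helper name and `user@example.com` is a placeholder address; neither is part of this commit. What the `fold()` call prints depends on whether the running interpreter includes this fix, so no expected output is shown.

```python
from email import policy


def unindented_continuations(folded, linesep="\n"):
    """Return continuation lines of ONE folded header that lack leading whitespace.

    Hypothetical helper for illustration only; it is not part of CPython.
    A non-empty result means the tail of the header would be parsed as a
    separate header field.
    """
    lines = folded.split(linesep)
    # A folded header ends with linesep, which leaves a trailing empty string.
    if lines and lines[-1] == "":
        lines.pop()
    # The first line carries the field name; every later line must be indented.
    return [line for line in lines[1:] if not line.startswith((" ", "\t"))]


if __name__ == "__main__":
    # Same narrow line length as the test added in this commit.
    narrow = policy.default.clone(max_line_length=40)
    # Placeholder address followed by a comment that is longer than one line.
    value = "<user@example.com>(loremipsum dolorsitametconsecteturadipi)"
    folded = narrow.fold("To", value)
    print(repr(folded))
    print(unindented_continuations(folded))
```

An empty list from the helper means every continuation line is indented as required.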

files:
A Misc/NEWS.d/next/Security/2026-01-16-14-40-31.gh-issue-143935.U2YtKl.rst
M Lib/email/_header_value_parser.py
M Lib/test/test_email/test__header_value_parser.py

diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index 46fef2048babe7..172f9ef9e5f096 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -101,6 +101,12 @@ def make_quoted_pairs(value):
     return str(value).replace('\\', '\\\\').replace('"', '\\"')
 
 
+def make_parenthesis_pairs(value):
+    """Escape parenthesis and backslash for use within a comment."""
+    return str(value).replace('\\', '\\\\') \
+        .replace('(', '\\(').replace(')', '\\)')
+
+
 def quote_string(value):
     escaped = make_quoted_pairs(value)
     return f'"{escaped}"'
@@ -943,7 +949,7 @@ def value(self):
         return ' '
 
     def startswith_fws(self):
-        return True
+        return self and self[0] in WSP
 
 
 class ValueTerminal(Terminal):
@@ -2963,6 +2969,13 @@ def _refold_parse_tree(parse_tree, *, policy):
                     [ValueTerminal(make_quoted_pairs(p), 'ptext')
                      for p in newparts] +
                     [ValueTerminal('"', 'ptext')])
+            if part.token_type == 'comment':
+                newparts = (
+                    [ValueTerminal('(', 'ptext')] +
+                    [ValueTerminal(make_parenthesis_pairs(p), 'ptext')
+                     if p.token_type == 'ptext' else p
+                     for p in newparts] +
+                    [ValueTerminal(')', 'ptext')])
             if not part.as_ew_allowed:
                 wrap_as_ew_blocked += 1
                 newparts.append(end_ew_not_allowed)
diff --git a/Lib/test/test_email/test__header_value_parser.py b/Lib/test/test_email/test__header_value_parser.py
index 426ec4644e3096..e28fe3892015b9 100644
--- a/Lib/test/test_email/test__header_value_parser.py
+++ b/Lib/test/test_email/test__header_value_parser.py
@@ -3294,6 +3294,29 @@ def test_address_list_with_specials_in_long_quoted_string(self):
             with self.subTest(to=to):
                 self._test(parser.get_address_list(to)[0], folded, policy=policy)
 
+    def test_address_list_with_long_unwrapable_comment(self):
+        policy = self.policy.clone(max_line_length=40)
+        cases = [
+            # (to, folded)
+            ('(loremipsumdolorsitametconsecteturadipi)<[email protected]>',
+             '(loremipsumdolorsitametconsecteturadipi)<[email protected]>\n'),
+            ('<[email protected]>(loremipsumdolorsitametconsecteturadipi)',
+             '<[email protected]>(loremipsumdolorsitametconsecteturadipi)\n'),
+            ('(loremipsum dolorsitametconsecteturadipi)<[email protected]>',
+             '(loremipsum dolorsitametconsecteturadipi)<[email protected]>\n'),
+             ('<[email protected]>(loremipsum dolorsitametconsecteturadipi)',
+             '<[email protected]>(loremipsum\n dolorsitametconsecteturadipi)\n'),
+            ('(Escaped \\( \\) chars \\\\ in comments stay escaped)<[email protected]>',
+             '(Escaped \\( \\) chars \\\\ in comments stay\n escaped)<[email protected]>\n'),
+            ('((loremipsum)(loremipsum)(loremipsum)(loremipsum))<[email protected]>',
+             '((loremipsum)(loremipsum)(loremipsum)(loremipsum))<[email protected]>\n'),
+            ('((loremipsum)(loremipsum)(loremipsum) (loremipsum))<[email protected]>',
+             '((loremipsum)(loremipsum)(loremipsum)\n (loremipsum))<[email protected]>\n'),
+        ]
+        for (to, folded) in cases:
+            with self.subTest(to=to):
+                self._test(parser.get_address_list(to)[0], folded, policy=policy)
+
     # XXX Need tests with comments on various sides of a unicode token,
     # and with unicode tokens in the comments.  Spaces inside the quotes
     # currently don't do the right thing.
diff --git a/Misc/NEWS.d/next/Security/2026-01-16-14-40-31.gh-issue-143935.U2YtKl.rst b/Misc/NEWS.d/next/Security/2026-01-16-14-40-31.gh-issue-143935.U2YtKl.rst
new file mode 100644
index 00000000000000..c3d864936884ac
--- /dev/null
+++ b/Misc/NEWS.d/next/Security/2026-01-16-14-40-31.gh-issue-143935.U2YtKl.rst
@@ -0,0 +1,6 @@
+Fixed a bug in the folding of comments when flattening an email message
+using a modern email policy. Comments consisting of a very long sequence of
+non-foldable characters could trigger a forced line wrap that omitted the
+required leading space on the continuation line, causing the remainder of
+the comment to be interpreted as a new header field. This enabled header
+injection with carefully crafted inputs.
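
The escaping step that the diff adds in `make_parenthesis_pairs` can be tried in isolation. Below is a standalone copy of that helper, reformatted with parentheses instead of a backslash continuation, and applied to a made-up sample string. It only illustrates the escaping order; it does not exercise the folding logic in `_refold_parse_tree`.

```python
def make_parenthesis_pairs(value):
    """Escape parenthesis and backslash for use within a comment."""
    # Backslashes are escaped first so that the backslashes introduced
    # for '(' and ')' in the next two steps are not doubled again.
    return (str(value)
            .replace('\\', '\\\\')
            .replace('(', '\\(')
            .replace(')', '\\)'))


# Made-up sample: one parenthesised word and one literal backslash.
sample = 'a (nested) \\ comment'
escaped = make_parenthesis_pairs(sample)
print(escaped)  # prints: a \(nested\) \\ comment
```

Reversing the order of the replacements would escape the newly added backslashes a second time, which is why the backslash replacement comes first.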

_______________________________________________
Python-checkins mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/python-checkins.python.org
Member address: [email protected]
