Fwd: report 1/3: buffer under-read in rfc2047.c lwslen()

Kevin J. McCarthy Wed, 17 Jun 2026 17:09:53 -0700

----- Forwarded message from Acts1631 <[email protected]> -----

Date: Wed, 17 Jun 2026 20:15:06 +0000
From: Acts1631 <[email protected]>
To: "[email protected]" <[email protected]>
Subject: report 1/3: buffer under-read in rfc2047.c lwslen()
X-Spam-score: 0.0
X-Delivered-to: [email protected]
Message-ID: <KB1oAIhwdCCndxRC7kVDWhJwXFMiK_E4A1Rr0WtQC3oDqkYPf6QD03ytEi_v2em8t8G3YKXEM_U5PYaYGijG4BPSorlrytsZh79CIJT6t7U=@proton.me>

Hi again sir. Thanks for taking my earlier bug. I have 3 more for you to 
consider. Each one has a proposed patch and reproducer. Here's the first one.

The lwslen() helper calculates the length of linear whitespace at the beginning 
of a string. It scans until the first non-whitespace character, then evaluates 
*(p - 1):

  for (; p < s + n; p++)
    if (!strchr(" \t\r\n", *p))
    {
      len = (size_t)(p - s);
      break;
    }
  if (strchr("\r\n", *(p-1)))
    len = (size_t)0;

If the string starts with a non-whitespace character, the loop breaks on its 
first iteration with p == s and len == 0. The subsequent *(p - 1) read is then 
s[-1], one byte before the start of the buffer.

This is reachable through rfc2047_decode() when ignore_linear_white_space is 
enabled and lwslen() is called on text following an encoded word.

Example trigger subject:
  Subject: =?utf-8?Q?foo?=bar

After decoding the encoded word, rfc2047_decode() sees the trailing "bar", sets 
found_encoded, and calls lwslen("bar", 3). The first character is not 
whitespace, so the under-read occurs in the unpatched code.

Proposed fix: Guard the final look-behind with len > 0:

  if (len > 0 && strchr("\r\n", *(p-1)))
    len = (size_t)0;

When len > 0, p has advanced beyond s, so p - 1 is guaranteed to be in bounds. 
When len == 0, the CR/LF check cannot change the return value and can be safely 
skipped.

----- End forwarded message -----

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA

diff --git a/rfc2047.c b/rfc2047.c
index d72ae997..2bdd8553 100644
--- a/rfc2047.c
+++ b/rfc2047.c
@@ -815,7 +815,7 @@ static size_t lwslen(const char *s, size_t n)
       len = (size_t)(p - s);
       break;
     }
-  if (strchr("\r\n", *(p-1))) /* LWS doesn't end with CRLF */
+  if (len > 0 && strchr("\r\n", *(p-1))) /* LWS doesn't end with CRLF */
     len = (size_t)0;
   return len;
 }
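
For checking the guarded look-behind outside of mutt, the patched helper can be
sketched as a standalone function. The loop and the final test are taken from
the report and the diff above. The parts of the function that the diff does not
show are assumptions: the initial len = n, the early return for n == 0, and the
names lwslen_patched and lwslen_selfcheck, which are hypothetical and not part
of mutt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standalone sketch of the patched helper. The scan loop and the guarded
 * look-behind come from the report; the initial len = n and the n == 0
 * early return are assumptions about the part of the function not shown. */
size_t lwslen_patched(const char *s, size_t n)
{
  const char *p = s;
  size_t len = n;          /* assumed default: the whole span is whitespace */

  if (n == 0)              /* assumed guard: nothing to scan */
    return 0;

  for (; p < s + n; p++)
    if (!strchr(" \t\r\n", *p))
    {
      len = (size_t)(p - s);
      break;
    }
  /* len > 0 implies p > s, so *(p-1) stays inside [s, s + n). */
  if (len > 0 && strchr("\r\n", *(p-1))) /* LWS doesn't end with CRLF */
    len = (size_t)0;
  return len;
}

/* Hypothetical self-check, not part of mutt. */
void lwslen_selfcheck(void)
{
  assert(lwslen_patched("bar", 3) == 0);      /* the reported trigger */
  assert(lwslen_patched("  bar", 5) == 2);    /* plain leading blanks */
  assert(lwslen_patched(" \r\nbar", 6) == 0); /* whitespace ending in LF */
  assert(lwslen_patched("\r\n bar", 6) == 3); /* CRLF followed by a blank */
  assert(lwslen_patched("", 0) == 0);         /* empty input */
}
```

The "bar" case returns 0 with or without the guard. The patch changes which
memory is read, not the result, so only a build with AddressSanitizer or a
similar checker shows a difference on the unpatched code.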

#!/usr/bin/env python3
"""
Reproducer for Bug 1: rfc2047.c lwslen() buffer under-read.

This emits an mbox message whose Subject has trailing non-whitespace text
after an RFC2047 encoded word:

    Subject: =?utf-8?Q?foo?=bar

In rfc2047_decode(), once an encoded word has been found,
ignore_linear_white_space makes mutt call lwslen() on the trailing "bar".
The first byte is not whitespace, so the unpatched lwslen() reads one byte
before the buffer via *(p - 1).

Example:

    python3 reproducer_lwslen.py > /tmp/lwslen.mbox
    mutt -e 'set ignore_linear_white_space' -f /tmp/lwslen.mbox

Use an AddressSanitizer build to make the one-byte under-read visible.
"""

email = """\
From [email protected] Wed Jun 17 00:00:00 2026
From: [email protected]
To: [email protected]
Subject: =?utf-8?Q?foo?=bar
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

This message triggers lwslen() on trailing non-whitespace text when
ignore_linear_white_space is enabled.
"""


if __name__ == "__main__":
    print(email)
