Re: Odd character, was Re: buster, ekiga.
On Tue 23 Jul 2019 at 15:31:12 (-0400), Michael Stone wrote:
> On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> > I don't see any NUL characters, but x80 as shown below. I'm reading
> > the cached message that mutt downloaded from an IMAP server. Is that
> > different from you?
>
> I see it as x80 in mutt and x00 in the raw file on the imap server. I
> assume mutt is trying to defang the nul, similar to java's conversion
> to 0xc0 0x80, but I haven't actually looked through the code to
> confirm.

I don't think mutt is doing that. I downloaded a message directly from
my hosting service's IMAP server¹ and that shows <80>, not <00>, just
as mutt does.

My experience with mutt is that if a NUL is sent in a "legitimate"²
manner within an email, it causes truncation. I don't know whether mutt
does it or the pager, but as I said elsewhere it doesn't make me happy.

I'm not sure whether I can get any "closer" to my IMAP server than
that, in order to find whether there's a NUL there; perhaps by logging
in using my credentials? That would require some research as I don't
normally access the service in that way.

One thing we don't know is whether the routes being used by the MTAs to
communicate with each other are 8-bit transparent or not. As pointed
out by tomás, <80> and <00> only differ in the top bit.

> > So it would appear the OP has pasted the Unicode "RIGHT-POINTING
> > MAGNIFYING GLASS" character into their postings, which seems somewhat
> > reasonable as it's used on the Debian web pages to mark all the
> > Message-IDs and references thereto.
> >
> > Where that gets mangled along the way, I can't guess, but it would seem
> > that 0x80 is a reasonable choice as that's a Latin-1 Control Character
> > with the meaning PAD.
> > https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)
>
> I'm not entirely surprised that an MUA that is unaware of the changes
> to internet mail that have happened since the early 80s (codified back
> in 2001) is also unaware of unicode.

My last paragraph wasn't necessarily limited to the behaviour of the
OP's MUA. It's likely the MTAs are more up-to-date than what is alleged
to be a very old MUA.

¹ $ curl --url 'imaps://my-hosting-service:993/INBOX;UID=1234' --user 'my-username:my-password' -o Documents/raw-message
² e.g. as =00 in a quoted-printable encoded message.

Cheers,
David.
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 09:35:59PM +0200, to...@tuxteam.de wrote:
> On Tue, Jul 23, 2019 at 03:31:12PM -0400, Michael Stone wrote:
> > On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> > > I don't see any NUL characters, but x80 as shown below. I'm reading
> > > the cached message that mutt downloaded from an IMAP server. Is that
> > > different from you?
> >
> > I see it as x80 in mutt and x00 in the raw file on the imap server. I
> > assume mutt is trying to defang the nul, similar to java's conversion
> > to 0xc0 0x80, but I haven't actually looked through the code to
> > confirm.
>
> Heh. that is strange: with mutt ("edit raw") I do see an x00 (shown by
> vim as ^@). My message doesn't go through an IMAP server, fwiw. Dunno
> what exim does to it, though :-)

I guess it's the imapd defanging then, before it gets to mutt.
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 03:38:33PM -0400, Greg Wooledge wrote:
> On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> > On Tue 23 Jul 2019 at 11:07:37 (-0400), Greg Wooledge wrote:
> > > Yup. Two NUL bytes in the body of the message. How completely bizarre.
> > >
> > > Apparently what mutt does is truncate that *line* at the first NUL
> > > byte, but then show all the other lines after that just fine.
>
> > I don't see any NUL characters, but x80 as shown below. I'm reading
> > the cached message that mutt downloaded from an IMAP server. Is that
> > different from you?
>
> In my case, the email is sent first to a Debian 9 system running qmail +
> magic-smtpd [...]

> I'll try to remember to keep a copy of the next one for hex-dumping.

Looking forward :-)

> Meanwhile, as a test, I ran the following from my home system outside
> the workplace firewall: [...]

Interesting. Note that both your tests have a Content-Transfer-Encoding
(first: 8bit; second: quoted-printable). The first one gets a slight
indigestion, the second does not.

The original messages have no Content-Type, much less a
Content-Transfer-Encoding (so x00 and x80 should both be a no-no
anyway).

Cheers
-- t
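To make tomás's point concrete, here is a minimal Python sketch that checks a raw message the way he describes: it reports whether a Content-Transfer-Encoding header is present and lists body octets that 7bit data may not contain (NUL, or anything above 0x7F, per RFC 2045). The function name `inspect_message` and the sample message are hypothetical; the sample only imitates the x80 bytes from the dump and is not the OP's actual mail.

```python
from email import message_from_bytes


def inspect_message(raw: bytes):
    """Return (declared CTE, content type, offending body octets).

    Assumption: "offending" means not allowed in 7bit data,
    i.e. NUL or any octet above 0x7F.
    """
    msg = message_from_bytes(raw)
    declared = msg["Content-Transfer-Encoding"]  # None when the header is absent
    # The body starts after the first blank line.
    sep = raw.find(b"\n\n")
    body_start = sep + 2 if sep != -1 else len(raw)
    bad = [
        (offset, octet)
        for offset, octet in enumerate(raw[body_start:], body_start)
        if octet == 0 or octet > 0x7F
    ]
    return declared, msg.get_content_type(), bad


# Hypothetical sample: no Content-Type, no Content-Transfer-Encoding.
sample = (
    b"From: someone@example.invalid\n"
    b"Subject: test\n"
    b"\n"
    b"* From: Brad Rogers \x80brad\x80\n"
)
declared, ctype, bad = inspect_message(sample)
print(declared, ctype, [hex(octet) for _, octet in bad])
```

With no headers declaring otherwise, the parser falls back to text/plain, and both x80 octets are flagged.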
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> On Tue 23 Jul 2019 at 11:07:37 (-0400), Greg Wooledge wrote:
> > Yup. Two NUL bytes in the body of the message. How completely bizarre.
> >
> > Apparently what mutt does is truncate that *line* at the first NUL
> > byte, but then show all the other lines after that just fine.

> I don't see any NUL characters, but x80 as shown below. I'm reading
> the cached message that mutt downloaded from an IMAP server. Is that
> different from you?

In my case, the email is sent first to a Debian 9 system running qmail +
magic-smtpd (with possible interference from corporate firewall products
over which I have no control), and from there to my Debian 10 desktop
system, also running qmail, with qmail-smtpd as the receiver. Mail is
delivered locally on the Debian 10 system to a Maildir in my home
directory, and mutt reads it directly from there. No IMAP or POP3 for me.

I'll try to remember to keep a copy of the next one for hex-dumping.

Meanwhile, as a test, I ran the following from my home system outside
the workplace firewall:

    printf 'Testing \0nul\0\nDid it work?\n' | mailx -s test wool...@eeg.ccf.org

Here's what hd shows (last few lines only):

0370 38 22 0a 43 6f 6e 74 65 6e 74 2d 54 72 61 6e 73 |8".Content-Trans|
0380 66 65 72 2d 45 6e 63 6f 64 69 6e 67 3a 20 38 62 |fer-Encoding: 8b|
0390 69 74 0a 0a 54 65 73 74 69 6e 67 20 0a 44 69 64 |it..Testing .Did|
03a0 20 69 74 20 77 6f 72 6b 3f 0a                   | it work?.|
03aa

Which probably means mailx on my sender truncates the line with the raw
NUL bytes, and the test is inconclusive.

So, next test:

    printf 'Test two \0nul\0\nDid it work?\n' | mutt -s test wool...@eeg.ccf.org

Here's what I got:

03b0 73 66 65 72 2d 45 6e 63 6f 64 69 6e 67 3a 20 71 |sfer-Encoding: q|
03c0 75 6f 74 65 64 2d 70 72 69 6e 74 61 62 6c 65 0a |uoted-printable.|
03d0 58 2d 4f 70 65 72 61 74 69 6e 67 2d 53 79 73 74 |X-Operating-Syst|
03e0 65 6d 3a 20 4c 69 6e 75 78 20 34 2e 31 39 2e 30 |em: Linux 4.19.0|
03f0 2d 35 2d 61 6d 64 36 34 0a 55 73 65 72 2d 41 67 |-5-amd64.User-Ag|
0400 65 6e 74 3a 20 4d 75 74 74 2f 31 2e 31 30 2e 31 |ent: Mutt/1.10.1|
0410 20 28 32 30 31 38 2d 30 37 2d 31 33 29 0a 0a 54 | (2018-07-13)..T|
0420 65 73 74 20 74 77 6f 20 3d 30 30 6e 75 6c 3d 30 |est two =00nul=0|
0430 30 0a 44 69 64 20 69 74 20 77 6f 72 6b 3f 0a    |0.Did it work?.|
043f

... well, that's self-explanatory, isn't it.

I don't feel like writing a script to send raw NUL bytes through
/usr/sbin/sendmail or through netcat mxhost 25 at this time, so I'll
just leave it at that.
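For readers who want to see why the second dump is "self-explanatory", here is a small Python sketch that takes the body bytes from that dump (offsets 041f to 043e) and runs them through the standard library's quoted-printable decoder. The variable names are illustrative only; nothing here reproduces what mutt or qmail do internally.

```python
import quopri

# Body bytes copied from the second hex dump above (offsets 041f to 043e).
dumped = bytes.fromhex(
    "54 65 73 74 20 74 77 6f 20 3d 30 30 6e 75 6c 3d 30 30 0a "
    "44 69 64 20 69 74 20 77 6f 72 6b 3f 0a"
)

# On the wire the NULs travel as the three ASCII characters "=00".
print(dumped)

# Decoding quoted-printable restores the two raw NUL bytes.
decoded = quopri.decodestring(dumped)
print(decoded)
```

So the quoted-printable message carries the NULs safely as plain ASCII, and only the receiving MUA turns them back into raw x00 bytes.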
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 03:31:12PM -0400, Michael Stone wrote:
> On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> >I don't see any NUL characters, but x80 as shown below. I'm reading
> >the cached message that mutt downloaded from an IMAP server. Is that
> >different from you?
>
> I see it as x80 in mutt and x00 in the raw file on the imap server.
> I assume mutt is trying to defang the nul, similar to java's
> conversion to 0xc0 0x80, but I haven't actually looked through the
> code to confirm.

Heh. that is strange: with mutt ("edit raw") I do see an x00 (shown by
vim as ^@). My message doesn't go through an IMAP server, fwiw. Dunno
what exim does to it, though :-)

Cheers
-- t
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:
> I don't see any NUL characters, but x80 as shown below. I'm reading
> the cached message that mutt downloaded from an IMAP server. Is that
> different from you?

I see it as x80 in mutt and x00 in the raw file on the imap server. I
assume mutt is trying to defang the nul, similar to java's conversion to
0xc0 0x80, but I haven't actually looked through the code to confirm.

> So it would appear the OP has pasted the Unicode "RIGHT-POINTING
> MAGNIFYING GLASS" character into their postings, which seems somewhat
> reasonable as it's used on the Debian web pages to mark all the
> Message-IDs and references thereto.
>
> Where that gets mangled along the way, I can't guess, but it would seem
> that 0x80 is a reasonable choice as that's a Latin-1 Control Character
> with the meaning PAD.
> https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)

I'm not entirely surprised that an MUA that is unaware of the changes to
internet mail that have happened since the early 80s (codified back in
2001) is also unaware of unicode.
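The Java convention mentioned above (often called "modified UTF-8") writes NUL as the two-byte sequence 0xC0 0x80 so that no raw zero byte appears in the output. A minimal Python sketch of just that substitution follows; `defang_nul` is a hypothetical helper name, and this only illustrates the convention itself. It is not a claim about what mutt or any IMAP server actually does.

```python
def defang_nul(data: bytes) -> bytes:
    """Replace each NUL with the overlong two-byte form 0xC0 0x80."""
    return data.replace(b"\x00", b"\xc0\x80")


sample = b"Brad Rogers \x00brad\x00"
encoded = defang_nul(sample)
print(encoded.hex(" "))

# A strict UTF-8 decoder rejects the overlong form, so the result is
# deliberately not standard UTF-8.
try:
    b"\xc0\x80".decode("utf-8")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
print(strict_rejects)
```

Note that this convention produces a 0xC0 byte in front of every 0x80, whereas the dump in this thread shows a lone 0x80.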
Re: Odd character, was Re: buster, ekiga.
On Tue, Jul 23, 2019 at 02:19:27PM -0500, David Wright wrote:

[...]

> I don't see any NUL characters, but x80 as shown below [...]

Oh, that's cute :-)

If I followed along correctly, the questionable mails have neither
Content-Type nor Content-Transfer-Encoding. So the content type
defaults to text/plain; charset=us-ascii, right?

If you kill the high bit in x80 you're left with x00.

Hmmm...

Cheers
-- tomás
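The bit arithmetic behind that observation can be sketched in a few lines of Python. The helper `strip_high_bit` is hypothetical and simply mimics a transport that is not 8-bit clean; whether any hop in this thread actually behaves that way is not established.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every octet, as a 7-bit-only channel would."""
    return bytes(octet & 0x7F for octet in data)


# The affected line, using the bytes from David's dump.
line = b"Brad Rogers \x80brad@fineby.me.uk\x80\n"
stripped = strip_high_bit(line)
print(stripped)

# 0x80 is 1000 0000 in binary: clearing the top bit leaves 0x00.
print(hex(0x80 & 0x7F))
```

Every other character on the line is plain ASCII and passes through unchanged, which would explain why only those two positions differ between readers.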
Odd character, was Re: buster, ekiga.
On Tue 23 Jul 2019 at 11:07:37 (-0400), Greg Wooledge wrote:
> On Tue, Jul 23, 2019 at 07:41:20AM -0700, pe...@easthope.ca wrote:
> > * From: Brad Rogers
>
> Oh, it's this guy again.
>
> /me looks at the raw mail message with less(1)
>
> * From: Brad Rogers ^@b...@fineby.me.uk^@
>
> Yup. Two NUL bytes in the body of the message. How completely bizarre.
>
> Apparently what mutt does is truncate that *line* at the first NUL
> byte, but then show all the other lines after that just fine.
>
> Other people are seeing the entire message truncated at that point, not
> just one line truncated.
>
> Peter, whatever you're doing with your outgoing mail is really strange,
> and if possible, you should try to stop it. Embedding raw NUL characters
> in the body of an email is a problem.

I don't see any NUL characters, but x80 as shown below. I'm reading
the cached message that mutt downloaded from an IMAP server. Is that
different from you?

17C0 64 2D 73 65 │ 61 72 63 68 │ 2F 45 31 68 │ 70 76 79 69 │ 2D 30 30 30  d-search/E1hpvyi-000
17D4 31 6E 78 2D │ 4B 6C 40 64 │ 61 6C 74 6F │ 6E 2E 69 6E │ 76 61 6C 69  1nx-Kl@dalton.invali
17E8 64 0A 52 65 │ 73 65 6E 74 │ 2D 44 61 74 │ 65 3A 20 54 │ 75 65 2C 20  d.Resent-Date: Tue, 
17FC 32 33 20 4A │ 75 6C 20 32 │ 30 31 39 20 │ 31 34 3A 35 │ 37 3A 32 30  23 Jul 2019 14:57:20
1810 20 2B 30 30 │ 30 30 20 28 │ 55 54 43 29 │ 0A 0A 2A 09 │ 46 72 6F 6D   +0000 (UTC)..*.From
1824 3A 20 42 72 │ 61 64 20 52 │ 6F 67 65 72 │ 73 20 80 62 │ 72 61 64 40  : Brad Rogers .brad@
1838 66 69 6E 65 │ 62 79 2E 6D │ 65 2E 75 6B │ 80 0A 2A 09 │ 44 61 74 65  fineby.me.uk..*.Date
184C 3A 20 46 72 │ 69 2C 20 31 │ 39 20 4A 75 │ 6C 20 32 30 │ 31 39 20 31  : Fri, 19 Jul 2019 1
1860 39 3A 33 32 │ 3A 34 36 20 │ 2B 30 31 30 │ 30 0A 3E 20 │ 49 74 20 77  9:32:46 +0100.> It w
1874 61 73 20 72 │ 65 70 6C 61 │ 63 65 64 20 │ 62 79 20 45 │ 6D 70 61 74  as replaced by Empat
1888 68 79 2E 0A │ 0A 54 68 61 │ 6E 6B 73 2E │ 20 20 45 6D │ 70 61 74 68  hy...Thanks.  Empath

Well, here's what I think is going on. The OP wrote

"The links are from the debian mailing list software.
128270(9) = 1F50E(E) or 128270(decimal) = 1F50E(hexadecimal).
U+1F50E is beyond the list in …"

So it would appear the OP has pasted the Unicode "RIGHT-POINTING
MAGNIFYING GLASS" character into their postings, which seems somewhat
reasonable as it's used on the Debian web pages to mark all the
Message-IDs and references thereto.

Where that gets mangled along the way, I can't guess, but it would seem
that 0x80 is a reasonable choice as that's a Latin-1 Control Character
with the meaning PAD.
https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)

Converting it to NUL seems hazardous to me, almost asking for trouble.

Cheers,
David.
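The OP's arithmetic is easy to check. A short Python sketch, using only the standard library, confirms the decimal/hexadecimal conversion and the character name, and also shows how that code point looks when encoded as UTF-8 (the variable names are illustrative):

```python
import unicodedata

cp = 128270                   # decimal value quoted by the OP
ch = chr(cp)

hex_form = f"U+{cp:04X}"      # hexadecimal form of the code point
name = unicodedata.name(ch)   # official Unicode character name
utf8 = ch.encode("utf-8")     # bytes a UTF-8-aware sender would emit

print(hex_form, name, utf8.hex(" "))
```

The UTF-8 form is the four bytes f0 9f 94 8e, which contains neither x00 nor x80, so the conversion alone does not account for the single x80 octets seen in the dump; some further mangling must happen along the way.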