Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Hi Andy,

> EXMH must be lenient when it comes to missing MIME part markers (maybe
> it just assumes end-of-file is good enough).

mhstore(1) fares badly too.

    $ mhstore
    mhstore: bogus multipart content in message 20593
    storing message 20593 part 1 as file 20593.1
    $ echo $?
    0

-- 
Cheers, Ralph.

-- 
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
>The entire size of the message on disk (including additional trace
>headers added by my MTA) is 11,374,046 while the size of the offending
>line is 11,370,773. That means that the rest of the message headers and
>text/plain part of the message occupy 3,273 bytes.

It occurs to me that allocating 11 MB shouldn't be a problem on any
modern system. But really, this isn't necessary; once inc(1) parses the
headers it doesn't care about the content. It could just go in a loop
and read data and write it out. All it REALLY cares about is converting
\r\n to \n.

--Ken
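The loop Ken describes can be sketched as follows. This is an illustrative Python sketch, not nmh's C code: the function name `copy_crlf_to_lf` and the 32768-byte default chunk size are assumptions chosen for the example. It copies a stream in fixed-size chunks and rewrites CRLF as LF, so memory use stays bounded however long a single line is.

```python
import io


def copy_crlf_to_lf(src, dst, chunk_size=32768):
    """Copy binary stream src to dst, rewriting CRLF as LF.

    Hypothetical helper for illustration only. Memory use is bounded by
    chunk_size, independent of the length of any line in the input.
    """
    pending_cr = False  # True if the previous chunk ended in a CR
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            # Re-attach the CR held back from the previous chunk.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # Hold the CR back: its LF may be the first byte of the next chunk.
            chunk = chunk[:-1]
            pending_cr = True
        dst.write(chunk.replace(b"\r\n", b"\n"))
    if pending_cr:
        dst.write(b"\r")  # a lone CR at end of input is kept unchanged


if __name__ == "__main__":
    body = b"Subject: test\r\n\r\n" + b"A" * 100000 + b"\r\n"
    out = io.BytesIO()
    copy_crlf_to_lf(io.BytesIO(body), out)
    print(len(body), "bytes in,", len(out.getvalue()), "bytes out")
```

The only state carried between chunks is whether the last byte was a CR, which is what lets a CRLF pair that straddles a chunk boundary be converted correctly.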
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Hi Ken,

> So how big WAS this message, actually?

wc(1) said the long line was 11,370,773 bytes.

-- 
Cheers, Ralph.
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Thus said Ken Hornstein on Tue, 10 Sep 2019 09:15:00 -0400:

> So how big WAS this message, actually? I'm trying to understand the
> scope of the problem.

The entire size of the message on disk (including additional trace
headers added by my MTA) is 11,374,046 while the size of the offending
line is 11,370,773. That means that the rest of the message headers and
text/plain part of the message occupy 3,273 bytes.

> Really, I think that the best course of action would be that inc
> always tries to write something out (unless it encounters something
> like an I/O error) and exits cleanly.

Actually I failed to report that inc *did* write out something. It wrote
out until the MIME content started, so it got up to the headers of the
MIME part and then while trying to scan the next line issued that
error---the resulting file was truncated.

I found the problem later when I was reading messages with EXMH which
showed the attachment, but when I saved the attachment it was a 0 byte
file. EXMH must be lenient when it comes to missing MIME part markers
(maybe it just assumes end-of-file is good enough).

Thanks,

Andy
-- 
TAI64 timestamp: 40005d77a9cf
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
>That's actually how I figured this problem out. I found that my POP3
>daemon kept crashing and when I investigated it, I found that it was
>because it didn't have sufficient memory to respond to inc's RETR
>command. After I increased the amount of memory that the POP3 daemon was
>allowed to allocate, the RETR command succeeded, but then I ended up
>with an inc that refused to incorporate emails.

So how big WAS this message, actually? I'm trying to understand the
scope of the problem.

>Whether or not we think making inc handle nonconforming lines is worth
>tackling, it might be a good idea to make inc handle the failure a
>little better. What happened instead was that inc exited after having
>partially RETR'ieved the message, without having told the POP3 server to
>DELE the ones it had already successfully pulled down. So each time I
>ran inc, it would pull down the messages, die on the same bogus message,
>and repeat; so that I ended up with a few duplicates.
>
>I think issuing a warning and leaving a bad message on the server would
>be better than aborting the entire POP3 session and causing a repeat.

Architecturally, this is difficult. We issue a DELE after every message
we RETR, but those DELEs don't get committed until you issue the QUIT
(this is part of the POP3 protocol). We call die() a lot and that just
means we call exit() and never issue the QUIT.

Really, I think that the best course of action would be that inc always
tries to write something out (unless it encounters something like an I/O
error) and exits cleanly.

--Ken
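The DELE/QUIT interaction discussed above can be sketched roughly as follows. This is a Python illustration, not how inc is written: `drain_mailbox` and the `store` callback are hypothetical names, and `conn` is assumed to be any object offering `stat`, `retr`, `dele` and `quit` in the style of Python's `poplib.POP3`. It combines Andy's suggestion (skip the bad message and leave it on the server) with Ken's point that the deletions only take effect if QUIT is actually sent.

```python
def drain_mailbox(conn, store):
    """Fetch every message, DELE only those stored, and always QUIT.

    conn:  poplib.POP3-style object (stat, retr, dele, quit).
    store: hypothetical callback store(num, lines) that writes the message
           locally and raises an exception if it cannot.
    Returns a list of (message number, exception) for messages left behind.
    """
    count, _mailbox_size = conn.stat()
    failed = []
    try:
        for num in range(1, count + 1):
            _response, lines, _octets = conn.retr(num)
            try:
                store(num, lines)
            except Exception as exc:
                # Local failure (e.g. a line we refuse to parse): warn-worthy,
                # but leave the message on the server and carry on.
                failed.append((num, exc))
                continue
            conn.dele(num)  # only marks the message; not yet committed
    finally:
        # The server discards the DELE marks unless the session ends with
        # QUIT, so send it even if something above raised.
        conn.quit()
    return failed
```

With this shape, a message that cannot be stored no longer causes the already-fetched messages to be downloaded again on the next run, because their DELEs are committed by the QUIT.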
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
>> In a perfect world I think we SHOULD parse those messages (up to the
>> limits of virtual memory), but right now we don't.
>
>My regular statement that no we should not accept such dribble into the
>system. Users should be aware it's arriving. If they then want to fix
>it, then mhfixmsg(1) is where knowledge of all the world's crud can be
>put.

The problem I have with THAT is that pretty much every other MUA deals
with this just fine; that makes us the odd one out. And I see a chicken
and egg problem here; if we can't incorporate such a message, we can't
really have mhfixmsg deal with it.

Also, thinking more about this makes me think that at least for inc we
should be able to deal with this WITHOUT having to parse everything
line-by-line. Even if it was a single line of 200 MB, you should be able
to write that out without having to malloc() out 200 MB.

--Ken
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Hi Andy,

> What happened instead was that inc exited after having partially
> RETR'ieved the message, without having told the POP3 server to DELE
> the ones it had already successfully pulled down.

You may want to see how fetchmail(1) handles this, out of interest.

-- 
Cheers, Ralph.
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Hi Ken,

> In a perfect world I think we SHOULD parse those messages (up to the
> limits of virtual memory), but right now we don't.

My regular statement that no we should not accept such dribble into the
system. Users should be aware it's arriving. If they then want to fix
it, then mhfixmsg(1) is where knowledge of all the world's crud can be
put. And some users, like David, can arrange for it to handle all
incoming email so they never need to know.

    https://tools.ietf.org/html/draft-thomson-postel-was-wrong-00

I would not like virtual memory to have to be exhausted, crippling the
machine's performance, before nmh finally bails out and gives a `line
too long' that it could have stated much, much earlier.

-- 
Cheers, Ralph.
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Hi Andy,

> $ time (cat bigmessage | sed -ne '62p' | wc)
>1 1 11370773

I expect you know how to remedy this so you can read the email.
Something like

    perl -lpe 'length > 39 and s/.{16}/$&\n/g'

adjusting those numbers used for testing to suit.

-- 
Cheers, Ralph.
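As a rough companion to the one-liner above, here is a Python sketch that folds over-long lines without holding a whole line in memory. It is not a translation of the Perl: it uses a single `width` rather than separate threshold and fold numbers, the name `fold_long_lines` is hypothetical, and the default of 76 is an assumption based on MIME's limit for base64 lines. It assumes LF line endings and is only safe on content where extra line breaks are harmless, such as a base64 body.

```python
import io


def fold_long_lines(src, dst, width=76, chunk_size=65536):
    """Copy binary stream src to dst, breaking any line longer than width.

    Hypothetical helper for illustration. Lines of at most `width` bytes
    pass through unchanged; longer ones are split into `width`-byte pieces.
    """
    col = 0  # number of bytes already written on the current output line
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        pos = 0
        while pos < len(chunk):
            nl = chunk.find(b"\n", pos)
            end = len(chunk) if nl == -1 else nl
            # Emit this stretch of non-newline bytes, folding at width.
            while pos < end:
                if col == width:
                    dst.write(b"\n")
                    col = 0
                take = min(width - col, end - pos)
                dst.write(chunk[pos:pos + take])
                col += take
                pos += take
            if nl != -1:
                # Pass the original newline through and start a new line.
                dst.write(b"\n")
                col = 0
                pos = nl + 1


if __name__ == "__main__":
    out = io.BytesIO()
    fold_long_lines(io.BytesIO(b"QUJD" * 50 + b"\n"), out)
    print(out.getvalue().decode("ascii"))
```

Because a break is only inserted when more data follows on the same line, a line whose length is an exact multiple of `width` does not gain a trailing empty line.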
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
Thus said Ken Hornstein on Mon, 09 Sep 2019 17:04:05 -0400:

> In a perfect world I think we SHOULD parse those messages (up to the
> limits of virtual memory), but right now we don't.

That's actually how I figured this problem out. I found that my POP3
daemon kept crashing and when I investigated it, I found that it was
because it didn't have sufficient memory to respond to inc's RETR
command. After I increased the amount of memory that the POP3 daemon was
allowed to allocate, the RETR command succeeded, but then I ended up
with an inc that refused to incorporate emails.

Whether or not we think making inc handle nonconforming lines is worth
tackling, it might be a good idea to make inc handle the failure a
little better. What happened instead was that inc exited after having
partially RETR'ieved the message, without having told the POP3 server to
DELE the ones it had already successfully pulled down. So each time I
ran inc, it would pull down the messages, die on the same bogus message,
and repeat; so that I ended up with a few duplicates.

I think issuing a warning and leaving a bad message on the server would
be better than aborting the entire POP3 session and causing a repeat.

> Based on my personal experience ... you may not be able to find anyone
> who really cares about fixing that (I have run into some people who
> care about fixing broken email, most of the time I get ignored or
> blown off). Just to warn you.

Yeah, I just wanted to double-check my facts before I sent off an email
asking them if they are aware of their misbehaving mail system. I'll see
how they react (if they even get the message---it's difficult to find
functioning postmaster@ addresses these days).

Thanks,

Andy
-- 
TAI64 timestamp: 40005d7706d8
Re: [nmh-workers] inc: Unable to find a line terminator after 32768 bytes
>This is the first time I've ever seen such an error from inc. In looking
>at the message that is causing the problem, apparently it's a MIME
>message that has a base64 encoded MIME body that is all on one line that
>even sed has a hard time parsing:

So ... yeah, that's a total violation of RFC 5322 (sendmail, for all of
its faults, will force a newline when it encounters those huge-lined
messages, although that has its own problems). Nothing in MIME
supersedes any of those limits; those messages shouldn't be generated.

Well, okay, there is ONE minor exception; content which has a
Content-Transfer-Encoding of "binary" does not require a CR-LF pair
every 1000 bytes. That doesn't apply in this case, and I have never
actually seen any binary-encoded content in the wild. I only mention it
for completeness.

In a perfect world I think we SHOULD parse those messages (up to the
limits of virtual memory), but right now we don't.

>Is this something I should report to the sender as a clear violation of
>RFC5322, which as far as I can tell, restricts line lengths to 998
>characters, or is there something special about MIME that supersedes the
>limit and which means inc needs fixing?

Based on my personal experience ... you may not be able to find anyone
who really cares about fixing that (I have run into some people who care
about fixing broken email, most of the time I get ignored or blown off).
Just to warn you.

--Ken