Ken Hornstein wrote in
 <20230219001921.597ad1e0...@pb-smtp20.pobox.com>:
 ...
 |- mutt
 ...
 |[.]Internally mutt does
 |have an idea if the content contains a NUL (the CONTENT structure contains
 |a 'nulbin' member which contains the number of NUL bytes), but it's not
 |clear to me what happens when a NUL is encountered.

Seems to me this is classification of attachment data, which will end up
as octet-stream in that case.

For S-nail we more or less do what Heirloom mailx has done.
For classification purposes we switch to octet-stream.
For display purposes we happily display it after passing it through
some kind of makeprint.

   isuni = ((n_psonce & n_PSO_UNICODE) != 0);
   ...
      if(!iswprint(wc) && wc != '\n' /*&& wc != '\r' && wc != '\b'*/ &&
            wc != '\t'){
         if ((wc & ~S(wchar_t,037)) == 0)
            wc = isuni ? 0x2400 | wc : '?';
         else if(wc == 0177)
            wc = isuni ? 0x2421 : '?';
         else
            wc = isuni ? 0x2426 : '?';
      }else if(isuni){ /* TODO ctext */
         /* Need to filter out L-TO-R and R-TO-L marks TODO ctext */
         if(wc == 0x200E || wc == 0x200F || (wc >= 0x202A && wc <= 0x202E))
            continue;
         /* And some zero-width messes */
         if(wc == 0x00AD || (wc >= 0x200B && wc <= 0x200D))
            continue;
         /* Oh about the ISO C wide character interfaces, baby! */
         if(wc == 0xFEFF)
            continue;
      }

Or, without mb* and wc* sausage,

   {
      int c;
      while(inp < maxp){
         c = *inp++ & 0377;
         if(!su_cs_is_print(c) && c != '\n' && c != '\r' && c != '\b' &&
               c != '\t')
            c = '?';
         *outp++ = c;
      }
      out->l = in->l;
   }

This is even a regression against Heirloom mailx that Jörg Schilling
was very dissatisfied about, as the above only handles ASCII printable
regardless of the locale.
(My plan was to write a CText library for Unicode handling, and it was
quite progressed, with only about two months to go until decomposition
and normalization were implemented (Christmas 2014), when something very
bad happened.  Maybe I will do it someday.  Or simply do what OpenBSD
does and use perl's fantastic Unicode support to generate some tables.)

The implementation is total crap.  (longjmp codebase, data leaks,
blocking I/O, all that (it was).)
All of these (mailbox read, content-transfer decoding, character set
conversion, .. display preparation) should be "filters" with input and
output plugged together, with internal buffers as necessary.
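
To make the "filters" idea a bit more concrete, here is a minimal sketch.
It is not S-nail code and not the v15 design: struct filter, nulcount,
makeprint, sink and run_chain() are all hypothetical names, and the
printable check is a plain ASCII range test standing in for
su_cs_is_print().  It only shows stages that each see a chunk, keep a
small internal buffer where needed, and push their output to the next
stage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter interface (an assumption, not S-nail code):
 * each stage consumes a chunk and pushes what it produces downstream */
struct filter{
   void (*push)(struct filter *self, char const *buf, size_t len);
   struct filter *next;
};

/* Stage 1: count NUL bytes for classification, pass data on unchanged */
struct nulcount{
   struct filter f; /* must stay first so the cast below is valid */
   size_t nuls;
};

static void nulcount_push(struct filter *self, char const *buf, size_t len){
   struct nulcount *nc = (struct nulcount*)self;
   size_t i;

   assert(self->next != NULL);
   for(i = 0; i < len; ++i)
      if(buf[i] == '\0')
         ++nc->nuls;
   self->next->push(self->next, buf, len);
}

/* Stage 2: ASCII-only makeprint through a small internal buffer */
struct makeprint{
   struct filter f;
};

static void makeprint_push(struct filter *self, char const *buf, size_t len){
   char tmp[64];

   assert(self->next != NULL);
   while(len > 0){
      size_t i, n = (len < sizeof tmp) ? len : sizeof tmp;

      for(i = 0; i < n; ++i){
         int c = buf[i] & 0377;
         /* Range test stands in for su_cs_is_print() (assumption) */
         if((c < 040 || c > 0176) && c != '\n' && c != '\r' &&
               c != '\b' && c != '\t')
            c = '?';
         tmp[i] = (char)c;
      }
      self->next->push(self->next, tmp, n);
      buf += n;
      len -= n;
   }
}

/* Sink: collect into a caller-provided buffer, silently truncating */
struct sink{
   struct filter f;
   char *data;
   size_t cap;
   size_t len;
};

static void sink_push(struct filter *self, char const *buf, size_t len){
   struct sink *sk = (struct sink*)self;

   if(len > sk->cap - sk->len)
      len = sk->cap - sk->len;
   memcpy(sk->data + sk->len, buf, len);
   sk->len += len;
}

/* Plug the stages together and feed one chunk through the chain.
 * Returns the bytes stored in out, the NUL count goes to *nuls */
static size_t run_chain(char const *in, size_t inlen, char *out,
      size_t outcap, size_t *nuls){
   struct sink sk = {{sink_push, NULL}, out, outcap, 0};
   struct makeprint mp = {{makeprint_push, &sk.f}};
   struct nulcount nc = {{nulcount_push, &mp.f}, 0};

   nc.f.push(&nc.f, in, inlen);
   *nuls = nc.nuls;
   return sk.len;
}
```

With this, run_chain() on the six bytes 'a', NUL, 'b', 0x01, TAB, 'c'
reports one NUL (enough to classify as octet-stream) and leaves "a?b?"
plus TAB plus "c" for display.  A real chain would also need flush and
error handling and would have more stages in front (mailbox read,
content-transfer decoding, character set conversion).
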
That is the v15 MIME and I/O layer rewrite that has not been happening
for nine years.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)