Re: mysterious c0 80
>> I took a quick look at the emails and don't understand the problem. >> Michael, can you explain how to replicate it? > >I got close to understanding the problem, and then it escaped my brain. >It was a null byte in the buffer, written by emacs, but I didn't figure out >why. I had something to do with GPG signing, I think. If the answer is "it's all emacs's fault", then that's fine with me. We MIGHT be turning a NUL into C0 80 via iconv, but I don't think that's really our fault (if that's what is happening). --Ken
Re: mysterious c0 80
>Closing the loop on this: > >> There was a recent bug reported by Cy Schubert that may be the cause >> of this: >> >>http://savannah.nongnu.org/bugs/?65002 > >That nmh bug has been closed: > >Let's close this bug. The problem was determined to be a bug in ZFS >block_cloning. So I am confused; was Cy's original bug caused by ZFS, or Michael's? Looking back at the mailing list archives, it seems we never found out a cause of Michael's issue. C0 80 is technically invalid UTF-8, but it does decode to a NUL character (00) so I could see a naive encoder converting a NUL to those bytes. I have a difficult time believing that nmh could generate that out of nowhere, but I've been wrong before. --Ken
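For anyone wondering how a NUL could end up as C0 80, here is a minimal Python sketch (an illustration only, not nmh or iconv code). It decodes a two-byte UTF-8 sequence without the usual "overlong" check, which is how a lenient decoder arrives at 00, and then shows that a strict decoder refuses the same bytes. The function name is made up for this example. The C0 80 form is also what "modified UTF-8" encoders emit for NUL, which would be one plausible source of a naive conversion.

```python
def naive_decode_2byte(seq: bytes) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    if len(seq) != 2:
        raise ValueError("expected exactly two bytes")
    lead, cont = seq
    # 110xxxxx 10xxxxxx is the bit layout of a two-byte sequence.
    if lead & 0xE0 != 0xC0 or cont & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


# A lenient decoder sees code point 0 (NUL) here.
nul_code_point = naive_decode_2byte(b"\xc0\x80")

# A strict decoder rejects the sequence because it is overlong.
try:
    b"\xc0\x80".decode("utf-8")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```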
Re: Bug #65934: rcvdist runs afoul of gmail DMARC/DKIM/ARC policies
>The recent bug report, Bug #65934: rcvdist runs afoul of gmail >DMARC/DKIM/ARC policies, could use some discussion here, I believe. > >Here are the contents of the bug >(https://savannah.nongnu.org/bugs/?65934): > >I have been using slocal and rcvdist for many years to forward >filtered mail to a private gmail address. But their recent >enforcement of DKIM/DMARC policies is causing many (most?) such >messages to be bounced, since my SMTP server doesn't match the From: >address in the message. I regard this behavior as a feature of >rcvdist, since the incoming message appears to be from the original >sender. But I acknowledge that this use case is not easy to >distinguish from spam. > >Is there a workaround here? Perhaps a switch (or user customization >via rcvdistcomps) to allow a modified from line along the lines of >"Original Sender "? Or is rcvdist no longer the >right tool for this purpose? Oh, hm, what a great question. I hope the creator of this bug is on nmh-workers and can chime in. However ... I'm not sure there's a wonderful answer, although maybe it's possible with some (possibly a lot of) work. As I understand it from this bug report, the message flow is: - User gets email from "person@internet" to their nmh mailbox ("user@nmh"), presumably with a From: header field of "person@internet". - rcvdist dist(1)s that email to "privategm...@gmail.com". So the email has these headers: Resent-From: user@nmh Resent-To: privategm...@gmail.com From: person@internet To: user@nmh - Gmail rejects this based on From: person@internet (maybe?) Now, due to general lack of feedback on SPAM filtering, it might not be possible to determine exactly why it fails. At first glance SPF SHOULDN'T be affected because that works on the SMTP Mail From and in theory post(8) with the -dist flag would take the SMTP envelope from Resent-From so that should be correct. But maybe you're one of those people against all advice who use "mts: sendmail/pipe" in your mts.conf file. 
In that case I'm not really sure what your SMTP envelope is going to end up being; that might be problem one. A quick perusal of my email shows DKIM signatures that include Resent-From, Resent-To, and Resent-Date and rcvdist modifies those headers so that would break a DKIM signature, so that's possibly another place where things break. If your local email provider does produce their own DKIM signature headers then in theory at least one of the signatures in the message headers would be valid and that should solve that problem, but again we don't know anything about your email setup. Another issue is depending on the DMARC DNS record, the non-working domains might specify a strict SPF alignment check, which means that your SMTP envelope should "align" with the email From: header field. Again, the details there matter; if the alignment is "strict" the domains have to match exactly, and what those end up as depends entirely on the details of your email configuration. It might be interesting to have example domains that succeed and fail in that setup so we could query them and see what the relevant DKIM/DMARC/SPF DNS records are; that is why I am hoping the creator of that bug report is willing to chime in. Now, first off ... what, exactly, are you trying to accomplish with this setup? I ask because while in a perfect world this should work just fine and you wouldn't need to change anything, unfortunately in 2024 that isn't the reality we live in and no amount of screaming into the void will change that. So I'm wondering WHY you do this: do you want to read those messages on your phone and you never reply? Do you want to do something else with them on your private Gmail account? Are you just using Google as a last-ditch backup mechanism? The answers here kind of depend on your goal. Now, as to SOLUTIONS ... well, there aren't wonderful answers.
But here are some general thoughts: - If you want to perform some surgery on the original message to the point where it's not easy to reply to it from gmail, you could: - Strip off all DKIM/DMARC/ARC headers - Change the From: header field to point to something in your email domain. You should be able to accomplish this with anno(1) and some shell scripting. That should get the contents of the email onto Gmail where you can at least read it. - Do the right magic to forward the message along. Ralph brings up ARC, but my understanding is that isn't quite enough. You're going to need DKIM, SRS, _and_ ARC, at least according to this post: https://serverfault.com/questions/949620/gmail-rejects-forwarded-mail-with-dmarc-but-i-am-using-srs That references this post which has more details (a lot of it is postfix-specific but it gives you an idea of what you need to do): https://forum.howtoforge.com/threads/postfix-rspamd-do-not-dkim-sign-forwarded-messages-solved.87742/ There are a lot of details t
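As a rough illustration of the "surgery" option above (strip the DKIM/DMARC/ARC headers, then point From: at your own domain), here is a Python sketch. This swaps in Python's standard email package for the anno(1) and shell scripting suggested in the message, so it is not an nmh recipe. The list of header prefixes, the "(forwarded)" label, the function name, and the example addresses are all assumptions made for this example.

```python
from email import message_from_string
from email.utils import formataddr, parseaddr

# Assumption: these prefixes cover the signature/authentication headers
# worth stripping; a real setup may need a different list.
STRIP_PREFIXES = ("dkim-signature", "arc-", "authentication-results")


def rewrite_for_forwarding(raw: str, new_addr: str) -> str:
    """Drop signature headers and replace From: with a local address."""
    msg = message_from_string(raw)
    for name in list(msg.keys()):
        if name.lower().startswith(STRIP_PREFIXES):
            del msg[name]  # removes every header with this name
    display, addr = parseaddr(msg.get("From", ""))
    label = display or addr
    del msg["From"]
    # Keep a hint about the original sender in the display name only.
    msg["From"] = formataddr((f"{label} (forwarded)", new_addr))
    return msg.as_string()


example = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=internet.example; b=abc\n"
    "ARC-Seal: i=1; a=rsa-sha256; d=internet.example; b=def\n"
    "From: Some Person <person@internet.example>\n"
    "To: user@nmh.example\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
rewritten = rewrite_for_forwarding(example, "user@nmh.example")
```

Replying from Gmail to a message rewritten this way goes to the forwarding address, not the original sender, which is the trade-off mentioned above.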
Re: folders which are numerical names
>Hi David, > >> How about this simple addition to identify the offending directory: > >Passes the ‘First, do no harm’ test. +1. Also +1 from me. --Ken
Re: folders which are numerical names
>I'm willing to bet if I look back in the prior 3 decades I'll find myself >asking the same question and the infrequency of it bugging me, is a strong >indication of how (un)important it is. > >maybe I'd move my marker and ask if it could become a "soft" error on scan: > >% ls -F | sort -n | tail -5; >586 >587 >588/ >589 >590 >% scan +sent >... > 584 2024/06/17 To:Leo Li > 585 2024/06/18 To:APNIC Office > 586 2024/06/18 To:Dan Fidler > 587 2024/06/18 To:Joao Damas >scan: unable to read: Is a directory >scan: scan() botch (-3) This happens because open() failing is considered a "soft" error and scan will continue, but read() failing is considered a hard error. I'd be willing to consider read() failing with EISDIR a soft error. Thoughts from others? --Ken
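To make the proposal concrete, here is a small Python sketch of the policy being discussed (nmh itself is C; this is only a model of the behaviour, and the function name is made up). open() failures are skipped as before, a read() that fails with EISDIR is also skipped, and any other read() failure stays a hard error.

```python
import errno
import os


def read_message_head(path: str, size: int = 65536):
    """Return the first bytes of a message, or None for a 'soft' skip."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None  # open() failing is already a soft error
    try:
        return os.read(fd, size)
    except OSError as exc:
        if exc.errno == errno.EISDIR:
            return None  # proposed: a numerically named directory is soft
        raise  # every other read() failure remains a hard error
    finally:
        os.close(fd)
```

With this policy a scan of +sent in the example above would skip 588/ and carry on with 589 and 590.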
Re: folders which are numerical names
>if your folder is a primary directly under $MAILPATH/ it works fine, as a >pure numeric > >so folder +2021 is fine -and I do this for archives by year. > >if your folder is sub-folder under another folder like +sent/2021 it only >kind-of works. Any function which performs scan functions looks to pure >numeric objects as files, and barfs on the directory. So I work around it >by having archives like sent/y2021 and it works fine. The exact case is that there isn't a problem as long as something that is a folder doesn't contain a non-file whose name consists of all decimal digits. So in theory 'scan +$MAILPATH' probably won't work so well (but maybe that's not that useful in practice). >the other path out here is to do folder +2021/sent and subfolder sent >inside the archive year which works fine. > >I just wondered if a stat() on the object to skip (sub)folders in contexts >which demand files would work? Might be a pain to implement without >unwanted side-effects. This comes up occasionally, most recently here: https://lists.nongnu.org/archive/html/nmh-workers/2023-03/msg0.html Unfortunately, there are no good answers. stat()ing every possible message would be very expensive. dirent->d_type is not always reliable (and it's not easy to figure out when it will be reliable). The consensus in the past has been "don't do that". I am open to suggestions of course. --Ken
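For comparison, here is how the "use d_type when you can, stat() only when you must" idea looks in Python, whose os.scandir() documents that DirEntry.is_file() normally needs a system call only for symbolic links or when the directory entry type is unknown. This is a sketch of the trade-off, not a proposal for the nmh code; the function name is invented for the example.

```python
import os


def numeric_message_files(folder: str):
    """List message numbers in a folder, ignoring numerically named subfolders."""
    numbers = []
    with os.scandir(folder) as entries:
        for entry in entries:
            # Only all-ASCII-digit names are candidate messages.
            if not (entry.name.isascii() and entry.name.isdigit()):
                continue
            # is_file() uses the cached entry type when the OS supplies
            # one and falls back to stat() otherwise.
            if entry.is_file():
                numbers.append(int(entry.name))
    return sorted(numbers)
```

On filesystems that never fill in the entry type this degrades to one stat() per candidate, which is exactly the cost concern raised above.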
Re: Trivia: space before '(others)' from 'folder' command
>.. note the weird spacing before '(others)' in the second. I'm guessing >this is (line 613 of ./nmh/uip/folder.c on master, I think) because >'folder' and 'folders' ?maybe? share the same binary and the extra >'curmsgdigits + 6' spacing is to keep columns aligned in 'folders' >output. Possibly this could be conditional on the 'all' flag which gets >turned on if the command name ends in 's'? Yeah, that is exactly why it happens. Honestly it doesn't personally bother me, but what do others think? --Ken
Re: Where is my editor?
>Thus said Robert Elz on Fri, 19 Jan 2024 12:30:08 +0700: > >> That's "prompter" - has always been mh's default. > >Not always: > >https://git.savannah.nongnu.org/cgit/nmh.git/log/sbr/geteditor.c?h=1.8-release > >Looks like it was changed in 1.8 (if I read that correctly). > >I wasn't aware of "prompter" before, thanks for the education. I was curious and went back and looked. This was apparently prompted (pun intended) by Fedora packaging and was discussed in this email thread, which, oddly enough, Andy, you DID participate in: https://lists.nongnu.org/archive/html/nmh-workers/2018-03/msg5.html Like many threads involving ancient greybeards, it kind of devolved into a discussion about the "true" vi and how vim wasn't vi enough, but I guess part of this change is my fault since I went back and dug around to find out the original behavior of MH, which was to use "prompter" (the details are a little more complicated): https://lists.nongnu.org/archive/html/nmh-workers/2018-03/msg00037.html And the consensus of everyone was prompter was fine as a default. Also, Andy, you later said in that thread: Under what conditions will this change? I have neither EDITOR/VISUAL nor profile settings for editor, but maybe that won't matter because my usage patterns will never invoke prompter? Right now, when I run comp from the command line, I get a vi editor with components in it. Is this where prompter comes in? Sounds like I might have to add something to my profile now after this change is made to avoid prompter. David replied to you and said: Yes, it does. Add this to your profile to preserve your current behavior: Editor: vi I'm not dragging you because it was almost 6 years ago and I forgot until now that I made that change, much less everything behind it. But to be fair, we did have a reasonable discussion about this change (that it seems we all forgot about). --Ken
Re: Strange problem replying to message
>It is quite possible. I guess I am a LITTLE surprised this is the first instance you've ever seen where a text part was encoded with base64; I get those all of the time (and not just in work emails either). Maybe you get those more often than you think and this is just the first time you wanted to reply to such a message? But yes, nmh out of the box does not handle that well and it doesn't sound like there was any configuration change on your side. Note that I don't think anybody thinks that the current default nmh behavior is optimal by any means, but fixing this natively is ... complicated. >Perhaps I am losing it but when I first looked at the header fields >for the message that was giving me problems I could have sworn it said >"Content-Transfer-Encoding: Base64". Now when I look at it instead of >base64 it says 8bit. Am I confused or is that what mhfixmsg did, changed >the message from base64 to 8bit? As others have indicated, yes, that is exactly what it does. >I also notice looking thru other messages that most say 7bit. That is the "standard" encoding for ASCII email. If people send text parts with 8bit characters the encoding format varies. I suspect you probably see quoted-printable a fair amount, as my unscientific impression is that encoding is a lot more common for text parts with 8bit characters. --Ken
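To show what "changed the message from base64 to 8bit" amounts to, here is a deliberately tiny Python sketch for a single-part text message. It is not how mhfixmsg is implemented and it ignores multipart messages, charsets, and CRLF line endings; the function name and the sample message are assumptions for the example.

```python
import base64
import re

# Matches a Content-Transfer-Encoding header whose value is base64.
_CTE_BASE64 = re.compile(
    rb"^(Content-Transfer-Encoding:[ \t]*)base64[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def base64_text_to_8bit(raw: bytes) -> bytes:
    """Decode a base64 body and relabel the message as 8bit."""
    head, sep, body = raw.partition(b"\n\n")
    if not sep or not _CTE_BASE64.search(head):
        return raw  # nothing to do: no body, or not base64
    new_head = _CTE_BASE64.sub(rb"\g<1>8bit", head, count=1)
    return new_head + sep + base64.b64decode(body)


sample = (
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: Base64\n"
    b"\n" + base64.b64encode("caf\u00e9 au lait\n".encode("utf-8")) + b"\n"
)
fixed = base64_text_to_8bit(sample)
```

After the conversion the body is readable text again, which is why repl with an mhl filter can quote it.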
Re: Strange problem replying to message
>Yesterday everything worked. Today, trying to reply to a message I get: >[...] >I am running OpenBSd 7.4 and nmh-1.8. My ~/.mh_profile contains the >line: > >repl: -query -annotate -nocc me -filter mhl.reply > >and the mhl.reply filter is: >[...] I don't see anything in this setup that would undo a base64-encoded text message body. Is it simply possible you haven't seen one of those recently? --Ken
Re: mysterious c0 80
>> So while I agree post would fail with this hypothetical dist(1) of a >> message containing a NUL, it's not clear you could inc(1) such a >> message in the first place. > >Today or historically? Historically is a long time. I can't say that I know every single line of nmh and MH code since the dawn of time, but I've been in and out of it a lot. The assumption that you can represent email lines as C strings has been embedded in nmh and MH for as long as I can tell. Such emails don't work with nmh, and they probably don't work well with other MUAs either. Okay, fine, "don't work" is a bit of an oversimplification; it seems like those lines will be silently truncated at the NUL character, so they probably "work" fine and you just don't notice the NULs. --Ken
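The silent truncation is easy to demonstrate. This Python sketch uses ctypes to view a line the way C string functions would; it only illustrates the general C-string behaviour and does not touch any nmh code. The sample header line is invented.

```python
import ctypes

line = b"Subject: hello\x00 hidden tail"

# The buffer holds every byte (plus a terminating NUL) ...
buf = ctypes.create_string_buffer(line)
all_bytes = buf.raw

# ... but reading it as a C string stops at the first NUL,
# just as strlen() or strcpy() would.
as_c_string = buf.value
```

Everything after the NUL is still in memory, but code that treats the line as a C string never sees it.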
Re: mysterious c0 80
>> I'm thinking of removing the support in post(8) for sending NULs. Any >> disagreement? It's not a lot of code so could be easily restored in >> the future if conditions change. I think this makes sense, and it does seem to cause some kind of problem as reported in Cy's message. It would be nice to understand the root cause of the bug, though. >> > Now, how about dist(1) of that old email? I'd have thought it >> > should send the old email verbatim, NUL and all. If that causes a >> > bounce then the sender can MIME-forward instead with a single >> > message/rfc822 part. >> >> Agreed. > >But doesn't dist → send → post so if you remove post's support for >sending NULs then dist won't be able to send the old email verbatim. It is not clear to me that any of the OTHER nmh programs could actually even receive a message with NUL in it, and plenty of other programs fail if a message contains a NUL. Here's some messages when I brought this up last year: https://lists.nongnu.org/archive/html/nmh-workers/2023-02/msg00029.html https://lists.nongnu.org/archive/html/nmh-workers/2023-04/msg00031.html So while I agree post would fail with this hypothetical dist(1) of a message containing a NUL, it's not clear you could inc(1) such a message in the first place. --Ken
Re: mysterious c0 80
>> ...draft file does NOT contain a '\n' as the last character? My >> memory is that for some strange reason Emacs likes to default to doing >> that. I suspect we do not test for that. > >A POSIX text file is zero or more lines where a line is zero or more >bytes terminated by '\n'. I don't make the news, I just report it. My vague memory is that this is centered around Emacs having its roots in a pre-UNIX operating system. However, it seems like Mike's problem is NOT that; the last two bytes of his draft file are 00 a0. Cy's bug report said that can happen anywhere, though. I know this change was to handle NUL bytes in outgoing messages, but I am wondering if maybe we should reject such drafts? Seems like any message actually sent with a NUL in it would be rather unfriendly. That might break things for people using MH-E, though, and as we've seen before that has a very long release cycle. According to RFC 5322, a NUL in a message body is not permitted. From §3.5:

   body = (*(*998text CRLF) *998text) / obs-body

   text = %d1-9 /     ; Characters excluding CR
          %d11 /      ; and LF
          %d12 /
          %d14-127

Obviously if you are sending binary, you can (RFC 2045 explicitly disallows NUL when sending 8bit). I realize that I started this whole mess last time I asked about this. Sigh. --Ken
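If drafts were to be rejected, the check implied by the RFC 5322 grammar quoted above is small. Here is a hedged Python sketch of it: every body line must be at most 998 octets and consist only of octets 1-9, 11, 12, and 14-127. It assumes CRLF line endings and ignores obs-body; the function name is invented and nothing like this exists in post(8) as far as this thread establishes.

```python
def body_violations(body: bytes):
    """Return (line number, reason) pairs for lines violating RFC 5322 'text'."""
    problems = []
    for lineno, line in enumerate(body.split(b"\r\n"), start=1):
        if len(line) > 998:
            problems.append((lineno, "line longer than 998 octets"))
        for octet in line:
            # 'text' excludes NUL (0), LF (10), CR (13) and anything above 127.
            if octet == 0 or octet in (10, 13) or octet > 127:
                problems.append((lineno, f"octet {octet:#04x} not allowed"))
                break  # one report per line is enough
    return problems
```

A draft ending in a stray NUL would be reported on its last line, which is the case that started this thread.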
Re: mysterious c0 80
Stupid question time: is it possible this bug is only triggered if the draft file does NOT contain a '\n' as the last character? My memory is that for some strange reason Emacs likes to default to doing that. I suspect we do not test for that. --Ken
Re: mysterious c0 80
>It certainly includes that commit above. >I updated just last week before starting this thread actually. >Looking at my outbox, I think it did start when I upgraded. > >I tried "git revert 8f897f65", but it results in a bunch of conflicts, which >I decided not to investigate. Could you just do a 'git checkout 8f897f65^' and try that? --Ken
Re: mysterious c0 80
>> What version of nmh are you using? There was a recent bug reported >> by Cy Schubert that may be the cause of this: >> >> http://savannah.nongnu.org/bugs/?65002 > >Interesting. Can anyone replicate this bug? Michael, are you running >on FreeBSD or something else? I haven't tried yet; it was on my list to look at. --Ken
Re: message-Id has localhost
>I also find it hard to beleive that someone wants the MUA to have a >specific Message-ID for their email, but there is at least one software >that I'm aware of that does act upon the contents of it: > >http://smarden.org/qconfirm/qconfirm-check-mid.1.html I mean, yes, it looks for messages based on Message-IDs in the References header; I think it would work fine with 'random'. >> My personal feeling is that the people who (a) care about generating a >> local Message-ID, and (b) actually care WHAT appears right of the '@' >> either need to configure their system appropriately or write code to >> change nmh behavior. > >I think that's reasonable. I don't see anything necessarily wrong with >nmh being able to generate a Message-ID so I'm sure patches would be >considered if someone offered. Just so I'm clear, I'm fine with people submitting patches to change the current behavior. --Ken
Re: message-Id has localhost
>> [...] Sendmail, >> yes, it looks like you could change it if you really want to; it also >> defaults to something based on the local hostname. I am personally >> skeptical that people actually configure this. >> > >FWIW, MIT's campus computer network (Athena) did this for a long time, >because that network was composed of thousands of workstations that did not >normally receive mail and all wanted to send mail that came from, for >example, rather than something like < >yand...@w20-575-77.mit.edu>. What you're talking about is a very common way of configuring Sendmail and I've personally done that many times; I call that a "site" configuration where all email that is submitted to the main Sendmail server with an internal hostname (or no hostname) is re-written to have the 'site' domain name. But what I was specifically talking about was that I am skeptical that anyone specifically configures a Message-ID header to be added by sendmail that is different than the default, which is based on the 'j' macro. I just looked in my wife's Sendmail 'bat' book and it says j holds the FQDN of the local machine, which probably means it does something similar to what nmh does; you can override that value if Sendmail gets it wrong, _HOWEVER_ it's used by a bunch of things and not just for Message-ID generation. So to one of Ralph's earlier points, it seems like we are using some MTA prior art, it's just that it doesn't work for everybody. --Ken
Re: message-Id has localhost
>> I mean, that's not a reason in my thinking? Like, WHY do people want >> that? > >To be able to uniquely refer to that email in future by knowing what the >message-id field contains. The reference may be to oneself or to other >recipients. That is the purpose of the field. Not knowing the field's >value lessens its worth to tracing the flow through downstream parties >in log files. I think we're probably not going to agree whether or not those reasons qualify as "vague", but, fine. That's not really my point; I honestly don't care if people want to generate their own Message-IDs. What I _do_ care about is when they do and then complain that nmh is using the "wrong" hostname to do so; I do not believe there is a solution to this that will universally work, or even work in a large majority of cases considering the configuration of the modern Internet. To Mike's question: >Can we just use "localname" from mts.conf? We COULD, it would just be wrong for some people. That's the "local" hostname, and is used for a bunch of things INCLUDING constructing the default hostname for email addresses. But here's a thought experiment: let's say you set it to 'gmail.com' because your email is hosted at gmail. There's no way you could guarantee your Message-ID isn't going to be used by gmail.com already. Yes, you could set your default email address via another mechanism, but a quick glance at the code makes me realize that's still used for a bunch of things. We could add another knob, but honestly I'd rather people just use 'random' if the existing logic doesn't work for them. To Mike's other point: >> FWIW, I took a quick look at the MTAs Postfix and Sendmail; Postfix does >> not seem to have any Message-ID-specific configuration knobs, it hardcodes >> adding a Message-ID based on its idea of the local hostname. Sendmail, >> yes, it looks like you could change it if you really want to; it also >> defaults to something based on the local hostname.
I am personally >> skeptical that people actually configure this. > >gethostname() is not the same as what you said we were doing, which takes a >trip through /etc/hosts. Well, technically, it's constructing the Message-ID based on the value of the 'j' Sendmail macro, which is used for a ton of things; that macro value is configurable and in my limited sendmail experience you usually do explicitly configure it (I do not know what that defaults to). --Ken
Re: mysterious c0 80
>Ralph Corderoy wrote: > >> The email I'm replying to: > >> $ hd `mhpath .` | tail -3 1ac0 6e 73 74 65 61 64 2e 0a 0a 0a >> c0 80 c0 80 c0 80 |nstead..| 1ad0 c0 80 c0 80 c0 80 >> c0 80 c0 80 0a 0a || 1adc $ > >> So perhaps look at your draft-template files in your `mhparam >> path`. > >So it has nothing to with GPG creation. It shows up in my outbox. If >it were in my template files, then it would show up in my emacs buffer >and draft files, right? It does not. What version of nmh are you using? There was a recent bug reported by Cy Schubert that may be the cause of this: http://savannah.nongnu.org/bugs/?65002 --Ken
Re: message-Id has localhost
>> 2) The recommendation for Message-IDs is to use a domain name for the >>right-hand side > >Recommendation or rule? I don't recall. Officially, from RFC 5322 §3.6.4: Note: As with addr-spec, a liberal syntax is given for the right- hand side of the "@" in a msg-id. However, later in this section, the use of a domain for the right-hand side of the "@" is RECOMMENDED. Again, the syntax of domain constructs is specified by and used in other protocols (e.g., [RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore incumbent upon implementations to conform to the syntax of addresses for the context in which they are used. [...] The message identifier (msg-id) itself MUST be a globally unique identifier for a message. The generator of the message identifier MUST guarantee that the msg-id is unique. There are several algorithms that can be used to accomplish this. Since the msg-id has a similar syntax to addr-spec (identical except that quoted strings, comments, and folding white space are not allowed), a good method is to put the domain name (or a domain literal IP address) of the host on which the message identifier was created on the right-hand side of the "@" (since domain names and IP addresses are normally unique), and put a combination of the current absolute date and time along with some other currently unique (perhaps sequential) identifier available on the system (for example, a process id number) on the left-hand side. Though other algorithms will work, it is RECOMMENDED that the right-hand side contain some domain identifier (either of the host itself or otherwise) such that the generator of the message identifier can guarantee the uniqueness of the left-hand side within the scope of that domain. >> 4) Some people, for reasons I would classify as "vague", prefer to >>generate their Message-IDs locally so their saved copy of the >>message has the Message-ID in it. > >The reason you state seems precise rather than vague. 
I mean, that's not a reason in my thinking? Like, WHY do people want that? That's where things get vague when this came up before. >> 7) There's not too much prior art here to crib from because of (3). > >The first-hop MTAs would be a source of prior art. Most probably let >the domain part be given as configuration. Well, I was talking about prior art from MUAs. For MTAs that represent an email domain, it's relatively straightforward because they assume that they're the only one generating Message-IDs for that email domain. FWIW, I took a quick look at the MTAs Postfix and Sendmail; Postfix does not seem to have any Message-ID-specific configuration knobs, it hardcodes adding a Message-ID based on its idea of the local hostname. Sendmail, yes, it looks like you could change it if you really want to; it also defaults to something based on the local hostname. I am personally skeptical that people actually configure this. And, well, we TRY to use the local hostname to generate the Message-ID for the people who actually want that (because being unique is a MUST in RFC 5322). But as we've seen there's not a foolproof way of doing that. >I think the existing -messageid option which takes either ‘local’ or >‘random’ should also accept ‘@...’ to allow the user to specify it. >This stops using heuristics if the user prefers. My personal feeling is that the people who (a) care about generating a local Message-ID, and (b) actually care WHAT appears right of the '@' either need to configure their system appropriately or write code to change nmh behavior. --Ken
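For people in the "write code" camp, the RFC 5322 recipe quoted earlier (time plus process id plus something unique on the left, a domain on the right) is available ready-made in Python's standard library as email.utils.make_msgid(). The sketch below is just one way to get a Message-ID with a chosen right-hand side outside of nmh; the domain shown is a placeholder, and nothing here reflects how nmh's -messageid option works.

```python
from email.utils import make_msgid

# Placeholder domain: substitute one you actually control, since
# uniqueness is only meaningful within a domain you are responsible for.
first = make_msgid(domain="mail.example.org")
second = make_msgid(domain="mail.example.org")
```

Each call yields a fresh identifier in angle brackets ending in the given domain; without the domain argument the function falls back to the local host name, which brings back the same "what is my hostname" question discussed in this thread.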
Re: Macintosh for nmh?
>If you're on a mac and using O365 you may need >https://github.com/simonrob/email-oauth2-proxy > >Using it for a year, happily. We DO support XOAUTH2 natively, BTW. --Ken
Re: Macintosh for nmh?
>I've had a hate, love, hate, love, hate relationship with Apple over >the years. I think the pendulum is swinging the other way back towards >love, LOL! I am sure my perspective is skewed by buying into the Apple ecosystem; the automatic syncing between devices, little things like replying to text messages on your desktop, all make things much smoother. >How are you getting your email onto your Mac? I use 'inc' to a POP server. I admit this doesn't play so well if you want to use IMAP with other devices. --Ken
Re: Macintosh for nmh?
>I have an old linux desktop that I'm sure would work, but I'm wondering >if I should consider buying a new Apple laptop. Last time I used a Mac, >it was mostly tolerable for an old UNIX head like me. Are there any >issues running nmh on a Mac? I'm typing this from a Mac right now (well, via exmh, but I still use bare nmh a lot). exmh has some challenges relating to fonts, of all things, but bare nmh works perfectly fine. Nmh is a package in Homebrew (I think the best open-source packaging system for MacOS X) and I expect it to be well-supported in the future. I view the issue of "Mac vs Linux" mostly as a philosophical one. Yes, there's a lot of Unix under the hood and as a fellow old UNIX head I make great use of that. The upside for me is there's a lot of support for "mass market" kind of stuff and you don't have to fiddle with things a lot as you might have to do under Linux. The downside is that there is still some hidden magic so it's not 100% Unix everywhere and it's not as customizable as a pure Linux system; you have to be happy with (or at least be willing to live with) some of the decisions Apple has made for you. At this point in my life I find someone else making a bunch of those decisions for me to be relieving; I am sure that plenty of people would find it stifling. --Ken
Re: message-Id has localhost
Ralph has already noted that this message still has those bytes at the end, but I think there was a mh-e error as this message wasn't even clearsigned. Instead this was at the top: ><#secure method=pgp mode=sign> As a note I'm wondering what those bytes even are; they aren't valid UTF-8. --Ken
Re: message-Id has localhost
>Maybe strcmp(hostname,"localhost") should cause that value to be >ignored, and if necessary, resort to random messageid. Or maybe that should >just be the default in some way. There are a bunch of competing factors here that I am not sure it is possible to resolve to everyone's satisfaction:

1) Message-IDs must be unique
2) The recommendation for Message-IDs is to use a domain name for the right-hand side
3) Most MUAs solve this problem by letting their first-hop MTA generate the Message-ID.
4) Some people, for reasons I would classify as "vague", prefer to generate their Message-IDs locally so their saved copy of the message has the Message-ID in it.
5) Because of (1) and (2), when people configure nmh to generate a Message-ID locally the default nmh configuration is to try very hard to get the canonical name for the local system (which isn't necessarily the same as the user's email domain name, so we can't use the value of "localname").
6) As seen in this email thread, that doesn't always work, sometimes because of local system configuration for vague reasons.
7) There's not too much prior art here to crib from because of (3).

I'm open to suggestions, but I'm not a fan of special-casing localhost. --Ken
Re: message-Id has localhost
>127.0.0.1 localhost obiwan.sandelman.ca obiwan > >which is often required for certain things work, but I don't remember >what now. I'll change it see what breaks. Is there any override? >Let's see this message. I see that worked. There is not an override; if you use -messageid random you'll get a random set of characters as the hostname. I'm personally reluctant to put yet another knob in nmh just for this. --Ken
Re: message-Id has localhost
>I noticed that my emails have stuff like: >Message-ID: <25170.1703892488@localhost> > >rather than my hostname or domain name. >Since it's like this in my outgoing folder, it must be generated by NMH >rather than by my local postfix. That would be a nmh-style Message-ID; the first number is the process id of whatever generated that (probably post(8)), the second is a Unix timestamp. Do you have -msgid set in your profile for send(1)? >I think that mts.conf ought to set this, but localname/localdomain do not >seem right, and they aren't set, so my hostname (which is correct, I think) >ought to be used. > >I'm running 1.7+dev ... NOPE. >I'm actually rather unsure how to get a --version out of NMH. I think pretty much anything takes a -version flag? % mhparam -version mhparam -- nmh-1.8 built 2023-06-21 04:53:27 + on Ventura >Still have @localhost in the message-id. >From the code, that's calling LocalName(1). What that does is: getaddrinfo(gethostname(), AI_CANONNAME). So maybe your resolver library canonicalizes your hostname to "localhost"? If getaddrinfo() fails then it falls back to the value of gethostname(). --Ken
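Here is a Python model of the lookup described above, for anyone who wants to see what their resolver does with their hostname. It mirrors the described logic (ask getaddrinfo() for the canonical name of gethostname(), fall back to gethostname() on failure) and the pid.timestamp@host shape of the Message-ID, but it is a sketch and not a transcription of nmh's LocalName(); the function names and the injectable resolver parameter are inventions for this example.

```python
import os
import socket
import time


def local_name(resolver=socket.getaddrinfo, hostname=None):
    """Canonical name for this host, falling back to the plain hostname."""
    host = hostname or socket.gethostname()
    try:
        infos = resolver(host, None, flags=socket.AI_CANONNAME)
    except OSError:  # socket.gaierror is a subclass of OSError
        return host
    # The canonical name, if any, is field 3 of the first result.
    canon = infos[0][3] if infos else ""
    return canon or host


def nmh_style_message_id(host: str) -> str:
    """Process id, Unix timestamp, and host, as in the example above."""
    return f"<{os.getpid()}.{int(time.time())}@{host}>"
```

Running local_name() on a machine whose /etc/hosts lists the hostname on the 127.0.0.1 line after localhost is a quick way to check whether the resolver canonicalizes it to "localhost".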
Re: Testing the welcome message (was: nmh is vital to me)
>Indeed that is what is happening. It seems like you prefer it work >this way rather than showing the welcome message just once; in that >case, there is nothing to do on the situation I present. I think you misunderstand me; I'm not saying I PREFER it work that way, I just am explaining why it DOES work that way. It probably should be fixed to behave better, but since right now I lack the time to do it myself I don't make a big deal over it. --Ken
Re: Testing the welcome message (was: nmh is vital to me)
>Dear colleagues, >I have for a long time had a different issue with the welcome message. >Perhaps it is worthwhile to report. To produce this issue, set up a new >MH directory, and run a failing command as the first command. >[...] >$ scan +this-is-not-a-folder # Show the welcome message >$ scan +this-is-not-a-folder # Show the welcome message >$ folders # Show the welcome message >$ scan +this-is-not-a-folder # Don't show the welcome message So what's happening here is the information that you've seen the welcome message is stored in the context file. But depending on what happens the context file is not always updated; for example, if the command errors out. So the successful run of "folders" updates the context file and subsequent commands don't show the welcome message. --Ken
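The sequence quoted above can be reproduced with a toy model in Python. This is a sketch of the behaviour as explained in this message, not nmh's code: the key name "welcome-seen" and the function run_command are hypothetical, and the context file is modelled as a plain dictionary.

```python
def run_command(saved_context, succeeds):
    """Simulate one command run; return True if the welcome message is shown.

    saved_context stands in for the on-disk context file. The command
    works on an in-memory copy and only writes it back when it exits
    normally, which is the behaviour described in this message.
    """
    context = dict(saved_context)          # read the context file
    show_welcome = "welcome-seen" not in context
    if show_welcome:
        context["welcome-seen"] = "yes"    # remembered in memory only
    if succeeds:
        saved_context.clear()              # normal exit: context written out
        saved_context.update(context)
    return show_welcome
```

Running a failing command twice, then a successful one, then a failing one gives shown, shown, shown, not shown, matching the scan/scan/folders/scan sequence in the report.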
Re: Calendaring?
>The topic itself -- i do not know. I tend to totally agree with >your statement, but then again, not. I mean there is nothing >i would have to be ashamed of or that could make me blackmailable. >But i do have SSH keys to other people's computer(s) (networks). >I have S/MIME and PGP (and signify) key. >There is access to a bank account, and i make a living from that >alone, one bank, one account. At some time, make one of three. I understand where you're coming from; I'm not worried about blackmail either, but physical threats to me would totally compromise everything. But I'm not really worried about that in practice; nothing in my job really involves anything that critical, and unless they killed me the compromise would be detected very quickly. If my life was different I'm sure I'd take greater precautions. --Ken
Re: Calendaring?
>The all-embedded RFC 7265 JCAL (plus JMAP etc) is surely the >future for all of you. I get the feeling there's some scorn in that statement. I'm not trying to drag anyone else's choices; it's tough to find the right balance between "keeping up with the times" and "sticking with stuff you know that works just fine". I can only offer (when asked!) how I came to make the decisions I chose. >Unfortunately no Yubikey/-alike thing yet, so i type my harddisk >decryption key, then my password, then the password of an >additional encfs filesystem, and then have a script which loads >keys into ssh-agent aka provides decrypted versions for easy >consumation (via copy+paste or so). The browser "container" which >(actually) has passwords is special and also stores in such a one. >(Also decrypted by that script.) I can only say that my personal security is NOT designed to defend against nation-state level attacks :-) But maybe it should be? --Ken
Re: Calendaring?
>I'm firmly planted in the Linux, MacOS, Windows, and Android worlds. >So, my current calendar is a piece of paper on a clipboard that I always >carry with me, LOL! I find it way more expedient to use the clipboard >rather than try to have a to-do list app and a calendar app and remember >which device I'm using at the time and figure out how to keep everything >in sync. I mean, hey, whatever works for you ... but do you not have anyone else who you need to share a calendar with? To me that is the other key piece. If it's just you then you can get away with more analog solutions. FWIW, _if_ you are willing to live in the Apple Universe then the syncing just happens across all of your devices automatically. I just don't think about it; I know that when I (or anyone else in my household) add something to any device then soon-ish it will automatically appear across all devices for everyone in my household. >I really miss my old Windows phone, but I digress. Does >anyone remember the Chandler app? Warts and all, that was the best >answer to my habits. I do remember that, but it seems like it ended up being too complicated and of course it took Real Money(tm) to push it as far as they got and they ran out of Real Money. Also, as I understand it they solved the syncing issue by running their own CalDAV server that other clients could connect to. >I've gotten into the practice of using my PC as the master calendar, >since that's what I use to pay bills, etc., so when I get home, I >reconcile my clipboard with my PC. I suppose I should take the time to >connect my Android with my PC, but I try to keep my phone disconnected >from everything as much as possible. I don't install apps, and I try >not to keep any passwords or charge card info on it... Geez, what DO you use your Android phone for? Actually making VOICE CALLS??? :-) Since you mentioned it ... what do other people use for password management? 
I ended up using a solution called "Master Password" which is an open-source password generator which is now called "Spectre" (https://spectre.app) and has a reasonable paid app for mobile devices. But I am wondering what other people do. --Ken
Re: Calendaring?
>I'm an old Unix system admin command line type, and I love MH/nmh. > >I let the Mac and PC worlds distract me for a bit, but I'm really >tired of dealing with Calendar, Outlook, and the like, especially when >Microsoft is threating to change the user interface again. Is there >a calendaring program that uses a similar structure to nmh behind the >scenes? If one doesn't exist, I might have to make one of my own. I'm pretty much in the same boat myself. However ... for _me_ at least the value to a calendar and a reminder list is having it on all devices so you can add events when you are out; e.g., adding your next checkup when you are _at_ the dentist and having it updated everywhere is to me the whole point. That pretty much means you're stuck doing some kind of calendar server. You can run your own but when I looked at that they had a hellish dependency tree, and of course running your OWN server in the modern Internet age is a huge pain unless you pay for a hosting company (and as Conrad notes even that doesn't guarantee it will work with everything). I know people that use Google Calendar with some success, and there are existing command-line tools that interface with that and as far as I can tell that interfaces with all major client software. I personally use the Calendar app on MacOS X, as it is easy to put calendar events from nmh into it and seems to have reasonable authentication and sharing capabilities (this only works well if you already use MacOS X of course). The Reminders app on MacOS X also deals with events as standard iCalendar format events (with some MacOS X-specific extension fields) so you can interface with it; all of the same caveats about drinking MacOS X Kool-Aid apply. --Ken
Re: inc -file truncates, manual pages say it doesn't
> By using the -file name switch, one can direct inc to incorporate >messages from a file > other than the user's mail drop. Note that the named file will not be >zeroed, unless the > -truncate switch is given. Is it _possible_ you have -truncate in your .mh_profile under inc? The code says that it should work, and I just tested it now and it does what it says on the tin. --Ken
Re: some thought on m_getfld
>I'm not sure if it's a good idea to start this topic, but it bugs me a >bit. Also as a disclaimer my view on this is highly influenced by my own >m_getfld variant[0]. > >Because of the missunderstanding of when LENERR is used I have looked >at m_getfld.c and I see some problems with the interface and the >implemetation. >[...] First, my apologies for not contributing more on the discussion regarding header folding; my life has been busy as of late. But I have some thoughts on this. I think you are thinking at the wrong scale. The problem with m_getfld() is not the interface, it is that it should not exist at all. As you note, there's a ton of duplicated code when dealing with m_getfld(). This has led me to believe that the number of programs that actually need a m_getfld()-style interface is close to zero. I think programs tend to fall into two styles of header access: - One where they process each header field at a time (like the config file parser). - One where they need to slurp in all header fields and then just extract the ones they care about (I think this is the most common). So this suggests to me that really in most cases the whole header should just be read and then you get some accessor functions to get out the headers you care about. In terms of implementation, as you note header fields have a well-defined structure so honestly my preference would be to write a flex lexer to process header fields. But really I think there should be a higher level API to deal with header fields rather than have nmh code call m_getfld(). I realize that this is a larger project and if someone just wants to improve m_getfld() that is fine with me. --Ken
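To make the "slurp everything, then use accessors" style concrete, here is a small Python sketch. It is not a proposal for the actual nmh API: read_headers and get_header are hypothetical names, and the parser is deliberately naive (it unfolds continuation lines and stops at the first blank line, nothing more).

```python
def read_headers(text):
    """Read a whole header block into a list of (name, value) pairs.

    Continuation lines (starting with space or tab) are unfolded onto
    the previous field. Parsing stops at the blank line before the body.
    """
    fields = []
    for line in text.splitlines():
        if line == "":
            break                                   # end of the header block
        if line[0] in " \t" and fields:
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, _sep, value = line.partition(":")
            fields.append((name, value.strip()))
    return fields


def get_header(fields, name):
    """Return every value for a header name, compared case-insensitively."""
    return [value for field, value in fields if field.lower() == name.lower()]
```

A caller that only cares about a couple of fields would do something like get_header(read_headers(message_text), "Subject") and never see the field-at-a-time state machine.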
Re: mhbuild and long header fields
>> Thank you for pointing that out. Header field folding does need to be >> properly implemented. It would be a great contribution if someone has >> the bandwidth. > >I have looked a bit at it. I found two places where this could be >implemented: > >Either as part of encode_rfc2047(), this function does this allready, >but only if a non ascii char is in the field body. Right, I don't think this is the right function. That is SPECIFICALLY for RFC 2047 encoding, which is actually only permitted for some headers (definitely not References). Yes, it does folding but that's really just in the context of creating RFC 2047 "phrases". >Or in output_headers(). I'm not sure if there an extra options would be >required. That is one option. Another is that repl(1) could do a better job; I suppose that is the fault of mhl. --Ken
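As a rough idea of what plain (non-RFC 2047) folding involves, here is a Python sketch that folds a structured field such as References at whitespace. It is an illustration only, not code from output_headers() or mhl: fold_header is a hypothetical name, the 78-column default is the usual recommended line length, and a single word longer than the limit is simply left on its own over-long line.

```python
def fold_header(name, body, limit=78):
    """Fold a header field at whitespace so lines stay within limit.

    Continuation lines start with a single space, so unfolding the
    result gives back the words separated by single spaces.
    """
    lines = []
    current = name + ":"
    has_word = False                     # never fold before the first word
    for word in body.split():
        if has_word and len(current) + 1 + len(word) > limit:
            lines.append(current)        # line is full: start a continuation
            current = " " + word
        else:
            current += " " + word
        has_word = True
    lines.append(current)
    return "\n".join(lines)
```

For a long References field this puts as many message-ids on each line as fit and moves the rest to continuation lines, without touching the ids themselves.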
Re: graphical mail reader for one-off use
>To answer my own question, mhshow can be customized to use the user's >preferred mail reader. Ken mentioned w3m with display_link_number. I >use firefox, I find that its performance on a modern machine with an SSD >drive is adequate. I think a BIG problem with this is that it doesn't get you the "complete" message. The main issue here is embedded images with Content-ID URLs; in _theory_ I suppose you could write out the embedded images as files and rewrite the HTML content to point to those local files, but that just seems fraught and fragile. Another issue I personally run into is I occasionally get S/MIME messages I have to decrypt and that basically needs the whole message as well (in a perfect world nmh would handle those natively and that is on my list, but that list is long and sadly I don't have the time right now to work on that list). I know that many times just viewing the HTML content sans images is sufficient, but I've certainly run into situations where it isn't. --Ken
Re: graphical mail reader for one-off use
>You can use $ open -a seamonkey `mhpath cur` > >It opens it as a text file. The .eml extension is required to show >text/thml. But with .eml extension you can just do > >$ open foo.eml > >and it will open in your default MUA. i.e. Apple Mail. If you haven't >configured it, I don't know if it will work but it *should* work (not >that I have tested this). That's all good information; my only caveat is that I was under the impression that Paul is a Linux user based on the number of times he's asked how to get a working DB library development package, and "open" is a MacOS X thing (I see some Linux distros have a "open", but it seems to do something else). --Ken
Re: graphical mail reader for one-off use
>Thanks Ken! I'll be giving this a try! (I would have "just tried it >myself", but I don't have any modern readers installed! Small point >of pride, until now. :-) One note: you MIGHT have to have Thunderbird configured properly as a MUA to do this (I already had this done). >Huh. ".eml". I've spent years using and part-time developing email clients >and servers, and never heard of that extension. Good to know. I am not sure those things are standardized, but I just Googled "eml file extension" and it seems kind of common (some results suggested it originally came from Outlook), and I had personally seen that for a while. Right or wrong, Apple had decided in the transition to OS X that "file types" were determined by the filename suffix since you didn't really have resource forks so the way things that use the OS facilities to determine the file type will look at the file suffix. I only mention this to say that I don't know how Thunderbird will determine a file type on other platforms. I was curious and did a little bit of digging. New MIME types that are registered with IANA can include a suggested file extension. message/rfc822 predates that registry and the original MIME RFCs do not specify a file extension for that type. The message/global MIME type (a RFC822 message but with UTF-8 everywhere) has a suggested file extension of ".u8msg", which I have never personally seen "in the wild" anywhere. ¯\_(ツ)_/¯ --Ken
Re: graphical mail reader for one-off use
>Once in a while my wife or I (both MH users) get an email that really >can't be handled directly by MH. Today's example looks like this: >[...] I hear you, dude (I also have a wife that is a nmh user as well; go figure. I wonder how many dual-MH households there are?) >$ modern-mail-reader $(mhpath cur) > >and have it pop up a window on the message. > >Is this a practical wish? Well, I just tried this (I am on MacOS X): % cp `mhpath cur` /tmp/foo.eml % /Applications/Thunderbird.app/Contents/MacOS/thunderbird -file /tmp/foo.eml And ... it seemed to do what you want! (I chose a message with embedded images and they were displayed correctly). I did the copy with the filename extension because MacOS X really really wants to use filename extensions as the identifier for a file type, and just doing "-file `mhpath cur`" didn't work; I took a guess on the ".eml" extension and it seemed to be correct. So I think this is a reasonable option. BTW, I do use w3m as a text-based HTML viewer and I set the option "display_link_number" so I can find the link in a message and then copy and paste it to a web browser, so that works for a message where I just need to find one link. --Ken
Re: Forwarding email
>I have added the line forw: -format to my ~/.mh_profile. > >Now I find if I do: > >forw -mime 33 > >I get that empty #forw line and nothing else. I know this has been mentioned several times, but when you get this line you STILL need to run the "mime" command at the What now? prompt before you send the draft. We could improve on this, I think, but that's the reality of today. >However if I simply do: > >forw 33 > >I see the draft, the forwarded mail go out correctly and i can read it >in my Proton account. This won't preserve the MIME structure of the message, and I'm not quite sure what would happen if it's a 8-bit character set (probably the wrong thing). If you find this works for you then I suppose it will be sufficient. --Ken
Re: Forwarding email
>If I forward to my iCloud account, it works. If I forward to my >Proton account what I get is an empty message with an attachment; the >attachment consisting of a string of numb...@eddie.fios-router data 8 KB > >I have stumbled upon the fact that if I use MH-E within emacs I >can forward in-line instead of as an attachment and that works. >Unfortunately, I can't seem to find how to use nmh to forward in-line. What does "forward in-line" mean, exactly? "forw -mime" does not mark it as an attachment; technically, it does not put a disposition at all. It may be that Proton Mail sees a MIME message without a disposition header as an attachment (disposition headers are optional; in theory the MUA is allowed to do whatever it wants with it). I believe that if you modify the forw line with a disposition, the right thing will happen. That would look like this: #forw [forwarded message] {inline} ... rest of forw line >Finally, using dist to send to the iCloud account works. Using dist to >send to Proton fails with: > >Rejected due to unmatching envelope and header sender. Sigh, THAT is completely wrong in that Proton Mail isn't looking at the Resent-From header. You could add an Envelope-From header to set the Envelope from header to match the original message, but that will probably fail some other anti-spam feature. --Ken
Re: Forwarding email
>Unfortunately, I am back again with the same issue. >[...] I went back and looked at your original email about this, which is here: https://lists.nongnu.org/archive/html/nmh-workers/2023-04/msg00083.html The original information you were given still holds true, in that doing this should work: forw -mime 42 [...] What now? mime What now? send Now you reported back then it "didn't work" to a Proton Email account. Drilling down into this, it seems that it made it to Proton but it wasn't viewable. Let's talk about what is going on here. When you run "forw -mime" it's inserting a mhbuild directive into the reply message. That's the line that begins with "#forw ...". The "mime" command runs mhbuild and you end up with a message that contains a message/rfc822 part. If you don't add anything before or after the "#forw" line, that's ALL the message contains. It looks like Proton Mail (I am presuming the web interface) doesn't quite deal with a message with a single message/rfc822 MIME part properly; it treats it as an "attachment" and you can't view the content. Which is unfortunate. You could try adding some text before the #forw line (this is before you run "mime"), and MAYBE that would work. But if your goal is to just forward a single message you COULD use dist(1); that will just pop up a draft where you can just enter the recipients' names and they will get an exact copy of the message (with the Resent-From and Resent-To headers from the dist(1) draft). Generally all web mail clients seem to do the right thing with those messages (but they don't all display the Resent-From header so it can be confusing). --Ken
Re: Unsupported nroff macros on MacOS X
>Not to prolong the agony, I tried the example on OSX for man tbl: > > .TS > tab(@); > ccc. > This@is@centered > Well,@this@also > .TE > >It didn't work with the nroff -man they supply. It did work with mandoc Silly question: what version of MacOS X do you have? On my Ventura install I get: % which nroff nroff: Command not found. (I did find a Monterey install and it does have an nroff on it) I think as long as whatever "man" ends up invoking on a particular system supports tbl(1) macros we are fine. And I believe that is true! Even on the aforementioned Monterey system if I put those contents in "foo.1" and run "man ./foo.1" then the right thing happens. --Ken
Re: Unsupported nroff macros on MacOS X
>>> Sorry if I jumped into the middle and missed something, but what about >>> using this to convert once? >>> >>> groff -Thtml > >> I guess my next question is ... what do we do after that? > >I thought if we ran it through with man (nroff/groff) to ascii, then we'd get >asciidoc format essentially. At which point there are tools that deal with >further transformations. I ... am not sure that is correct? The man page examples I see suggest it is closer to something like Markdown. E.g.: https://docs.asciidoctor.org/asciidoctor/latest/manpage-backend/ --Ken
Re: Unsupported nroff macros on MacOS X
>I don't see a tbl command on MacOS (or freebsd, except if you >installed groff (or plan9port -- ignore the troff comment!). At least on MacOS X, 'man tbl' actually works (but there is not a separate tbl command, true). The man page says: The tbl language formats tables. It is used within mdoc(7) and man(7) pages. This manual describes the subset of the tbl language accepted by the mandoc(1) utility. >Unless you mean replacing the current setup with the output of >equivalent tbl syntax and then checking that in? I tried that. >mandoc doesn't understand tbl output for the simple test i tried. > >/usr/local/bin/tbl <.TS >c c c . >This is centered >Well, this also >.TE >EOF It looks like to me that tbl format is handled directly by mandoc. E.g., this works: mandoc -man <
Re: Unsupported nroff macros on MacOS X
>Why not just add a note in man pages affected by the .fc problem >that if the tables are not properly lined up, the user must install >groff (or plan9ports, where you also get troff)? I don't necessarily object to this, but ... well, that troff request is weird. Like I'm still not quite sure what is going on there; I don't understand why you need that fill space. Also you manually need to specify the maximum cell size when you make those tables by hand. tbl (which seems like it has been supported ... forever?) does the hard work of creating tables for you. It seems like the right tool for the job; even I could figure it out. As Anthony has pointed out mandoc is the default man page formatter for BSD systems, now MacOS X, and some Linux distributions. Seems to me that switching to tbl for those things is a no-brainer unless I am missing something, which is always possible! --Ken
Re: Unsupported nroff macros on MacOS X
>> Pandoc is available in lxplus, aiadm and most RPM repositories. It's >> written in Haskell, which means it relies on hundreds of megabytes of >> library dependencies. > >That's certainly fair, but wouldn't it need to be used only once, after >which the documentation could be maintained in markdown format? I suppose >that would require a tool to go from markdown to man, but at least it's a >thought. Well, to me the PRIMARY way that nmh users interact with the documentation is via man pages, viewed with man(1). So being able to generate that format is crucial. If the point of a conversion to Markdown (or anything else) is just so my dumb ass doesn't have to learn random troff requests ... well, that seems like overkill. If it's so other people don't have to learn random troff requests, that's fair ... but if we could solve this just by switching to tbl for tables then that seems more sensible. My dumb ass was able to figure out how to convert those .fc requests to tbl tables in a short amount of time so maybe that makes the most sense. >>I have no objection to Markdown but I'm not sure what it would gain us >>exactly, other than maybe someone younger than 35 could edit the >>documentation. > >That may be the point -- or not, I suppose, depending on one's point of >view. (I'm far past the point of being under 35 myself, for what that's >worth.) I mean, yeah, I hear you. Same for me! I am just wondering out loud if any of these up-and-coming young punks are interested in editing man pages directly. Doug writes: >Anyway, I'm sure I'll have a few cycles that I can devote to documentation... We welcome the help! Really, any way anyone wants to contribute is welcome. >Anybody else remember the NPM debacle? Oh yes, that was a classic one! >> I have no objection to Markdown but I'm not sure what it would gain >> us exactly, other than maybe someone younger than 35 could edit the >> documentation. > >LOL. OK. 
I suppose I should comb the archives, but how many of us are >actually using nmh these days? I mean, I don't REALLY want to know >whether I'm as much of a dinosaur as I fear, but who are we maintaining >nmh for? It's a fair question! I mean, why does ANYONE maintain a particular open-source software package if they aren't getting paid for it? For me, I figure as long as I still use nmh, I'm gonna keep on developing it. --Ken
Re: Unsupported nroff macros on MacOS X
>>In a more practical sense, I am not sure there is anyone with the free >>cycles to convert the current man pages into some other markup language. > >This seems like the sort of thing that should be possible to automate, and >that question has been raised before. A quick search turned up the >following, among others: > > > https://stackoverflow.com/questions/13433903/convert-all-linux-man-pages-to-text-html-or-markdown > https://jeromebelleman.gitlab.io/posts/publishing/manpages/ I am ... concerned about depending on pandoc, because of this: Pandoc is available in lxplus, aiadm and most RPM repositories. It's written in Haskell, which means it relies on hundreds of megabytes of library dependencies. That screams to me, "this is going to bitrot at some point in the future". There seems to be a wealth of man/mdoc/roff converters and I feel like there is always going to be some kind of nroff converter, at least until Y2038. I have no objection to Markdown but I'm not sure what it would gain us exactly, other than maybe someone younger than 35 could edit the documentation. --Ken
Re: Unsupported nroff macros on MacOS X
>> I am kinda against depending on some third-party tool > >Where does built-in turn into third-party? With all the modern package >managers, it's trivial to install other tools as needed. That's a fair question! And one I struggle with. One thing that is common is we do make a distinction between what DEVELOPERS are required to have versus what end-users are required to have. E.g.: you need autoconf/automake/flex to build it from the git repo, but you do not need any of that to build from a distribution tar file. Something similar could be done with man pages. Nmh is a "traditional" tool and we generally have the goal of sticking to "traditional" support tools, but I acknowledge there is no good definition here and it's kind of arbitrary what is traditional and what isn't. In a more practical sense, I am not sure there is anyone with the free cycles to convert the current man pages into some other markup language. Yes we could convert the man pages to HTML, but I am not sure there's a way to convert back. If I am wrong please let me know. --Ken
Summary regarding NUL in email messages
I meant to send this out earlier, but I wanted to summarize what I think are the prevailing attitudes about NUL characters in email messages that I asked about in February: - It seems like NULs are just not that common in the real world, so no urgent changes are required - We should probably try to do better when code is updated/modernized. Any dissenters from this view? --Ken
Re: Unsupported nroff macros on MacOS X
>Sorry if I jumped into the middle and missed something, but what about >using this to convert once? > >groff -Thtml I guess my next question is ... what do we do after that? I am assuming that we still want to ship man pages; do we use some tool to convert them back? Do we have to make man page modifications to HTML? I am kinda against depending on some third-party tool; I'm not sure what the options are in this space. I was under the impression that most things go from man pages TO HTML, not the other way around, or they use some other markup format that can output man pages and HTML. --Ken
Re: Unsupported nroff macros on MacOS X
>The status quo is fine. It doesn't require understanding all of troff. >Just man(7) plus the odd bit here and there. Sigh. The "odd bit" unfortunately, for me, requires a lot of knowledge that seems to take some serious roff-fu. Let's take the example you gave where the first line for a man page that uses tbl should contain: '\" t So, my question is ... what does this mean? I understand that \" is a comment, but I'm confused about the leading single quote. As a random note, this string is rather hard to search for. Also, I don't really understand how single quotes are used in roff, I guess; all of the documentation seems to assume you already know this. And it's this way for me for EVERY SINGLE BIT of "odd bit"; there's a huge pile of knowledge that is assumed you know. It's not that I'm afraid of digging into hours-long rat holes; I write programs using OpenSSL, after all. It's just a lot to deal with when I am just trying to format a nice table. And I do it so infrequently that I have to re-learn it all every time I want to edit a man page. (I did play around with tbl, and it seems like that is actually very easy so I am thinking that Anthony is right and we should just switch to that). --Ken
Re: Unsupported nroff macros on MacOS X
>You could replace .fc plus *all* the man macros with all-mdoc macros in >all the nmh man pages. It's man ^ mdoc == 1. Ah, poop. I see what you mean; you can't mix and match macros across packages. Dang it. --Ken
Re: Unsupported nroff macros on MacOS X
>mandoc is a pain. It's one of many programs which attempt to interpret >man pages whilst being an incomplete implementation. I hang out in >places which like to talk about troff/nroff, including for man pages, >and mandoc's flaws crop up a lot. So I'll admit my ignorance here. What's the difference between mandoc and mdoc? It seems like mandoc is just the program that interprets nroff source? A lesser implementation, as you say. >Others have mentioned the mdoc macros. These are an alternative to man. >Their big selling point is semantic mark-up rather than presentation. >But they have their own problems, they're YA-standard, and I'd avoid >them. Stick with simple man macros and troff/nroff with ASCII, UTF-8, >and PostScript/PDF as targets. Sigh. Well, I'll admit my biases here. In my youth somehow I missed the troff boat; we used a lot of SunOS 4 and that didn't seem to have a complete troff implementation (nor any documentation), and everyone a bit older than me had System V experience which did come with a complete troff manual and somehow it seemed like everybody else had figured out troff, but other than modifying the occasional man page (which was mostly done by copying other man pages) I never did really grok troff. I realize that this is all solvable and the original troff manuals are available online. But ... well, I'm busy and all that, and I kind of view it like learning Morse code; yes, it would be cool, but I just don't have the time. I did a quick Google, and I found this answer which is similar to my question: https://unix.stackexchange.com/questions/391399/what-is-the-difference-between-the-mdoc-macro-set-for-troff-and-the-plain-man-ma Now obviously that person has their own biases, but this paragraph is very telling to me: Unless you're an experienced manpage author with a sound understanding of Roff grammar and the pipeline's mechanics, you really shouldn't be using anything other than mdoc for authoring your manual-pages. 
Veteran troff users may find mdoc to be needlessly verbose or restrictive, finding man to be lighter and less intrusive. However, these authors are experienced enough to know damn well what they're doing ― so unless you're a grizzled veteran, just stick to using mdoc. And that describes me exactly! The frustrating thing for _me_ is that doing simple stuff (like formatting command options) in man pages seems kinda complicated and mdoc has macros which deal with that exactly. Like it does all of the stuff I'd ever want to do in man pages. And it seems like it is everywhere now? Is there a reason to NOT use it? I am open to more input on this topic. My specific question is: should we replace the .fc macros in nmh man pages with mdoc macros? It seems like that should Just Work on all common platforms today. --Ken
Re: Unsupported nroff macros on MacOS X
> | Given my druthers I think I'd rather do the last one, since this kind > | of seems like a table! > >I would do it that way (now) too, either that way, or just use mdoc >primitives - an appropriate layout could probably be achieved using >the list macros (with tags) in compact mode. Fair enough! I'll take a look and see if I can convert those to either a mdoc primitive or something using tbl. Are we all in agreement that (when possible) we should change man pages over to mdoc and new man pages should be written in mdoc? --Ken
Re: mhl nocomponent
>It does seem like the size of the headers exceeds the size of the body >in a lot of cases :-) I mean ... yes? Doesn't seem like there's much we can do about that unfortunately. --Ken
Unsupported nroff macros on MacOS X
So I noticed that after an upgrade to MacOS X, I started getting this warning on certain nmh man pages:

    This manpage is not compatible with mandoc(1) and might display incorrectly.

After some digging, it turns out man(1) is a shell script and to make a long story short is running this command:

    mandoc -Tlint -Wunsupp

Which returns this:

    mandoc: whom.man:140:2: UNSUPP: unsupported roff request: fc

And some quick googling suggests that in fact the mandoc macros only support a subset of roff requests, and .fc ain't one of them.

I will admit that my roff-fu is not very good, but I took a look at this. It seems this is a common idiom for nmh man pages. Specifically (this is from packf(1) but it's similar everywhere else):

    .SH "PROFILE COMPONENTS"
    .fc ^ ~
    .nf
    .ta 2.4i
    .ta \w'ExtraBigProfileName 'u
    ^Path:~^To determine the user's nmh directory
    ^Current\-Folder:~^To find the default current folder
    ^Msg\-Protect:~^To set mode when creating a new `file'
    .fi

So as I vaguely understand it, the .fc line sets '^' as a field delimiter and '~' as the character to pad a field. .nf sets the following lines to no-fill mode. ".ta 2.4i" sets the tab stop to 2.4 inches. The following line sets the tab stop to the width of "ExtraBigProfileName" and 'u is the default horizontal span for the terminal (now that I look at it, I'm not sure why there are two .ta lines right after another). The following lines use '^' to delimit each field and the '~' to pad out each field. So you get something that is supposed to look like:

    PROFILE COMPONENTS
    Path:               To determine the user's nmh directory
    Current-Folder:     To find the default current folder
    Msg-Protect:        To set mode when creating a new `file'

But on MacOS X you now get:

    PROFILE COMPONENTS
    ^Path:~^To determine the user's nmh directory
    ^Current-Folder:~^To find the default current folder
    ^Msg-Protect:~^To set mode when creating a new `file'

This isn't wonderful and I'd like to fix it, but I'm not sure what to do.
Ideas include:

- Tell MacOS X users to install groff(1) (the man(1) script will try to call groff if it encounters a warning like this from mandoc)
- Switch to some other roff construct; I was under the impression that actual tabs would work? I'm not sure why tabs aren't used here.
- Switch to tbl(1) macros which as far as I can tell are supported by mandoc and seem to work everywhere.

Given my druthers I think I'd rather do the last one, since this kind of seems like a table! But I'd like to hear what other people think since I know there are people here with much greater roff-fu than I (I do not know when tbl(1) was created, but it wouldn't surprise me if MH predated it).

--Ken
Re: mhl nocomponent
>I use MH-E, which does it's header display/hiding outside of MH. In >general, I like to see extra headers, but they have gotten way out of >hand from the MS/Outlook space. We probably need a better list of >common "this is junk" header list that probably has to have wildcards in >it. Sigh. I mean, it seems like this has been expanding, and "everybody else" doesn't consider it a problem since every OTHER MUA only shows "relevant" headers by default. Even Robert Elz (who was a long-time proponent of showing those headers) finally gave up and deleted the extras:nocomponent line in his mhl file: https://lists.nongnu.org/archive/html/nmh-workers/2021-06/msg00025.html I am thinking maybe we should change the nmh default mhl files as well to recognize current reality (in that thread Ralph also chimed in and said the current default doesn't make sense). --Ken
Re: mhl nocomponent
>Thanks Ken. So my mhl.headers looks like this: >[...] I am wondering if you are, in fact, not using that mhl.headers file like David just suggested? It sure looks like you are not. It almost seems like your showproc is just "more". --Ken
Re: mhl nocomponent
>Hey all, been meaning to get to this for a while.
>
>Things are way more verbose in the current release in that all sorts
>of headers clog up my display. The extras:nocomponent only seems to
>cut down on a few components.

Well, I know this is confusing, but ... "extras:nocomponent" means "output everything else not explicitly ignored". The "nocomponent" means "Don't output the component part", which in this case would be "extras". Again, this is confusing. If you only want to show explicitly defined headers, you need to remove that line. If you want to show everything not explicitly ignored, leave it in and add more ignore lines.

Other possible changes that you might be seeing (note: I do not think any of this changed in this release):

- Something changed upstream to add more headers, probably anti-spam related
- You're getting more MIME messages which means mhshow is being called more and you didn't ignore more stuff in mhl.headers (which is what is used for headers when mhshow is used, and yes, this is also confusing).

--Ken
Re: %(addr{}) and RFC-invalid headers
>However, ISTM having addr output text that can be something other than its documented "contract" (an mbox@host or host!mbox) is perhaps a bit dodgy, even when the root issue is invalid (or unsupported) input. I don't know how the formatting engine works, but couldn't addr conceivably scan its own prospective output and error if it's not in the right shape? (This is idle speculation, not a request; and I've no idea whether such a change would be an unwelcome incompatibility for other folks. As I mentioned above, I think I can work around what addr appears to do.)

_Could_ it? W ... maybe? The "ap" command does this (and fmttest in -address mode), but what it does isn't suitable for general consumption. What IT does is directly try parsing the address; if that fails it sets a special "error" component to the error message it got from the address parser (oddly enough this is one of the few places you DO get an error from the address parser), and the default format program for ap and fmttest checks to see if the {error} component is set. Setting a special component for a general message is a bit tricky, especially since there are multiple headers that could be parsed and I don't like intruding into that namespace for a message.

What happens behind the scenes is a little more complicated; let's break it down a bit. Here's a very simple format program in its compiled form:

    % fmttest -dump -format '%(addr{text})'
    Instruction dump of format string:
    %(addr{text})
        PARSEADDR, c_name "text", c_type 0x1
        LS_ADDR, c_name "text", c_type 0x1
        STR
        DONE

Every time a format function is invoked which involves address parsing, it calls the PARSEADDR function; once a component is parsed it sets the CF_PARSED flag for that component, so further calls to PARSEADDR are a no-op and the address parser is only invoked once per component.
_If_ the address is parsed successfully, a field in the component structure is set to the value of the address parser return (the getm() function) which has all of the various address fields broken out. If address parsing FAILS, then the field in the component structure is set to something called fmt_mnull, which has every address field set to NULL.

Now what happens next is a little bit interesting. For the SPECIFIC cases of the %(addr) and %(friendly) functions (they end up emitting LS_ADDR and LS_FRIENDLY instructions respectively in addition to PARSEADDR), if the address parser has failed (using fmt_mnull) then the format engine will output the original text of the component. Everything ELSE will end up using the value from fmt_mnull, which is typically a NULL pointer, which means nothing will be output.

So the decision to always output the component text on address parsing error for %(addr) and %(friendly) was deliberate. I could see that making sense for %(friendly), but it's fair to point out that %(addr) is a bit of a tougher sell and is inconsistent with its stated output. But ... that's long-standing behavior and I am reluctant to change it. I'm open to discussion here.

I guess this boils down to a question of (a) Are there any changes we should make in the long run, and (b) are there things you can do today to make things better?

For (a), we should handle those addresses better. But should we make a format function that could detect a mis-parsed address? That seems straightforward to me; it might require a slight API extension internally but shouldn't be too bad. Something like %(addrerror{from}) that would return true if the address failed to parse correctly.

For (b), it DOES occur to me that you could use the feature of the return of NULL for most invalid address parts to test for a mis-parsed address. E.g., "%(mbox)" (which normally returns "user" for an email address of "user@host").
So you could do this: %<(mbox{text})%(addr{text})%|Address is borked%> Which I think would do what you want, today. Obviously put whatever you want for "Address is borked" and use the appropriate component for {text}. --Ken
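The fallback behaviour described in this message can be modelled in a few lines of C. This is only a toy sketch of the behaviour as described above, not nmh's code: `struct parsed_addr`, `model_addr()` and `model_mbox()` are hypothetical names, and a failed parse is represented by a structure whose fields are all NULL, in the spirit of fmt_mnull.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a parsed address; the field names are illustrative only
 * and do not match nmh's real structures. */
struct parsed_addr {
    const char *mbox;   /* "user" in user@host, NULL if parsing failed */
    const char *host;   /* "host" in user@host, NULL if parsing failed */
    const char *text;   /* "user@host" form, NULL if parsing failed */
};

/* Models %(addr) and %(friendly): when the parse failed, fall back to
 * the raw header text instead of printing nothing. */
const char *model_addr(const struct parsed_addr *pa, const char *raw)
{
    assert(pa != NULL && raw != NULL);
    return pa->text != NULL ? pa->text : raw;
}

/* Models the other address functions, e.g. %(mbox): a failed parse
 * yields NULL, which means nothing is output. */
const char *model_mbox(const struct parsed_addr *pa)
{
    assert(pa != NULL);
    return pa->mbox;
}
```

In this model the format test `%<(mbox{text})...%|...%>` corresponds to checking whether `model_mbox()` returns non-NULL before trusting what `model_addr()` returns.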
Re: %(addr{}) and RFC-invalid headers
>1. Could (should?) %(addr{}) be expected to be able to > extract the addr-spec part out of an invalid message header? Ummm no? If the address parser fails, well ... we're kind of stuck. Internally that's what makes all of the format engine address functions work. I'm not even sure what makes sense here; are you thinking if address parsing fails we should shift to some other, backup address parser? And if that one fails, what next? >2. Has anybody got general tips on dealing with slightly-invalid > messages in nmh? They seem somewhat unavoidable. Sigh. I hate to say it, but ... we get this wrong. "." in the "phrase" (which is the part before the email address) is officially valid as part of the "obsolete" syntax in RFC 5322. However, if you did get a RFC 2047-encoded From: header with an unencoded ".", THAT is unambiguously invalid. But it doesn't really solve the core issue here, in that we should be more accepting of "." in phrases and we aren't. Sigh. I realize that doesn't address your question unfortunately, and I don't have a good answer for you. --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>In the meantime, an occasional folder(1) -pack might solve the problem >manually. It did occur to me that if you did a folder -pack in the MH-E index folder and you had a numeric sub-folder then your sub-folder would change its name and I am not sure what that would do to the MH-E index. --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>while actual bytes of memory on my laptop are semi-precious, addresses
>in the address space are much less so. here's somebody who uses mmap(2)
>to allocate a huge chunk of address space, and then madvise(2) (a call i
>think i've never used) to have that chunk backed by (lots and lots of)
>zeroes.
>
>https://robert.ocallahan.org/2016/06/managing-vast-sparse-memory-on-linux.html

While that is interesting, I see some issues:

- MAP_NORESERVE is a Linux-specific feature of mmap(), as far as I can tell. I'm not opposed to OS-specific features but we'd need to think about it carefully.
- As for if it would help ... well, it depends on what you are doing. In the specific case of flist(1), it would probably help because one of the things folder_read() does is count up the total number of messages in a folder (mp->nummsg) and that's what flist uses. But if you tried to use scan(1) on that folder, well ... what scan(1) does is start at "lowmsg" and call does_exist() on every number between "lowmsg" and "highmsg" to determine if that message exists. And does_exist() is using the msgstat array to see if a message exists, so you'd be reading every single msgstat array member.

The bottom line is nmh (and MH before it) is just not going to perform well with billion-sized gaps in message numbers and fixing that is going to be very very hard.

--Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>a folder with the highest message number of "N" will cause the array to be >configured to support N messages, even if there are many fewer (perhaps >even one) messages No, that's not correct. If you have a single message in a folder with a count of 100, you only get one entry allocated. The number of entries allocated is based on the difference between the lowest and highest message number. >Scale the array based upon the number of directory entries in the folder. >This will over commit due to subfolders being counted, scratch files, and >deleted messages. It seems this would only over commit in interesting cases >by 3x (baseline of 1 covers the messages, the 2nd set is scratch files and >deleted messages, and 3 is subfolders). Short of malicious actions, you'd >end up with, maybe 5x (message, extracted parts of the message, deleted >message, folders that look like message numbers). If you want more >compactness, you take pains to dump the stuff that isn't a message number >(the aforementioned extracted parts and deleted messages). It's not filesystem internals that is the issue, it's (n)mh internals. Right now the msgstats array is indexed by taking the message number and subtracting the value of the lowest message number. Obviously there are much better ways to deal with this, but all of the nmh code directly accesses the msgstats array. And of course time is not infinite so someone who HAS time would have to roll up their sleeves to fix it. (A general assumption is that there are few holes in nmh message numbers and this is reflected in more locations than just this). --Ken
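As one illustration of the "much better ways" alluded to above, a sparse folder could keep a sorted array of only the message numbers that exist and look them up with a binary search, so memory scales with the number of messages rather than with the gap between lowest and highest. This is a sketch under that assumption, not a proposal for nmh's actual API; `sparse_find()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for bsearch() over an ascending int array. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Returns the position of msgnum in the sorted array nums[0..n-1],
 * or -1 if that message number is not present. Memory use depends on
 * n (messages that exist), not on the numeric range they span. */
long sparse_find(const int *nums, size_t n, int msgnum)
{
    assert(nums != NULL || n == 0);
    const int *hit = n ? bsearch(&msgnum, nums, n, sizeof *nums, cmp_int)
                       : NULL;
    return hit ? (long) (hit - nums) : -1;
}
```

A folder holding only messages 1 and 1000000 needs two entries here, whereas an array indexed by (message number - lowest message number) needs a million.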
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>From Ken's description above, these 111 messages would allocate almost >800,000 msgstat structures. I don't know how huge the message numbers >get in the results folder, but six digits is common. I don't recall if >I've seen seven digit or larger message numbers. I see Conrad pointed out that if you set "sort=date+" in your .mairixc then this resolves this issue (but I do not know if that has negative side effects or if that interacts badly with MH-E). This does suggest to me we should probably change the internal API so sparse message ranges are handled better; right now all of the programs access the folder structure members directly and assume that there will be a msgstat structure in every location in the array. Sigh. One more thing to add to the list. --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>> I think we have to push this back on the MH-E people; Robert's >> suggestion to add a non-numeric prefix to directories it creates sounds >> like the best answer to me. > >$ refile +31415 $ folder +31415 >31415+ has 1 message (1-1). I'm aware of that, but what happens if you have a subfolder that is all numeric? I believe all of the nmh tools will treat that subfolder as a message (that's the real issue). --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>it seems that at some point i had done a search for 74600607886815 (your
>basic "magic number" :). mh-e, i guess, had created a directory with
>that number as its name (it uses the search term to name subfolders
>under the normal mhe-index folder). and, i guess, flist decided that
>(under the ~/Mail/MHE-INDEX folder) was a message number?
>
>does that make sense? i guess mh-e could not create such subfolders
>with names consisting only of decimal integers (i have some
>hexadecimal-named folders which don't seem to give a problem). or, i
>could not search for such. or, maybe flist (or, nmh in general?) could
>not think that a directory was a message?

The loop in folder_read() that is scanning for messages is this:

    while ((dp = readdir (dd))) {
        if ((msgnum = m_atoi (dp->d_name)) && msgnum > 0) {
            [...]

So if the directory entry is a positive decimal integer, nmh (and MH before it) considers it a message. Robert already explained the issues involved; stat()ing every entry to determine if it was a file or not would be prohibitively slow (and this would affect every nmh program; almost everything calls folder_read()), and using d_type isn't portable.

I think we have to push this back on the MH-E people; Robert's suggestion to add a non-numeric prefix to directories it creates sounds like the best answer to me.

--Ken
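To make the "positive decimal integer means message" rule concrete, here is a small sketch of a name classifier. `name_to_msgnum()` is a hypothetical stand-in for m_atoi(), written from the description above rather than from nmh's source; in particular, its choice to return 0 when the value does not fit in an int is an assumption of this sketch and may not match what the real function does.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for m_atoi(): returns the value of a name made
 * only of decimal digits, or 0 for anything else. A result > 0 means
 * "this directory entry would be treated as a message". */
int name_to_msgnum(const char *name)
{
    int n = 0;

    assert(name != NULL);
    if (*name == '\0')
        return 0;
    for (const char *p = name; *p != '\0'; p++) {
        if (!isdigit((unsigned char) *p))
            return 0;               /* any non-digit: not a message */
        int digit = *p - '0';
        if (n > (INT_MAX - digit) / 10)
            return 0;               /* assumption: reject int overflow */
        n = n * 10 + digit;
    }
    return n;
}
```

Note that nothing here looks at whether the entry is a file or a directory, which is why an all-numeric subfolder name is indistinguishable from a message.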
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>ah, great! yes, that works. and, yes, to my ignorant eye, it appears
>that the call from `folder_read()` to `mh_xmalloc()` is where we are
>going south.
>[...]
>#2 0xf898 in mh_xmalloc (size=42269452928) at sbr/utils.c:38
>#3 0xacf6 in folder_read (name=0x555d5400
>"/home/minshall/Mail/mhe-index", lockflag=0) at sbr/folder_read.c:138

Exactly HOW many messages are in mhe-index? Ah, I think I see what's happening. That line is this:

    mp->msgstats = mh_xmalloc (MSGSTATSIZE(mp));

MSGSTATSIZE is defined as:

    #define MSGSTATSIZE(mp) ((mp)->num_msgstats * sizeof *(mp)->msgstats)

num_msgstats is set by the previous line:

    mp->num_msgstats = MSGSTATNUM (mp->lowoff, mp->hghoff);

Which is defined as:

    #define MSGSTATNUM(lo, hi) ((size_t) ((hi) - (lo) + 1))

So ... the summary here is that nmh (and MH before it) allocates a "message status" element for every possible message. The possible number of messages is the range between the LOWEST message number and the HIGHEST message number. So if you just had 1000 and 1002 in a folder, it would allocate 3 elements. But if you had 1 and 1000000, it would allocate a million elements.

A msgstat structure is an array of "struct bvector" which might be ... 8 + 8 + 16 bytes per message on a 64 bit platform. That suggests there are either 1320920404 messages in that folder (roughly 1.3 billion) or there's a huge message number gap (that has come up before when someone had a huge gap; my memory is that the consensus was that you just had to deal). In general nmh will try to handle messages and folders up to the virtual memory limit and it seems like you reached it.

--Ken
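The arithmetic behind that allocation can be checked with a short sketch. `msgstat_count()` mirrors the MSGSTATNUM macro quoted above; `msgstat_bytes()` and its `elem_size` parameter are hypothetical, and the 32 bytes per element used in the usage note is only the "8 + 8 + 16" guess from the message, not a measured structure size.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the MSGSTATNUM macro: one slot for every possible
 * message number in the inclusive range [lo, hi]. */
size_t msgstat_count(int lo, int hi)
{
    assert(lo > 0 && hi >= lo);
    return (size_t) (hi - lo) + 1;
}

/* Bytes needed if every slot costs elem_size bytes. elem_size is an
 * assumption supplied by the caller, not nmh's real element size. */
unsigned long long msgstat_bytes(int lo, int hi, size_t elem_size)
{
    return (unsigned long long) msgstat_count(lo, hi) * elem_size;
}
```

With an assumed 32 bytes per element, `msgstat_bytes(1, 1320920404, 32)` comes to 42269452928, the size seen in the mh_xmalloc() frame of the backtrace.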
Re: nmh 1.8 -- repeated Welcome message unless there is a context change
>I see that nmh commands are reading the $MHCONTEXT file, parsing the >line "Version: nmh-1.7.1" and printing the Welcome message, but not >updating the file unless there is a context change: In your .mh_profile you can put: Welcome: disable To disable the version checking completely. There's not a wonderful way of dealing with this if you are shuffling around contexts, unfortunately (we probably should do better at making sure the context is updated when we display the welcome message). --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>> If you run under the debugger, you should stop when you receive the >> signal from the OOM process. > >thanks. OOM is a pretty strange way to die... Sigh, I guess I was thinking that ptrace() would be able to catch a process killed by SIGKILL, but I guess not. Is there a long delay when you run flist? Do you have a lot of folders? Like a huge number? I see that there are arrays allocated based on the number of folders you have. I am just trying to figure out if there is a number of small allocations or large ones. You could also disable OOM completely; I suspect flist will just segfault when it hits the limit. Oh, wait, I see that using limit/ulimit and setting the "datasize" limit should cause a SIGSEGV when it hits that limit. So if you set that below the OOM limit that should make it easier to debug things. --Ken
Re: flist -- "Killed" -- oom (*not* 1.8 related)
>and, if not, any thoughts on how to debug? if i build "cc -g", any >thoughts on where to set breakpoints, or where to insert printf's, to >try to track this down? If you run under the debugger, you should stop when you receive the signal from the OOM process. That MIGHT be useful _if_ you hit the limit in the routine that is causing the memory leak, which is likely but not guaranteed. Otherwise you could run under valgrind (it should be available in the packaging system for your distro) which should very quickly tell you where memory is leaking. --Ken
Re: (Not-so) hypothetical question: What to do about NULs?
>While POP's LIST does actually include the size of the message in bytes, >that's prior to any CRLF mangling that happens so it cannot be used >reliably as a method for determining when to stop reading. Unfortunate. Right, but that's mostly because of the way multiline responses are handled in POP. It's never "read X bytes", it's "read lines until you get a line that is just .\r\n". With IMAP, it's "the next X bytes are the data you asked for". So you're used to dealing with "lines" and that lends itself to C strings. >I notice however, that some components of my email infrastructure pass >NULs through without problems and some do not. qmail successfully queued >a message with a NUL in both the header and the body, but other parts >(e.g. recipient validation tools) did not fare as well, and of course we >knew that inc would truncate (and it did because the lines with NUL were >truncated). I had an inkling popular MTAs would DTRT. --Ken
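The "read lines until you get a line that is just .\r\n" rule looks roughly like this in C. The function names are hypothetical; the terminator and the byte-stuffing rule (a data line that begins with "." is sent with an extra leading ".") are from the POP3 specification, RFC 1939.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True if line is the POP3 end-of-response marker: a lone "." + CRLF. */
int pop_is_terminator(const char *line)
{
    assert(line != NULL);
    return strcmp(line, ".\r\n") == 0;
}

/* Undo POP3 byte-stuffing: a data line that starts with "." had an
 * extra "." prepended by the server, so skip that first byte. */
const char *pop_unstuff(const char *line)
{
    assert(line != NULL);
    return (line[0] == '.' && !pop_is_terminator(line)) ? line + 1 : line;
}
```

Both helpers take a NUL-terminated string, which is exactly the habit described above: a line containing an embedded NUL would be cut short before either function ever saw the rest of it.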
Re: (Not-so) hypothetical question: What to do about NULs?
>When I was poking around in the POP code I didn't notice any special
>handling of NUL bytes. It's possible that this would result in
>truncation. If that's what we do now, I suspect it's alright to continue
>to do so; at least until we find legitimate emails in the wild that do
>not conform (again think 16M character lines).

Right, definitely the POP code doesn't handle this, and my quick check suggests we're not the only ones. However, it seems like a lot of IMAP implementations do better. I think that's due to the protocol; in POP when you retrieve a message it looks like:

    C: RETR 1
    S: +OK
    S: Line 1
    S: Line 2
    S: [...]
    S: .

So you're THINKING in lines so you tend to read a "line" until you get a line with the sentinel value (.\r\n). IMAP, on the other hand, looks like:

    C: a0001 FETCH 1 (RFC822)
    S: * 1 FETCH (RFC822 {1024}
    [... 1024 bytes of data follows ...]
    S: )

So you're told "I am sending this many bytes exactly", and you don't have to deal with "lines", so the implementations I've seen tend to call read() (or the equivalent) until they get the correct number of bytes, and because you're not dealing with "lines" you don't treat them as C strings.

Of course, RFC 3501 explicitly says:

    (3) The ASCII NUL character, %x00, MUST NOT be used at any time.

But you're not supposed to send 16MB lines either!

--Ken
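On the IMAP side, the byte count comes from the literal marker at the end of the response line. A minimal sketch of pulling that count out, assuming the line is already in a NUL-terminated buffer and ends with the marker (optionally followed by CRLF); `imap_literal_size()` is a hypothetical helper, not taken from nmh or any IMAP library, and it does no overflow checking.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* If line ends with an IMAP literal marker such as "{1024}" (optionally
 * followed by CRLF), store the byte count in *size and return 1.
 * Otherwise return 0 and leave *size untouched.
 * Sketch only: very large counts are not checked for overflow. */
int imap_literal_size(const char *line, size_t *size)
{
    assert(line != NULL && size != NULL);

    const char *brace = strrchr(line, '{');
    if (brace == NULL || !isdigit((unsigned char) brace[1]))
        return 0;

    size_t n = 0;
    const char *p = brace + 1;
    while (isdigit((unsigned char) *p)) {
        n = n * 10 + (size_t) (*p - '0');
        p++;
    }
    if (*p != '}')
        return 0;
    p++;
    if (*p != '\0' && strcmp(p, "\r\n") != 0)
        return 0;                   /* marker must end the line */

    *size = n;
    return 1;
}
```

Once the count is known, the caller reads exactly that many bytes into a buffer with an explicit length, so a NUL inside the data is just another byte.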
Re: (Not-so) hypothetical question: What to do about NULs?
>> I do not think this is relevant to this discussion, unless they >> are changing RFC 5322s position on NULs. > >But, it seems like a question that IETF could clarify. I don't see how further clarification is necessary here? I mean, a 16MB single line in email is clearly a MUST NOT, but people send them anyway. --Ken
Re: (Not-so) hypothetical question: What to do about NULs?
>> if a NUL appears in the header somewhere all bets are off.
>
>I think it would be fascinating to understand how that happened. Depending
>on how the parse tree is done, it could be marginally bad, or catastrophic.
>
>I really would be amazed if this is seen in the wild. But its a big
>network: maybe its out there?

Sigh. I don't really know if it has happened in the wild before (I will presume that it has), but that's not really my point. Let me try to explain it again.

I'm sitting down to write or modify nmh code. Right now we have a lot of code that assumes NUL-terminated C strings are safe to represent email everywhere. My question is: is that a valid assumption? If we are making that assumption, fine, let's be explicit and if someone DOES encounter a NUL in modern email, we tell them to suck it. If we all agree that is NOT a valid assumption, then fine, going forward we should eventually fix that, or target new APIs that fix that. If we agree that we should handle NULs in individual MIME parts but not handle them in message headers, fine, let's make that explicit. Then that begs the question of what we SHOULD do when we encounter a NUL in a message header. What I don't want is the current situation where we're kind of half-assing it and it works because NULs are extremely uncommon (unless we all agree that is fine).

So, I ask again: I encounter a NUL in an email. What do I do, exactly? Pseudocode is preferred in your response.

>The IETF "modern SMTP" stuff John Klensin is working on (with others) might
>want to talk to that: a lot of the ICANN UA stuff is a push for UTF-8 clean
>across the board.

I do not think this is relevant to this discussion, unless they are changing RFC 5322's position on NULs.

--Ken
Re: (Not-so) hypothetical question: What to do about NULs?
>I have received email with C-T-E set to binary. While I don't think it
>was needed, I haven't checked closely.

Fascinating! I am curious: who/what sent this to you? Do you remember the MIME type?

>> - Completely handle embedded NULs properly. This is arguably the most
>> correct option but would involve a lot of code changes.
>
>This might not be much of a lift. m_getfld might handle NULs in bodies,
>and the MIME parser comes close to handling them as well.

Well, I'm not SURE that's necessarily true. As you point out, that's only true for the bodies of message fields. And I see a lot of things in the code that assume the body of a message field is a valid C string, e.g. (mhparse.c):

    /* if necessary, get rest of field */
    while (state == FLDPLUS) {
        bufsz = sizeof buf;
        state = m_getfld2(&gstate, name, buf, &bufsz);
        vp = add (buf, vp); /* add to previous value */
    }

Also a lot of things (like MIME parameter parsing, address parsing, etc etc) assume C strings. I agree that if you get a binary part it looks like the right things will happen.

In terms of the networking code, it looks like the right thing will happen when sending a NUL via SMTP, but the POP code assumes that can't happen (as far as I can tell, this was true even before I switched things to the unified netsec code).

I guess what I was hoping for was a consensus on what we SHOULD do when we encounter a NUL byte, because I haven't heard that yet! Like what should the code do, precisely? It seems for message bodies we're in reasonable shape (unless you are RETRIEVING a message via POP), but if a NUL appears in the header somewhere all bets are off.

--Ken
Re: (Not-so) hypothetical question: What to do about NULs?
>Seems to me this is classifcation of attachment data, which will end up >as octet-stream in that case. It's ... a little confusing! >For S-nail we more or less do what Heirloom mailx has done. Well, it seems that in the message lexer if you encounter a NUL you just stop, from a_msg_scan(): cp = mslp->msl_cap->ca_arg.ca_str.s; if((c = *cp++) != '\0') break; It does look like to me that for IMAP and POP a NUL character is handled properly. But that doesn't answer the question, what do you THINK should happen? Should NULs be passed through? You basically can't use C strings anywhere if you want to handle embedded NULs. >The implementation is total crap. (longjmp codebase, data leaks, >blocking I/O, all that (it was).) All of these (mailbox read, >content-transfer decoding, character set conversion, .. display >preparation) should be "filters" with input and output plugged together, >with internal buffers as necessary. That is the v15 MIME and I/O layer >rewrite that is not happening for nine years. Sigh, I know the feeling :-/ --Ken
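The point about C strings can be shown directly: once a buffer contains a NUL, anything that relies on the terminator sees only the part before it, so the length has to travel with the data. A minimal sketch, with `struct span` and both helpers being hypothetical names rather than anything from nmh or S-nail:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A byte range that carries its own length instead of relying on a
 * terminating NUL. Hypothetical type, for illustration only. */
struct span {
    const char *data;
    size_t len;
};

/* Number of bytes a C-string function such as strlen() would see:
 * everything up to the first NUL, or the whole span if there is none. */
size_t span_cstring_len(struct span s)
{
    assert(s.data != NULL || s.len == 0);
    const char *nul = s.len ? memchr(s.data, '\0', s.len) : NULL;
    return nul ? (size_t) (nul - s.data) : s.len;
}

/* True if treating the span as a C string would silently lose data. */
int span_has_nul(struct span s)
{
    return span_cstring_len(s) < s.len;
}
```

For the five bytes `a b NUL c d`, a C-string view ends after two bytes, while the span still knows about all five.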
(Not-so) hypothetical question: What to do about NULs?
I've been idly thinking about this for a while, and while the question might be simple I think it gets at some larger meta-issues that we have never really agreed on how to resolve properly. My question is, simply:

What should happen when nmh encounters a NUL character (U+0000) in email?

The rules
---------

In theory, a NUL is never permitted in an email message. RFC 5322 (the latest incarnation of RFC 822) says in §4:

    Finally, certain characters that were formerly allowed in messages appear in this section. The NUL character (ASCII value 0) was once allowed, but is no longer for compatibility reasons.

However, in §4.1 a NUL character is added to the BNF for obs-utext and obs-body, so in THEORY you are supposed to handle that if you handle obsolete messages. §4 also says:

    Note: This section identifies syntactic forms that any implementation MUST reasonably interpret. However, there are certainly Internet messages that do not conform to even the additional syntax given in this section. The fact that a particular form does not appear in any section of this document is not justification for computer programs to crash or for malformed data to be irretrievably lost by any implementation. It is up to the implementation to deal with messages robustly.

RFC 5322 punts some of the message syntax back to the MIME RFCs. The "binary" content transfer encoding does allow any octet including NUL characters. But RFC 2045 says in §6.2:

    Mail transport for unencoded 8bit data is defined in RFC 1652. As of the initial publication of this document, there are no standardized Internet mail transports for which it is legitimate to include unencoded binary data in mail bodies. Thus there are no circumstances in which the "binary" Content-Transfer-Encoding is actually valid in Internet mail. However, in the event that binary mail transport becomes a reality in Internet mail, or when MIME is used in conjunction with any other binary-capable mail transport mechanism, binary bodies must be labelled as such using this mechanism.

RFC 9051 (IMAP4rev2) says in §4.3.1:

    IMAP4rev2 is compatible with [I18N-HDRS]. As a result, the identified charset for header-field values with 8-bit content is UTF-8 [UTF-8]. IMAP4rev2 implementations MUST accept and MAY transmit [UTF-8] text in quoted-strings as long as the string does not contain NUL, CR, or LF. This differs from IMAP4rev1 implementations.

    Although a BINARY content transfer encoding is defined, unencoded binary strings are not permitted, unless returned in a <literal8> in response to a BINARY.PEEK[<section-binary>]<<partial>> or BINARY[<section-binary>]<<partial>> FETCH data item. A "binary string" is any string with NUL characters. A string with an excessive amount of CTL characters MAY also be considered to be binary. Unless returned in response to BINARY.PEEK[...]/BINARY[...] FETCH, client and server implementations MUST encode binary data into a textual form, such as base64, before transmitting the data.

So it's ... a bit wishy-washy, but I think the case for NUL not being valid is mostly okay. IMAP, at least, says you can't send a NUL unless you are getting a BINARY response with the special literal8 response format (and BINARY is not defined in RFC 3501).

Messages in the real world
--------------------------

While other rules seem to be violated with impunity (see: 16MB single lines) I am not aware of bare NULs commonly being sent in email messages today. Also, I am not aware of "binary" being used as a C-T-E at all. Now, I could be COMPLETELY wrong about this! It would be interesting to hear about use of the binary CTE or other occurrences of NUL characters in the wild. My impression is that if you are getting binary data, it is universally encoded with base64; that is something everyone seems to be doing. And a NUL character doesn't seem to be valid in non-ASCII character sets as anything other than a NUL.

How other mail programs deal with NULs
--------------------------------------

I was curious, so I took a look. I tried to look at "modern" mail programs, and by that I mean, "Seems to be kept up to date". Which sadly excludes Heirloom mailx as it seems to have had its last release in 2005. I am open to hearing about what other mail programs do.

- fetchmail

  Fetchmail unceremoniously just smashes any NUL characters it sees, so if you are retrieving messages using fetchmail you never see any NUL characters. From transact.c:

      /*
       * Smash out any NULs, they could wreak havoc later on.
       * Some network stacks seem to generate these at random,
       * especially (according to reports) at the beginning of the
       * first read. NULs are illegal in RFC822 format.
       */

  You might get a special header warning you that a message had an embedded NUL, though.

- alpine

  Internally alpine (which uses a lot of c-client)
Re: nmh 1.8-RC3?
>> Has anyone tried 1.8-RC3 on a BSD platform? If good, any objection to >> releasing 1.8 soon? > >Unless there's an objection or discovery of a problem, I'd like to >release 1.8 this weekend. Just a minor note: I tested nmh 1.8-RC3 on MacOS X (which I know was already tested) but I also tested GSSAPI/TLS support for sending/receiving (TLS was tested with OpenSSL 3). Works fine! I see no reason to not release 1.8. --Ken
Re: nmh 1.8RC2, xlbiff, and $HOME
>Ken, you, and David all seem irked by unset and set-but-empty being >treated differently, as if you'd like a binary outcome by first >conflating the ternary input to binary. Sigh. I'll try to make my point clearer, but I recognize we're not going to agree on this. Since David is driving this release, I defer to his decision in terms of changing the behavior for the next release candidate (I would vote for making the behavior of unset and empty be the same, but I consider David the God Emperor of nmh 1.8, so it's up to him). You ask the question: What is the user trying to achieve by explicitly setting an empty HOME? To me, the answer is obvious: they were trying to create a clean environment. You say that they could do that by unsetting HOME; well, count me in the pool of people who did not know until now that 'unset' even existed. I mean, I vaguely knew that there was a difference between a variable that was never used and one that was set to an empty string, but as far as I could tell there wasn't a PRACTICAL difference between those two states and I certainly wouldn't expect an application to treat those states differently. I suspect the person who wrote that test for xlbiff didn't know about 'unset' either and would be bewildered that there is any difference in behavior. Also, some quick testing suggests that: % HOME='' command Does the expected thing but: % unset HOME command Simply unsets the HOME and 'command' variables. I realize you could do that in a subshell but it just seems awkward, and the idiom of doing % FOO=value command to change an environment variable is very common and I see why people would just use that. I have read your arguments carefully and ... well, I still do not agree that there should be a difference in behavior between an unset HOME and an empty HOME. To me there is no ambiguity and they are the same, even if it's a technically different variable state. Again, I realize we're just not going to agree on this. --Ken
Re: nmh 1.8RC2, xlbiff, and $HOME
>What's the intent of an empty HOME? >Is it set by accident when it's meant to be unset? >Is it empty by accident when it's meant to be non-empty? >Do they want HOME=/, HOME=$PWD, or are they expecting it to error. >Any choice could be not what the user intended so exit. I mean ... you could say the exact same things of an unset HOME! My point is that I cannot see a reason to treat unset HOME and empty HOME differently in nmh; if unset HOME uses pw_dir, then I would argue that an empty HOME should do the same thing. Bakul says: >FWIW, this is how /bin/sh behaves on FreeBSD: >[...] Fair enough; I was curious and some system call traces suggest that what happens under the hood is that when 'cd' is called with an empty HOME sh explicitly calls chdir(getcwd()). But to me that still doesn't make a case that empty HOME and unset HOME should be treated differently in nmh. I recognize that this is a case where reasonable people can disagree. --Ken
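In shell terms, the two policies being argued here differ by a single colon in parameter expansion. A rough sketch follows; the function names and the fallback argument are illustrative stand-ins for the pw_dir lookup, not nmh's actual code.

```shell
# pick_home_same FALLBACK
# Policy argued for above: unset HOME and empty HOME both use FALLBACK.
pick_home_same() {
    printf '%s\n' "${HOME:-$1}"
}

# pick_home_distinct FALLBACK
# Alternative policy: only an unset HOME uses FALLBACK; an empty HOME
# is passed through as an empty string.
pick_home_distinct() {
    printf '%s\n' "${HOME-$1}"
}

# Example with a hypothetical passwd directory of /home/ken:
#   ( HOME='';    pick_home_same /home/ken )      prints /home/ken
#   ( HOME='';    pick_home_distinct /home/ken )  prints an empty line
#   ( unset HOME; pick_home_distinct /home/ken )  prints /home/ken
```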
Re: nmh 1.8RC2, xlbiff, and $HOME
>So an unset HOME is allowed by this function, it's an empty HOME which >isn't. It strikes me as strange that there is a difference between an unset HOME and an empty HOME in terms of behavior. I mean, yes, I can see how the code is written, the historical precedent and how we got here, but ... well, I'm trying to understand the justification for treating those differently. --Ken
Re: nmh 1.8RC2, xlbiff, and $HOME
>$ printf 'Path: /tmp\n' > /tmp/mh-profile-minimal >$ HOME= MH=/tmp/mh-profile-minimal /usr/bin/mh/mhparam path Thank you for the analysis. I am wondering, though ... WHY does xlbiff set HOME to '' for this test? (I am neutral on whether or not this is technically a regression; I can see it both ways. Alexander does point out that HOME is supposed to be valid according to POSIX, but I am not 100% sure that means it can't be a zero-length string). --Ken
Re: 1.8RC2?
>> I wanted to test it on MacOS X. > >I did. Success both with debug and non-debug builds. > >> But ... did we ever get a resolution on the long lines POP patch? > >No. How about we defer to post-1.8? Can we tentatively say that it's targeted for 1.8.1? --Ken
Re: 1.8RC2?
>If all goes well, I hope to release 1.8 within a week. I wanted to test it on MacOS X. But ... did we ever get a resolution on the long lines POP patch? --Ken
Re: [nmh-commits] [SCM] The nmh Mail Handling System branch, master, updated. 1.8-RC1-9-g68228e3c
>Agreed, it doesn't. They arrive as valid UTF-8 here which show just >fine so I hadn't noticed a problem, but it's clearly wrong. I expect >it's a bug in the script but have forgotten where is it to be found. I believe it is under the control of savannah. I am not sure it is really worth doing anything about now. --Ken
Re: [nmh-commits] [SCM] The nmh Mail Handling System branch, master, updated. 1.8-RC1-9-g68228e3c
Ralph, I've noticed recently that you've been putting UTF-8 characters in commit messages. E.g: man/burst.man: re-word to avoid ‘digestifying’, etc. I'm personally fine with that, but when the email is sent out about the change I get them a little mangled because the script that notifies about changes doesn't mark that email as UTF-8. It looks like you CAN mark the encoding of commits using the config variable i18n.commitEncoding, although this suggests the default is UTF-8 and when I look at all the commit encodings they are blank, so maybe I don't quite understand it? I used this as a reference: https://www.git-tower.com/help/guides/faq-and-tips/faq/encoding/windows I don't know if that would change the encoding marked in the notification email. I don't have any strong feelings about this; if we were picking one character set for commit messages it would obviously be UTF-8, but I don't know if we ever discussed that. --Ken
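On the blank encodings: as far as I understand the git documentation, a commit only carries an encoding header when i18n.commitEncoding names something other than UTF-8, and a missing header means UTF-8, so blank is what one would expect here. A small sketch for checking both; the function names are made up, and %e is git's pretty-format placeholder for the recorded encoding.

```shell
# show_commit_encodings [REPO] [COUNT]
# Print the abbreviated hash and recorded encoding of recent commits.
# An empty pair of brackets means no encoding header was stored.
show_commit_encodings() {
    repo=${1:-.}
    count=${2:-5}
    git -C "$repo" log -n "$count" --format='%h encoding=[%e]'
}

# configured_commit_encoding [REPO]
# Print the configured i18n.commitEncoding, or a note if it is not set.
configured_commit_encoding() {
    git -C "${1:-.}" config --get i18n.commitEncoding ||
        echo '(not set; git treats commits as UTF-8)'
}
```

Whether setting the variable would change the charset the Savannah notification script puts on its email is a separate question that this does not answer.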
Re: nmh 1.8?
>Has anyone had a chance to review my proposed changes to inc to be able >to handle long lines from POP sources? While it's not common (most big >email providers like Hotmail, Gmail, etc, all conform to RFCs), there >are occasional emails (mostly from online web stores with shoddy >software) that do send out non-conforming emails. Oof, that fell off of my personal radar. But yes, we totally should get that in for 1.8. --Ken
Plans for distribution updates
Everyone, So now that we've started the release cycle process (thanks, David!) I am wondering what the plans are for getting 1.8 packages into various distributions. I did the Homebrew formula for MacOS X and I'm glad to do it for 1.8. But I am wondering what other operating systems we should target. Ones that come to mind are: - Various RPM-based distributions (Fedora, RedHat, CentOS, Rocky, and I am sure others) - Debian (and anything else that uses .deb files) - FreeBSD/NetBSD/OpenBSD 'ports' systems I am sure there are others. I guess I am wondering what needs to be done to "turn the crank" so 1.8 makes it into the packaging distributions. I know for Homebrew once the pull request is accepted all Homebrew users should get the new version relatively quickly, and I think the ports systems are similar but I don't know what needs to happen for the others. --Ken
Re: nmh 1.8?
>I just did an upgrade at home to MacOS X Ventura; let me make sure the >test suite passes and there are no obvious issues there. Oof, wait. I just did a "make distcheck" and I get: depbase=`echo uip/mhical.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\ cc -DHAVE_CONFIG_H -I. -I../.. -g -O2 -Wall -Wextra -MT uip/mhical.o -MD -MP -MF $depbase.Tpo -c -o uip/mhical.o uip/mhical.c &&\ mv -f $depbase.Tpo $depbase.Po clang: error: no such file or directory: 'uip/mhical.c' clang: error: no input files I just committed a fix. --Ken
Re: nmh 1.8?
>>Do you or anyone else have anything else you'd like to put in before >>starting the 1.8 release cycle? > >I just did an upgrade at home to MacOS X Ventura; let me make sure the >test suite passes and there are no obvious issues there. I just did that, and it builds fine and passes the test suite fine. I _can_ think of things I'd like to get in before 1.8, but I'd rather not hold things up. I say, "Start the engines!" --Ken
Re: nmh 1.8?
>Ralph, your last round of changes look good to me. HEAD builds and >tests cleanly for me on Fedora, Solaris 11, and Cygwin. > >Do you or anyone else have anything else you'd like to put in before >starting the 1.8 release cycle? I just did an upgrade at home to MacOS X Ventura; let me make sure the test suite passes and there are no obvious issues there. --Ken
Re: nmh 1.8?
>I would think that finding two plain text files with the same MD5 >that a mail message receiver finds an acceptable read is rather >unlikely though. (Just generally speaking. CRUX Linux for >example uses signify for package checksums, but still generates >MD5 checksums as a fallback. CRC32 is also used still, but noone >would claim it is secure.) I don't doubt accidental collisions are unlikely even with MD5. But it gets back to the core question ... what are you trying to accomplish, EXACTLY? If you're the only one sending out Content-MD5 headers and no one verifies them (the default in older versions of nmh was to neither generate nor verify them), then you have no MD5 hashes to verify on incoming emails, nor is anyone verifying the ones you send out. So what, exactly, is the purpose of it? If the cost of keeping it around was low I wouldn't care so much. But the implementation was buried deeply in the MIME encoding and decoding routines. The long-term cost was high. If there is something I am missing about this, please, let me know! --Ken
Re: nmh 1.8?
>> Seems like they should maybe emit warnings for a release > >Yes, i.e. be deprecated. But that ship has sailed and I don't think Ken >would be pleased if I added them back in so they could be deprecated. Sigh. Ralph, you and I don't agree on Content-MD5, which is fine. But I have to point out that as far as I can tell nmh is (was) the only MUA that generates that header or checks it. I'm not even sure we calculate the digest correctly for text types; it was a mess in terms of implementation, _and_ MD5 is Officially Considered Broken. Calling the removal of it a security flaw seems ... well, inaccurate at best. Also, that header specified a hash algorithm, not an HMAC, so even if the algorithm wasn't broken, it wasn't keyed, so an attacker could simply modify the header to match the modified content. From my perspective making -check/-nocheck a NOP has roughly the same security properties as implementing Content-MD5. I'm fine with removing the flags completely so people are forced to remove those flags from config files, or leaving them as NOPs. As I've said before, if there are arguments FOR Content-MD5, I'm willing to hear them. But here's what I said when I removed it and I think everything in there still stands: https://lists.nongnu.org/archive/html/nmh-workers/2019-07/msg00106.html Ralph, I know you said that you think it's useful to check to see if messages get mangled or corrupted, but if you're the only one who generates or checks that header then I don't see how that will work. I think Content-MD5 was just something that wasn't thought out very well when it was created. (I do want to implement S/MIME and real GPG support at some point, which would actually be useful and have some real security properties, but ... sigh, lack of time). --Ken
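To make the "not keyed" point concrete, here is a small sketch assuming the openssl command-line tool is available. The function name content_md5 and the sample bodies are invented for illustration. It hashes the bytes exactly as given and ignores the canonical-form question for text types raised above.

```shell
# content_md5: read a body on stdin and print the base64 encoding of its
# raw MD5 digest, which is the value format a Content-MD5 header carries.
content_md5() {
    openssl dgst -md5 -binary | openssl base64
}

original='Pay Alice 10 dollars'
tampered='Pay Mallory 1000 dollars'

# The sender computes the header value from the original body.
printf '%s' "$original" | content_md5

# Anyone who alters the body in transit can run the same command on the
# altered body and replace the header, because no secret key is involved.
printf '%s' "$tampered" | content_md5
```

A receiver checking the second value against the tampered body would see a match, so the header can at best catch accidental damage.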
Re: nmh 1.8?
>Greetings as we approach the new year. > >It's been a long time since nmh 1.7.1 was released, March 2018 to be >specific. What does everyone think of pushing out a 1.8 soon? Here >are changes since 1.7.1: > >https://git.savannah.nongnu.org/cgit/nmh.git/plain/docs/pending-release-notes > >While Ken has a worthy wish list at >https://lists.gnu.org/archive/html/nmh-workers/2019-05/msg0.html >and maybe more, I've reached the point where I don't think that we >should hold up a release any longer. Yeah, I'm with you. I even have some small fixes but ... I don't have the free cycles right now. So my vote is "yes". --Ken
Re: Question as I haven't been paying attention
>Just upgraded my system to FC37 which incldes nmh 1.7.1. >show now runs everything through more which I hate. >Can't seem to disable it, even with --showproc cat. >Can someone save the trouble of having to figure this >out from the source code? Geez Jon, what version of nmh were you using before? I suspect you're running into the case where if show encounters a MIME message it runs it through mhshow. mhshow doesn't support -showproc, because it's kind of doing the work of showproc directly. But setting "moreproc: cat" in your profile should work (it does; I just tested it). What mhshow ends up doing is kind of complicated (because MIME is complicated) so overriding showproc wouldn't make sense for it, but I could see a case for providing a switch to override the default pager. If you wish to disable the automatic running of mhshow by show, you can use -nocheckmime and that should get you back to show handling messages directly if that's what you prefer. I think this is even all in the man pages, so you wouldn't need to look in the source code unless you really wanted to :-) --Ken
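For reference, the two workarounds above might look like this from a shell. This is only a sketch: the helper name is made up, and it assumes the profile is the file named by $MH if that is set, otherwise ~/.mh_profile.

```shell
# set_moreproc_cat PROFILE
# Append "moreproc: cat" to PROFILE unless it already has a moreproc entry.
set_moreproc_cat() {
    profile=$1
    if grep -qi '^moreproc:' "$profile" 2>/dev/null; then
        return 0
    fi
    # If the file exists and its last line lacks a newline, add one so
    # the new entry starts on its own line.
    if [ -s "$profile" ] && [ -n "$(tail -c 1 "$profile")" ]; then
        echo >> "$profile"
    fi
    printf 'moreproc: cat\n' >> "$profile"
}

# Typical use (not run here):
#   set_moreproc_cat "${MH:-$HOME/.mh_profile}"
#   show -nocheckmime     # have show handle the message itself
```

If the profile already sets moreproc to something else, the helper leaves it alone and the entry needs editing by hand.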
Re: nmh setup on macos by newbie
>> There are two ways of establishing a TLS connection to a server: > >> I've suggested the first method in the hope your server supports >> that. If it doesn't seem to, at least on the port you're trying, >> then inc has a -tls option which attempts the second method. > > inc: -tls unknown Ralph was SLIGHTLY optimistic with his advice. I didn't implement -tls for inc for 1.7 because at the time I didn't have access to a POP server that supported the STLS command (it was implemented post-1.7.1; sigh, we need a new release). Most ISPs who run POP servers also support initial TLS on a separate port (995). You might try that. >so for now I use fetchmail to get the mail from 2 ISP's and procmail >delivers to inbox (and the lists folders) and then mh reads from there >and sends mail (which works fine with msmtp). > >Should I insist on getting inc to fetch from the remote host? Or is the >fetchmail setup good enough? (considering that inc would have to fetch >from 6 email accounts across those 2 domains.) That's up to you. Personally I prefer having as few moving parts as possible and your setup has a lot of pieces. But plenty of people do what you do. >... it seems that inc does not see new mail that procmail 'delivered' >into inbox - only when I us C-u M-x mh-rmail and choose the inbox folder >with 'all' do I get to see all mails, including the unread ones. It is important to understand the purpose of inc; it takes messages from a maildrop (e.g., a POP server, a spool file) and places them into a MH mailbox. As I understand it, you're having procmail copy the messages directly into a MH mailbox. That's totally your choice, but if that is what is happening then of course inc doesn't see them. They should be visible with other MH tools (e.g., scan). 
>And right now after processing the mail, I have 4 mails that are left >in the inbox folder (where they should be) but in Dired I can also see >153 other mails that have a comma prepended to their number filename and >they are not in the mh-folder view of inbox ... Comma-prepended filenames are messages that have been removed with rmm. --Ken
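A quick way to see how many of those rmm leftovers are sitting in a folder directory, as a sketch only. The function name is invented, and it assumes removed messages are named with a comma followed by the message number, as described above.

```shell
# count_removed DIR
# Count files in DIR whose names start with a comma followed by a digit.
count_removed() {
    dir=$1
    n=0
    for f in "$dir"/,[0-9]*; do
        # When nothing matches, the pattern is left unexpanded, so check
        # that the path really exists before counting it.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Typical use, with ~/Mail/inbox as a hypothetical folder path:
#   count_removed "$HOME/Mail/inbox"
```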