How to debug 'ignoring non-mail file' issues
On 2014-09-01 09:41:06, Perttu Luukko wrote:
> Yes, that indeed works. I'll probably move these ignored files to a
> separate folder for inspection.

I looked at the mails that are still ignored after upgrading GMime to
the latest version, and I think I have found what they have in common.

All of my ignored emails are from 2010-2011, and for some reason these
mails contain a line like this:

>From username Wed Sep 28 16:43:49 2011

somewhere among the headers. Note the '>' at the beginning of the line.
The mails that are still ignored after upgrading GMime are those where
this line happens to be the first line. Also, all of them have
attachments for some reason.

That line certainly doesn't look right, and I don't know where it came
from. It might be some byproduct of mail redirection, since it shows my
username, but the mails are not sent by me.

I moved these problematic lines to the second line of each message, and
now they are imported without problems. I probably won't file a bug for
GMime because I have no idea whether this is just some oddity caused by
my mail setup. Let this information reside here in case someone else
has a similar problem.

--
Perttu
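The manual fix described above (moving a leading '>From ' line down to the second line) could be scripted roughly as follows. This is only a sketch under assumptions: the function names and the ESCAPED_FROM constant are made up for illustration, and nothing here is part of notmuch or GMime.

```python
from pathlib import Path

ESCAPED_FROM = b">From "


def starts_with_escaped_from(raw: bytes) -> bool:
    """Return True if the very first line of the message begins with '>From '."""
    return raw.startswith(ESCAPED_FROM)


def demote_first_line(raw: bytes) -> bytes:
    """Swap the first two lines so the '>From ' line is no longer first.

    Caveat: if the original second line is a header with folded
    continuation lines, the swap separates it from them; inspect such
    files by hand instead.
    """
    lines = raw.splitlines(keepends=True)
    if len(lines) < 2 or not starts_with_escaped_from(raw):
        return raw
    lines[0], lines[1] = lines[1], lines[0]
    return b"".join(lines)


def find_affected(maildir: Path):
    """Yield files under maildir whose first bytes are '>From '."""
    for path in maildir.rglob("*"):
        if path.is_file():
            with path.open("rb") as fh:
                if fh.read(len(ESCAPED_FROM)) == ESCAPED_FROM:
                    yield path
```

To use it, iterate over find_affected(Path.home() / "Maildir"), review the list, and only then write demote_first_line(path.read_bytes()) back, ideally after making a backup of the mail store.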
How to debug 'ignoring non-mail file' issues
On 2014-09-03 19:03:40, Jani Nikula wrote:
> On Wed, 03 Sep 2014, Perttu Luukko wrote:
> > What I mean is that there would be a separate error for cases "Does
> > not resemble an email message at all", i.e., some control file your
> > mail server happens to store in the mailbox, and "Looks like mail but
> > we can't parse it", i.e., better find out why it can't be parsed to
> > avoid potentially important messages going missing from the database.
>
> As I said, GMime does not tell us the difference between the two.

There could be a separate parsing step that reads the first kilobyte or
so and checks whether it is text, and whether there is a line starting
with "From: " and possibly other headers. This could be run only if
GMime thinks the file is not mail, so there would be negligible
overhead.

This is just a suggestion. Notmuch users are probably quite experienced
so they can always investigate on their own why their emails are being
ignored. But there could be more warning about ignored messages.
Something like, at the end of each 'notmuch new' output: "Note: some
files were ignored as non-mail. Check the list at
~/mail/.notmuch/ignored-files and adjust your ~/.notmuch-config".

--
Perttu
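The extra check suggested above might look something like the sketch below. It is a heuristic written for illustration only: looks_like_mail and HEADER_RE are hypothetical names, the "at least two header-like lines" threshold is an arbitrary assumption, and notmuch does not implement this.

```python
import re

# A header field name (printable ASCII except ':') followed by a colon,
# at the start of a line.
HEADER_RE = re.compile(rb"^[!-9;-~]+:", re.MULTILINE)
FROM_RE = re.compile(rb"^From:", re.MULTILINE | re.IGNORECASE)


def looks_like_mail(head: bytes) -> bool:
    """Guess whether the first kilobyte of a file resembles an email.

    The file must contain no NUL bytes (a crude "is it text" test), must
    have a line starting with "From:", and must have at least two
    header-like lines in total.
    """
    head = head[:1024]
    if b"\x00" in head:
        return False
    if FROM_RE.search(head) is None:
        return False
    return len(HEADER_RE.findall(head)) >= 2
```

A caller would read the first 1024 bytes of a file that GMime rejected and print a louder warning when looks_like_mail returns True, since that is the "looks like mail but we can't parse it" case.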
How to debug 'ignoring non-mail file' issues
On 2014-09-02 23:37:12, Jani Nikula wrote:
> On Mon, 01 Sep 2014, Perttu Luukko wrote:
> > Yes, upgrading to GMime 2.6.20 caused all the messages on my server
> > to be classified as mail.
>
> What was the old version? If it was 2.4 we should probably consider
> dropping support for that in future notmuch.

It was 2.4.33. It might still work for other people, I don't know.

I still have some ignored mails. If I can nail down why they are ignored
we might know more about why GMime 2.4 ignored even more mail. They were
from around the same time period, so it might have something to do with
the email setup I had at that time.

> > Even more reason to give a separate warning for GMime parse errors.
>
> Not sure. We only get a binary success/fail from GMime, and that gets
> printed for all non-email files. I'm not sure it's helpful.

What I mean is that there would be a separate error for cases "Does not
resemble an email message at all", i.e., some control file your mail
server happens to store in the mailbox, and "Looks like mail but we
can't parse it", i.e., better find out why it can't be parsed to avoid
potentially important messages going missing from the database.

--
Perttu
How to debug 'ignoring non-mail file' issues
On 2014-09-01 09:52:20, Perttu Luukko wrote:
> If the files really are ignored because of GMime it also explains why so
> many more files are ignored on my mail provider's server than on my
> laptop. The server probably has an older version of GMime. I'll upgrade
> and see if that makes a difference.

Yes, upgrading to GMime 2.6.20 caused all the messages on my server to
be classified as mail. Even more reason to give a separate warning for
GMime parse errors.

I'll see if my archive of older emails still contains some ignored
files.

--
Perttu
How to debug 'ignoring non-mail file' issues
On 2014-08-31 07:41:42, David Bremner wrote:
> Perttu Luukko writes:
> > The vast majority of these ignored mails are not ignored after I
> > transfer them with offlineimap to another computer. I can non-ignore
> > these files probably by copying the renamed file back to the mail
> > server, so this is fixable. Offlineimap shouldn't mess with the file's
> > contents, so is there something that can cause notmuch to ignore a file
> > based on its name?
>
> The most likely cause is that the files are mboxes, whether intentional
> or not. In particular if they start with a "From " (note the lack of :)
> and contain a second "From " at the beginning of a line later in the
> file. In this case something like sed can replace the initial
> "From " with "X-Envelope-From: ".
>
> I agree that the error message could be more informative in this case.

No, the mails do contain "From: " with the appropriate colon.

If I understood correctly, notmuch returns the same "not mail" return
code both when the essential headers are missing (so the file probably
really isn't mail) and when GMime fails to parse the message. I think it
would be a good idea to give a different warning in the latter case.

If the files really are ignored because of GMime it also explains why so
many more files are ignored on my mail provider's server than on my
laptop. The server probably has an older version of GMime. I'll upgrade
and see if that makes a difference.

--
Perttu
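David's description of the mbox case (a leading "From " line plus a second "From " at the start of a later line) and his suggested rewrite to "X-Envelope-From: " can be sketched as below. The function names are hypothetical, and this stands in for the sed one-liner he alludes to rather than reproducing any tool's actual behaviour.

```python
MBOX_FROM = b"From "


def looks_like_mbox(raw: bytes) -> bool:
    """True if the data starts with 'From ' and a later line does too."""
    return raw.startswith(MBOX_FROM) and b"\n" + MBOX_FROM in raw


def rewrite_envelope_from(raw: bytes) -> bytes:
    """Replace an initial 'From ' (no colon) with 'X-Envelope-From: '.

    A real "From: " header is left alone because its fifth byte is a
    colon, not a space.
    """
    if raw.startswith(MBOX_FROM):
        return b"X-Envelope-From: " + raw[len(MBOX_FROM):]
    return raw
```

Only the first line is touched; a second "From " further down is deliberately left as-is so that a file that really is a multi-message mbox is not silently mangled.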
How to debug 'ignoring non-mail file' issues
On 2014-08-31 09:46:12, David Bremner wrote:
> Perttu Luukko writes:
>
> > I understand that the list of non-mail files is stored in the
> > notmuch database and the files are completely ignored from there on.
> > This actually makes it harder to debug these kinds of issues since
> > the list of ignored mails is only visible on the first invocation of
> > 'notmuch new', unless the files are moved around. Is there some way
> > to extract the list of ignored files from the database for
> > inspection? Maybe 'notmuch new' could have some kind of
> > --unignore-non-mail switch that would reconsider previously ignored
> > files.
>
> I _think_ it should suffice to do something like
>
>     find Maildir -type d -exec touch {} \;
>
> to force a rescan

Yes, that indeed works. I'll probably move these ignored files to a
separate folder for inspection.

--
Perttu
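For anyone who prefers a script to the find/touch one-liner, a rough Python equivalent is sketched below. It only updates directory timestamps; that this triggers a rescan is what the thread reports, not something the sketch verifies. touch_directories is a made-up name.

```python
import os


def touch_directories(root: str) -> int:
    """Set atime and mtime of root and every directory below it to now.

    Mirrors `find root -type d -exec touch {} \\;`. os.walk does not
    follow symlinked directories by default, and neither does find.
    Returns the number of directories touched.
    """
    count = 0
    for dirpath, _dirnames, _filenames in os.walk(root):
        os.utime(dirpath)  # times=None means "use the current time"
        count += 1
    return count
```

Calling touch_directories(os.path.expanduser("~/Maildir")) and then running 'notmuch new' again should make the "ignoring" notes reappear, so they can be captured for inspection.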
How to debug 'ignoring non-mail file' issues
Hi,

I indexed my archive of emails from recent years with notmuch (about 10k
messages so not much). I have quite a lot of messages 'notmuch new'
ignores as non-mail files, about 1000 of them. They are not obviously
malformed, meaning that the files certainly look like emails when opened
in a text editor. I'd like to find out why these files are ignored, and
if there is something I can do to fix them. Of course I'd like to have a
complete database of my old emails, with nothing falling through the
cracks like this.

The vast majority of these ignored mails are not ignored after I
transfer them with offlineimap to another computer. I can non-ignore
these files probably by copying the renamed file back to the mail
server, so this is fixable. Offlineimap shouldn't mess with the file's
contents, so is there something that can cause notmuch to ignore a file
based on its name?

Looking at the rest of the ignored messages, most of them seem to have
very large attachments, but there are possibly others. There are only
maybe 20 of these kinds of emails so I can try to fix them manually.
Still, it would help if I knew what exactly caused notmuch to ignore the
file. I understand most of the message parsing is done with gmime. Does
gmime give any diagnostics on parse errors that could be used to give a
reason for thinking a file is not mail?

I understand that the list of non-mail files is stored in the notmuch
database and the files are completely ignored from there on. This
actually makes it harder to debug these kinds of issues since the list
of ignored mails is only visible on the first invocation of 'notmuch
new', unless the files are moved around. Is there some way to extract
the list of ignored files from the database for inspection? Maybe
'notmuch new' could have some kind of --unignore-non-mail switch that
would reconsider previously ignored files.

--
Perttu Luukko
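Since the ignored files are only reported while 'notmuch new' scans them, one workaround is to save that output and pull the paths out of it. The sketch below assumes each note contains the phrase "ignoring non-mail file:" followed by the path, which is inferred from this thread's subject line; the exact wording may differ between notmuch versions, and ignored_files is a hypothetical helper.

```python
import re

# ASSUMPTION: notes look like "... ignoring non-mail file: /path/to/file".
IGNORED_RE = re.compile(r"ignoring non-mail file:\s*(.+)$", re.IGNORECASE)


def ignored_files(output: str) -> list[str]:
    """Extract the reported paths from captured 'notmuch new' output."""
    paths = []
    for line in output.splitlines():
        match = IGNORED_RE.search(line)
        if match:
            paths.append(match.group(1).strip())
    return paths
```

The output could be captured with subprocess.run(["notmuch", "new"], capture_output=True, text=True) and both stdout and stderr passed through ignored_files, then written to a file for later inspection.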
'notmuch new' trying to read non-existing files
On 2014-08-26 13:01:22, David Bremner wrote:
> Perttu Luukko writes:
> > When I run 'notmuch new' I get:
> >
> > Found 9903 total files (that's not much mail).
> > Error reading file /home/users/(username)/Maildir/.act/.act: No such
> > file or directory
>
> I'm grasping at straws a bit, but do you by chance have some fancy
> symlinks in your Maildir?

Well this is embarrassing, but this was indeed the case. I had cleaned
the Maildir of what I thought were leftover symlinks from some time long
ago. Actually, I had a script that creates, with symlinks, a copy of my
mailbox without the dots in the directory names for use with Mutt. I had
left out the -T parameter from 'ln', so my script also created weird
symlinks at ~/Maildir, and thus resurrected the links I thought I had
cleaned up. And the links were indeed the problem.

Everything is working now. Sorry and thanks!

--
Perttu
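A quick way to check a mail store for stray symlinks like the ones described above is to walk it and report every link together with whether its target exists. This is a diagnostic sketch with a hypothetical function name; it reports links and deletes nothing.

```python
import os


def find_symlinks(root: str):
    """Yield (path, target, is_dangling) for every symlink below root.

    os.walk lists symlinks to directories in dirnames without descending
    into them, and dangling links in filenames, so both lists are checked.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # os.path.exists follows the link, so False means dangling.
                yield path, os.readlink(path), not os.path.exists(path)
```

Printing the results of find_symlinks(os.path.expanduser("~/Maildir")) would have shown an entry such as .act/.act in the situation from this thread.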
'notmuch new' trying to read non-existing files
Hi,

I decided to give notmuch a spin and installed version 0.18.1 on my mail
provider's shell server. The layout offered by my mail provider's
Dovecot is such that INBOX is stored in Maildir format at ~/Maildir and
other folders are stored as subfolders of ~/Maildir, with the name of
each directory beginning with a period. In addition, ~/Maildir contains
the files 'dovecot-uidlist' and 'dovecot-uidvalidity', and each
subdirectory contains an empty file 'maildirfolder' in addition to the
usual cur, new and tmp. I don't know if this is an unusual layout or
not.

When I run 'notmuch new' I get:

Found 9903 total files (that's not much mail).
Error reading file /home/users/(username)/Maildir/.act/.act: No such file or directory
Processed 1 file in almost no time.
Added 1 new message to the database.
Note: A fatal error was encountered: Something went wrong trying to read or write a file

The subdirectory .act is really the first (alphabetically) subdirectory
of ~/Maildir, but .act/.act does not exist and I don't know why notmuch
tries to read it. In a following run the .act subdirectory gets replaced
by .Drafts, but the error is the same. So for some reason 'notmuch new'
tries to read, for each subdirectory, a deeper subdirectory which does
not exist. Only emails in the top-level INBOX folder are added to the
database.

The same collection of email offlineimap'd to my local computer and with
a more plain layout (each folder as a subdirectory of ~/mail, no dots)
is read without problems.

What could be going wrong here? Is this a layout that should be indexed
by notmuch?

Please note that I'm not subscribed to this mailing list -- I'm not
using notmuch yet so I can't handle the volume :)

--
Perttu Luukko
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch