Re: [PATCH] Don't bother checking for mbox files
On Sun, Mar 13 2016, Jani Nikula wrote:

> On Sun, 13 Mar 2016, Edward Betts wrote:
>> Keith Packard wrote:
>>> Postfix adds mbox-style From lines when used in combination with
>>> maildrop or .forward files. If they have another line starting with
>>> 'From ' in them, notmuch complains about them not being mail files.
>>>
>>> If we assume the user hasn't screwed up and misconfigured their mail
>>> system, then we can safely ignore whether the file started with an
>>> mbox header and just parse it as a single-message file.
>>
>> I think it is fine to go ahead with this change. At the same time the
>> behaviour of Postfix should be corrected so it doesn't add mbox-style From
>> lines to mails in maildir format.
>
> I disagree with making the change (as-is, at least).
>
> In general, Notmuch does not support mboxes. We expect maildir style one
> message per file mail storage. We support single-message mboxes as a
> special case, in part because, as you note, there's plenty of other
> software that adds the mbox "From " line even though delivering to
> maildir.
>
> I think it's misleading and confusing to the users to accept and index
> the first message of mboxes, and silently ignore the rest (or worse,
> index all of the mbox and associate the text with the first message). I
> think we should reject multi-message mboxes, because we have no code to
> handle them. This patch throws away that check.
>
> Now, IIUC, the problem here is not that the files actually are
> multi-message mboxes. We could use a sample message (even a crafted one)
> that exhibits the problem, so we could add a test case, and fix Notmuch
> to deal with it gracefully (if we decide catering to potentially broken
> other software is the way to go), while retaining the code to reject
> multi-message mboxes. With the test case, we'd also avoid accidentally
> breaking this in the future.
I agree with Jani; a user may accidentally index an mbox with multiple
messages as a single message if this were merged...

We currently have a very simple check: a line starting with 'From '
separates messages (and the first line starts with 'From '). After a
quick look at these 'mbox*' "specs", this may well be within the
"standard".

In mboxviewfs I checked that there is at least one empty line before
'^From' (might not be required by the standard, but whatever ;/) and
that at least a 'Date:' header follows (needed for the file "time")...
but even this heuristic may not be enough if we wanted to go deep into
this (e.g. there are emails which quote the beginning of an mbox file;
no heuristic can match this unless there is human-level AI working on
it ;)

OTOH, presumably
https://github.com/GNOME/gmime/blob/master/tests/data/mbox/input/substring.mbox
contains 3 messages (or what??!!11) ...

Perhaps the simplest is to give users the possibility of a 'footgun'
option in notmuch new (notmuch insert probably doesn't need it ???)
which can be used to skip the 'mbox' check (I was going to suggest a
configuration option, but as we don't support that in bindings, ...).
But of course some of the simplicity is gone when one forgets to give
the --footgun option: the next notmuch new with the footgun probably
will not pick up the mail file again (or we have to hold off updating
the directory mtime indefinitely, or make other changes (i.e. more
complicated ones, which no-one reviews(*) anyway >;/)).

> BR,
> Jani.

Tomi

(*) Although when someone sends patches that are less trivial than usual
and provide significant progress in functionality, those are reviewed
promptly by a relatively good number of reviewers... One 'other
change' could be e.g. to keep a list of files that have been failing due
to this and retry those if this footgun option is given.

___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch
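For illustration, the stricter heuristic described above (an empty line
before '^From' and a 'Date:' header after it) could be sketched roughly
as below. This is not code from mboxviewfs or notmuch; the function
names line_len, has_date_header and count_separators are made up for
this sketch, it works on an in-memory NUL-terminated buffer rather than
a FILE, and it matches 'Date:' case-sensitively as a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the line starting at p, excluding
 * the trailing newline. */
static size_t
line_len (const char *p)
{
    const char *nl = strchr (p, '\n');

    return nl ? (size_t) (nl - p) : strlen (p);
}

/* Hypothetical helper: return 1 if the header block starting at p
 * (which ends at the first empty line or at the end of the buffer)
 * contains a line starting with "Date:". */
static int
has_date_header (const char *p)
{
    while (*p) {
        size_t len = line_len (p);

        if (len == 0)
            return 0;           /* empty line: headers are over */
        if (strncmp (p, "Date:", 5) == 0)
            return 1;
        p += len;
        if (*p == '\n')
            p++;
    }
    return 0;
}

/* Count lines that look like mbox separators: the line starts with
 * "From ", it is the first line or follows an empty line, and the
 * header block after it contains a "Date:" header. */
static int
count_separators (const char *buf)
{
    int count = 0;
    int prev_empty = 1;         /* start of buffer is treated like "after an empty line" */
    const char *p = buf;

    while (*p) {
        size_t len = line_len (p);
        const char *next = p + len;

        if (*next == '\n')
            next++;
        if (prev_empty && strncmp (p, "From ", 5) == 0 &&
            has_date_header (next))
            count++;
        prev_empty = (len == 0);
        p = next;
    }
    return count;
}
```

With something like this, a file would be treated as a multi-message
mbox only when count_separators returns more than 1. As noted above, a
mail that quotes the beginning of an mbox file would still fool it.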
Re: [PATCH] Don't bother checking for mbox files
On Sun, 13 Mar 2016, Edward Betts wrote:
> Keith Packard wrote:
>> Postfix adds mbox-style From lines when used in combination with
>> maildrop or .forward files. If they have another line starting with
>> 'From ' in them, notmuch complains about them not being mail files.
>>
>> If we assume the user hasn't screwed up and misconfigured their mail
>> system, then we can safely ignore whether the file started with an
>> mbox header and just parse it as a single-message file.
>
> I think it is fine to go ahead with this change. At the same time the
> behaviour of Postfix should be corrected so it doesn't add mbox-style From
> lines to mails in maildir format.

I disagree with making the change (as-is, at least).

In general, Notmuch does not support mboxes. We expect maildir style one
message per file mail storage. We support single-message mboxes as a
special case, in part because, as you note, there's plenty of other
software that adds the mbox "From " line even though delivering to
maildir.

I think it's misleading and confusing to the users to accept and index
the first message of mboxes, and silently ignore the rest (or worse,
index all of the mbox and associate the text with the first message). I
think we should reject multi-message mboxes, because we have no code to
handle them. This patch throws away that check.

Now, IIUC, the problem here is not that the files actually are
multi-message mboxes. We could use a sample message (even a crafted one)
that exhibits the problem, so we could add a test case, and fix Notmuch
to deal with it gracefully (if we decide catering to potentially broken
other software is the way to go), while retaining the code to reject
multi-message mboxes. With the test case, we'd also avoid accidentally
breaking this in the future.

BR,
Jani.
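As a rough illustration of the kind of crafted sample meant above, the
sketch below builds a single message that starts with an mbox "From "
line and has one body line that also starts with "From ". It does not
use GMime; naive_is_multi_message only approximates the existing check
under the assumption that a From-scanning parser stops at the next line
starting with "From ". The names crafted_sample, looks_like_mbox,
count_from_lines and naive_is_multi_message, and the addresses in the
sample, are hypothetical.

```c
#include <string.h>

/* Hypothetical crafted sample: one message, delivered with an mbox
 * "From " line on top, whose body has a line starting with "From ". */
static const char crafted_sample[] =
    "From sender Sun Mar 13 00:00:00 2016\n"
    "From: sender@example.invalid\n"
    "To: recipient@example.invalid\n"
    "Subject: crafted sample\n"
    "Date: Sun, 13 Mar 2016 00:00:00 +0000\n"
    "\n"
    "Hello,\n"
    "From now on this line trips the check.\n";

/* In-memory analogue of the first-five-bytes test: does the buffer
 * start with "From "? */
static int
looks_like_mbox (const char *buf)
{
    return strncmp (buf, "From ", 5) == 0;
}

/* Count every line that starts with "From " (note the space, so the
 * "From:" header does not count). */
static int
count_from_lines (const char *buf)
{
    int count = 0;
    const char *p = buf;

    while (p && *p) {
        if (strncmp (p, "From ", 5) == 0)
            count++;
        p = strchr (p, '\n');
        if (p)
            p++;
    }
    return count;
}

/* Approximation (an assumption, see above): the file is rejected when
 * it starts with "From " and a further "From " line follows. */
static int
naive_is_multi_message (const char *buf)
{
    return looks_like_mbox (buf) && count_from_lines (buf) > 1;
}
```

Under these assumptions the sample is a single message, yet
naive_is_multi_message reports it as multi-message, which is the
behaviour a test case would need to pin down.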
Re: [PATCH] Don't bother checking for mbox files
Keith Packard wrote:
> Postfix adds mbox-style From lines when used in combination with
> maildrop or .forward files. If they have another line starting with
> 'From ' in them, notmuch complains about them not being mail files.
>
> If we assume the user hasn't screwed up and misconfigured their mail
> system, then we can safely ignore whether the file started with an
> mbox header and just parse it as a single-message file.

I think it is fine to go ahead with this change. At the same time the
behaviour of Postfix should be corrected so it doesn't add mbox-style From
lines to mails in maildir format.

The same problem existed in the Debian exim4 config. I filed a bug, it was
fixed:

https://bugs.debian.org/769396

Here is a bug in maildrop suggesting that it should strip the mbox-style
From line from the top of mails:

https://bugs.debian.org/737383

-- 
Edward.
[PATCH] Don't bother checking for mbox files
Postfix adds mbox-style From lines when used in combination with
maildrop or .forward files. If they have another line starting with
'From ' in them, notmuch complains about them not being mail files.

If we assume the user hasn't screwed up and misconfigured their mail
system, then we can safely ignore whether the file started with an
mbox header and just parse it as a single-message file.

I can't see any harm in doing this; in the worst case, you'll have a
single 'message' which contains multiple emails now, while the
alternative is to miss mail entirely.

I could be convinced to add a config option to enable this behaviour,
but I'd really rather not.

Signed-off-by: Keith Packard
---
 lib/message-file.c | 29 +----------------------------
 1 file changed, 1 insertion(+), 28 deletions(-)

diff --git a/lib/message-file.c b/lib/message-file.c
index ee30520..f03e1db 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -98,22 +98,6 @@ _notmuch_message_file_close (notmuch_message_file_t *message)
         talloc_free (message);
 }
 
-static notmuch_bool_t
-_is_mbox (FILE *file)
-{
-    char from_buf[5];
-    notmuch_bool_t ret = FALSE;
-
-    /* Is this mbox? */
-    if (fread (from_buf, sizeof (from_buf), 1, file) == 1 &&
-        strncmp (from_buf, "From ", 5) == 0)
-        ret = TRUE;
-
-    rewind (file);
-
-    return ret;
-}
-
 notmuch_status_t
 _notmuch_message_file_parse (notmuch_message_file_t *message)
 {
@@ -121,13 +105,10 @@ _notmuch_message_file_parse (notmuch_message_file_t *message)
     GMimeParser *parser;
     notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;
     static int initialized = 0;
-    notmuch_bool_t is_mbox;
 
     if (message->message)
         return NOTMUCH_STATUS_SUCCESS;
 
-    is_mbox = _is_mbox (message->file);
-
     if (! initialized) {
         g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
         initialized = 1;
@@ -144,7 +125,7 @@ _notmuch_message_file_parse (notmuch_message_file_t *message)
     g_mime_stream_file_set_owner (GMIME_STREAM_FILE (stream), FALSE);
 
     parser = g_mime_parser_new_with_stream (stream);
-    g_mime_parser_set_scan_from (parser, is_mbox);
+    g_mime_parser_set_scan_from (parser, FALSE);
 
     message->message = g_mime_parser_construct_message (parser);
     if (! message->message) {
@@ -152,14 +133,6 @@ _notmuch_message_file_parse (notmuch_message_file_t *message)
         goto DONE;
     }
 
-    if (is_mbox && ! g_mime_parser_eos (parser)) {
-        /*
-         * This is a multi-message mbox. (For historical reasons, we
-         * do support single-message mboxes.)
-         */
-        status = NOTMUCH_STATUS_FILE_NOT_EMAIL;
-    }
-
   DONE:
     g_object_unref (stream);
     g_object_unref (parser);
-- 
2.7.0

-- 
-keith