Re: [PATCH 1/4] Import date/time parser from GNU coreutils
On Mon, 24 Jan 2011, Jameson Rollins wrote:
> On Sun, 23 Jan 2011 12:47:24 +0100, Michal Sojka sojk...@fel.cvut.cz wrote:
> > This function has quite a lot of dependencies. We may reduce them later
> > if it is a problem.
> > ---
> >  lib/c-ctype.c  |  398 +++
> >  lib/c-ctype.h  |  297 +
> >  lib/getdate.c  | 3497
> >  lib/getdate.h  |   22 +
> >  lib/getdate.y  | 1572 +
> >  lib/gettime.c  |   48 +
> >  lib/intprops.h |   83 ++
> >  lib/timespec.h |   39 +
> >  lib/verify.h   |  140 +++
> >  9 files changed, 6096 insertions(+), 0 deletions(-)
> >  create mode 100644 lib/c-ctype.c
> >  create mode 100644 lib/c-ctype.h
> >  create mode 100644 lib/getdate.c
> >  create mode 100644 lib/getdate.h
> >  create mode 100644 lib/getdate.y
> >  create mode 100644 lib/gettime.c
> >  create mode 100644 lib/gettime.h
> >  create mode 100644 lib/intprops.h
> >  create mode 100644 lib/timespec.h
> >  create mode 100644 lib/verify.h
>
> Hi, Michal. I don't fully understand what's going on here, but it seems
> like you're embedding code copies from somewhere else. If that's the
> case, is there a reason that we would need to do that, rather than just
> linking against an external library?

Well, if the embedded code were available in a library, it would definitely
be better to just use the library. But the above code is statically linked
into programs such as the `date` command and is not available separately.

Most of the dependencies could be eliminated, since they usually replicate
functionality that is available in a modern C library and are there only
for compatibility reasons.

On the other hand, if anybody knows a better date parser, perhaps in a
separate library, let me know.

-Michal
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch
Re: Tag timestamps and synchronization
On Mon, 24 Jan 2011, dm-list-email-notm...@scs.stanford.edu wrote:
> One of the features I would like to see from notmuch is an easier
> ability to synchronize tags across machines. At the very least, I
> would need either incremental dump and restore, or some way to
> communicate arbitrary tags to a local imap server that shares
> notmuch's maildir (much as notmuch currently syncs the standard tags),
> so that I synchronize two maildirs with a tool like offlineimap.

[...]

> In the case of dovecot, the arbitrary tag format is very simple. Each
> maildir has a file called dovecot-keywords mapping numbers 0, 1,
> ... to keywords. Then mail file names contain lower-case letters for
> the flags they are marked with--0 => a, 1 => b, etc.--allowing up to
> 26 arbitrary tags for each maildir. One could probably sync to
> dovecot's maildir format relatively easily in a script given
> incremental dump and restore of tags. Or possibly notmuch could
> natively support dovecot as one of multiple back-end tag storage
> schemes.

Hi David,

here is my idea for solving the problem of synchronizing tags and all
message metadata. The problem, it seems, is that every program uses a
different format for message metadata. Maybe it would be useful to define
a simple metadata format that could be used by multiple programs (at least
by notmuch, dovecot and maybe mutt) and base the synchronization on this
format.

Currently, I'm thinking about a separate file with the same base name as
the message, storing message metadata in the same format as message
headers, so it could look like:

  tag: inbox
  tag: notmuch
  timestamp: 2011-01-25 10:48:00 GMT
  spam: no
  ...

Then, any program could do whatever it wants with the metadata, e.g. index
it in a database.

Ideally it would work like this: Dovecot would store the metadata in a
file like the one described above. The IMAP protocol would be extended to
be able to send such metadata corresponding to a particular UID.
offlineimap would be able to retrieve (and synchronize) the metadata files
with the IMAP server, and notmuch would index the metadata much as it
indexes messages and would modify the files when it changes tags.

What do you (and others) think? Is this too wild? Too long-term?

Cheers
Michal
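A minimal sketch of how a program might read the header-style metadata file
proposed above. It assumes the file really is plain "name: value" lines as in
the example; the function name parse_metadata and the sample text are
illustrative only and are not part of notmuch, dovecot or any agreed format.
Python's standard email parser is used because the proposal reuses the
message-header syntax.

```python
from email.parser import Parser


def parse_metadata(text):
    """Parse header-style metadata into a dict mapping field -> list of values.

    Repeated fields (e.g. several "tag:" lines) are kept in order.
    """
    msg = Parser().parsestr(text, headersonly=True)
    fields = {}
    for name, value in msg.items():
        fields.setdefault(name.lower(), []).append(value.strip())
    return fields


# Hypothetical sidecar file contents, copied from the example in the message.
SAMPLE = """\
tag: inbox
tag: notmuch
timestamp: 2011-01-25 10:48:00 GMT
spam: no
"""

metadata = parse_metadata(SAMPLE)
print(metadata["tag"])
```

Keeping every field as a list avoids special-casing "tag", which is the only
field in the example that repeats.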
Re: Strange match to my query
On Tue, 25 Jan 2011 19:51:14 -0500, Austin Clements amdra...@gmail.com wrote:
> Well-constructed test message. Xapian's query parser is actually doing the
> right thing [1] and this is a bug in the way notmuch indexes address list
> headers. For each address, _notmuch_message_gen_terms resets the term
> generator's term position, so your To header indexes with positions as
>
> c:1 hello:2 com:3 K:1 R:2 world:3 com:4

Thanks, Austin!

I was actually giving a demo of notmuch to someone yesterday who was
really interested in the details of how Xapian actually stores things. I
dug around a bit with delve and we were both really surprised by the
position results we were seeing. Neither of us could make any sense of
them at all.

And thanks, Mark, for the bug report and the nice test case. I'll add this
to the test suite, and fix it.

And that will give us yet one more reason for all of us to rebuild our
databases after the upcoming release.

-Carl

-- 
carl.d.wo...@intel.com
fix notmuch.vim NM_compose_get_user_email() (Patch)
Here's a bitty patch to the vim plugin; it now calculates the primary email
of the user based on a call to notmuch config. There's still a lot of work
that needs to get done on notmuch.vim, e.g., the ability to have multiple
emails/accounts.

Best,
Peter

--- notmuch.vim 2010-11-18 17:26:14.0 -0500
+++ notmuch.vim.mine 2011-01-25 23:57:50.0 -0500
@@ -18,7 +18,8 @@
 " along with Notmuch. If not, see http://www.gnu.org/licenses/.
 " Authors: Bart Trojanowski b...@jukie.net
-
+" Contributors: Peter Hartman peterjohnhart...@gmail.com
+
 " --- configuration defaults {{{1
 let s:notmuch_defaults = {
@@ -1024,11 +1025,9 @@
 " --- --- compose screen helper functions {{{2
 function! s:NM_compose_get_user_email()
-    let name = substitute(system('id -u -n'), '\v(^\s*|\s*$|\n)', '', 'g')
-    let fqdn = substitute(system('hostname -f'), '\v(^\s*|\s*$|\n)', '', 'g')
-
-    " TODO: do this properly
-    return name . '@' . fqdn
+    " TODO: do this properly (still), i.e., allow for multiple email accounts
+    let email = substitute(system('notmuch config get user.primary_email'), '\v(^\s*|\s*$|\n)', '', 'g')
+    return email
 endfunction
 function! s:NM_compose_find_line_match(start, pattern, failure)

-- 
sic dicit magister P
PhD Candidate
Collaborative Programme in Ancient and Medieval Philosophy
University of Toronto
http://individual.utoronto.ca/peterjh
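For readers who do not use Vim, the same lookup can be sketched in Python.
This is only an illustration of what the patched function does (run
`notmuch config get user.primary_email` and strip surrounding whitespace);
the function name primary_email and the injectable `run` parameter are
assumptions made here so the sketch can be exercised without notmuch
installed.

```python
import subprocess


def primary_email(run=subprocess.check_output):
    """Return the primary address reported by `notmuch config`.

    `run` is any callable that takes an argument list and returns the
    command's output; it defaults to subprocess.check_output.
    """
    out = run(["notmuch", "config", "get", "user.primary_email"])
    if isinstance(out, bytes):
        out = out.decode("utf-8", errors="replace")
    # Mirror the Vim substitute(): drop leading/trailing whitespace and newlines.
    return out.strip()
```

As in the patch, this returns a single address; handling multiple accounts
would need a different configuration key or a list-valued setting.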
Tag timestamps and synchronization
dm-list-email-notmuch@scs.stanford.edu(dm-list-email-notmuch@scs.stanford.edu)@240111-11:10:
> One of the features I would like to see from notmuch is an easier
> ability to synchronize tags across machines. At the very least, I
> would need either incremental dump and restore, or some way to
> communicate arbitrary tags to a local imap server that shares
> notmuch's maildir (much as notmuch currently syncs the standard tags),
> so that I synchronize two maildirs with a tool like offlineimap.

David, I do something like this by using some shell scripts with formail,
to 'store' notmuch tags into the X-Label headers of the individual mails.
Offlineimap then syncs these headers. If I need the tags to become
notmuch-ified on the target, I just scan all the mail's X-Label headers.

(Actually it's better than this, since I use maildrop to set notmuch tags
with notmuch-deliver, *and* set X-Label headers to the same thing, at mail
delivery time. Then I use keybindings and shell scripts in mutt such that
whenever I retag a message, it is pushed to both notmuch and X-Label.)

I'm happy to share this hack glue if it would help.

This is not great for a few reasons - there are a ton of moving parts, and
some double-work. If notmuch could index X-Label headers (a coming feature
I hear) then this would be much cleaner.

This is just one way of doing it, that works for me...

Tim

> As Carl pointed out to me in private email, there has been some
> previous discussion in the following thread:
>
>     notmuch show id:87hbfnmiux.fsf@yoom.home.cworth.org
>
> Based on that thread, there seems to be some desire for notmuch to
> keep track of a per-message timestamp when the flags were last
> updated. This would allow much easier expiration for people who want
> the deleted tag. It would also allow incremental dump and restore of
> tags, which is exactly what I need to sync tags across servers with
> reasonable amounts of bandwidth.
>
> Metadata timestamps are one of those things that probably have a lot
> of different applications, so since Carl is considering a new database
> format for the next release anyway, perhaps it also makes sense to add
> a metadata change time for each message.
>
> The timestamp would be included in "dump" output, and you could
> request a dump of changes since a particular time. On restore, you
> might have several options:
>
>   - overwrite: always set the new tags and timestamp in the database
>     to the value in the restore data.
>
>   - update: always set the tags, but update the timestamp to the
>     current time.
>
>   - conditional T: update only if the message metadata has not been
>     updated since time T.
>
> To sync flags, then you just need to keep track of the last time you
> synced with a particular server--call this time T. Do a dump since
> time T, upload to server, do a conditional restore for time T on
> server. Finally do a partial dump from time T on the server and an
> overwrite import on the client. (This policy makes changes on the
> server always override conflicting ones on the client--perhaps people
> want other policies, like union of the tags, etc.)
>
>
> Second, there seems to be some desire in that thread to sync with IMAP
> flags. This would be particularly great, but the easiest way to do it
> is probably *not* to try to implement IMAP, but rather to use an
> existing IMAP server and just modify the maildir so that the IMAP
> server will pick up the flags.
>
> In the case of dovecot, the arbitrary tag format is very simple. Each
> maildir has a file called dovecot-keywords mapping numbers 0, 1,
> ... to keywords. Then mail file names contain lower-case letters for
> the flags they are marked with--0 => a, 1 => b, etc.--allowing up to
> 26 arbitrary tags for each maildir. One could probably sync to
> dovecot's maildir format relatively easily in a script given
> incremental dump and restore of tags. Or possibly notmuch could
> natively support dovecot as one of multiple back-end tag storage
> schemes.
>
> Having a static tag mapping in the .notmuch-config file would be much
> better than hard-coding flag2tag. However, I'm not sure it's
> sufficient. The reason is that if you ever completely delete a tag
> (e.g., you have "todo" and "meeting" tags and periodically have no
> messages in either category in a given mail folder), then an IMAP
> server like dovecot might end up re-allocating the letters
> corresponding to those tags in a different order. Also, at least for
> dovecot, the flag mappings are per-folder, which you kind of want
> since you are limited to 26 non-standard tags, so global values might
> not work.
>
> I'm curious to hear people's thoughts/reactions?
>
> David

-- 
Tim Stoakes
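The quoted restore modes and sync procedure can be sketched as a small
in-memory model. Nothing here is notmuch's actual dump/restore interface:
the Entry class, the function names, the dict-based "databases" and the
integer timestamps are all assumptions chosen to make the three modes
(overwrite, update, conditional T) and the four sync steps concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    tags: set = field(default_factory=set)
    mtime: int = 0  # time of the last metadata change


def dump_since(db, since):
    """Return {msgid: (tags, mtime)} for entries changed after `since`."""
    return {m: (set(e.tags), e.mtime) for m, e in db.items() if e.mtime > since}


def restore(db, dump, mode, now, since=None):
    """Apply a dump to db using one of the three proposed restore modes."""
    for msgid, (tags, mtime) in dump.items():
        entry = db.setdefault(msgid, Entry())
        if mode == "overwrite":
            # Take both tags and timestamp from the restore data.
            entry.tags, entry.mtime = set(tags), mtime
        elif mode == "update":
            # Take the tags, but stamp the change with the current time.
            entry.tags, entry.mtime = set(tags), now
        elif mode == "conditional":
            if since is None:
                raise ValueError("conditional restore needs a time T")
            # Only touch entries that have not changed since time T.
            if entry.mtime <= since:
                entry.tags, entry.mtime = set(tags), now
        else:
            raise ValueError("unknown mode: %r" % mode)


def sync(client, server, last_sync, now):
    """Client changes go up conditionally; server state comes back and wins."""
    restore(server, dump_since(client, last_sync), "conditional", now, since=last_sync)
    restore(client, dump_since(server, last_sync), "overwrite", now)


client = {"m1": Entry({"todo"}, 120), "m2": Entry({"work"}, 110)}
server = {"m1": Entry({"inbox"}, 50), "m2": Entry({"spam"}, 130)}
sync(client, server, last_sync=100, now=200)
print(sorted(client["m1"].tags), sorted(client["m2"].tags))
```

In the demo, m1 changed only on the client, so the change reaches the
server; m2 changed on both sides, so the conditional restore skips it and
the server's tags override the client's, which is the policy described in
the message.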
[PATCH] Add --include-duplicates option to a couple of commands.
This adds new functionality under the names of:

	notmuch search --output=files --include-duplicates
	notmuch show --include-duplicates
	notmuch show --format=json --include-duplicates

These new commands behave similarly to the existing commands without
the --include-duplicates argument. The difference is that with the new
argument any duplicate mail files will be included in the output.

Here, files are considered duplicates if they contain identical
contents for the Message-Id header, (regardless of any other
differences in the content of the file). Without the
--include-duplicates argument, these commands would emit a single,
arbitrary file in the face of duplicates.

WARNING: This commit is not yet ready to be pushed to the notmuch
repository. There are at least two problems with the commit so far:

  1. Nothing has been documented yet.

     Fixing this shouldn't be too hard. It's mostly just taking the
     text from above and shoving it into the documentation. I can do
     this easily enough myself.

  2. show --format=json --include-duplicates doesn't work yet

     This is a more serious problem. I believe the JSON output with
     this patch is not correct and will likely break a client trying
     to consume it. It inserts the duplicate message into an array
     next to the existing message. Our current JSON schema isn't
     documented formally that I could find, except for a comment in
     the emacs code that consumes it:

	A thread is a forest or list of trees. A tree is a two
	element list where the first element is a message, and
	the second element is a possibly empty forest of
	replies.

     I believe this commit breaks the "two-element list"
     expectation. What we would want instead is the duplicate message
     to appear as a peer next to the original message, (and then
     perhaps have replies appear only to one of the messages).

My current need for --include-duplicates was recently satisfied, so I
won't likely pursue this further for now. But I wanted to put this
code out rather than losing it. If someone wants to fix the patch to
do the "right thing" with the JSON output, then that would be great.

ALSO NOTE: I left the
json.expected-output/notmuch-show-thread-format-json-maildir-storage
out of this commit. It has lines in it that are too long to be sent
via git-send-email.
---
 notmuch-search.c                                   |   30 +-
 notmuch-show.c                                     |   61 +--
 test/basic                                         |    2 +-
 test/json                                          |   33 ++-
 ...-show-thread-include-duplicates-maildir-storage |   94
 .../notmuch-show-thread-maildir-storage            |   47
 test/search-output                                 |  113
 7 files changed, 361 insertions(+), 19 deletions(-)
 create mode 100644 test/json.expected-output/notmuch-show-thread-include-duplicates-maildir-storage
 create mode 100644 test/json.expected-output/notmuch-show-thread-maildir-storage

diff --git a/notmuch-search.c b/notmuch-search.c
index c628b36..6d032c2 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -247,7 +247,8 @@ static int
 do_search_messages (const void *ctx,
                     const search_format_t *format,
                     notmuch_query_t *query,
-                    output_t output)
+                    output_t output,
+                    notmuch_bool_t include_duplicates)
 {
     notmuch_message_t *message;
     notmuch_messages_t *messages;
@@ -269,8 +270,25 @@ do_search_messages (const void *ctx,
             fputs (format->item_sep, stdout);

         if (output == OUTPUT_FILES) {
-            format->item_id (ctx, "",
-                             notmuch_message_get_filename (message));
+            if (include_duplicates) {
+                notmuch_filenames_t *filenames;
+                int first_filename = 1;
+
+                for (filenames = notmuch_message_get_filenames (message);
+                     notmuch_filenames_valid (filenames);
+                     notmuch_filenames_move_to_next (filenames))
+                {
+                    if (! first_filename)
+                        fputs (format->item_sep, stdout);
+                    first_filename = 0;
+
+                    format->item_id (ctx, "",
+                                     notmuch_filenames_get (filenames));
+                }
+            } else {
+                format->item_id (ctx, "",
+                                 notmuch_message_get_filename (message));
+            }
         } else { /* output == OUTPUT_MESSAGES */
             format->item_id (ctx, "id:",
                              notmuch_message_get_message_id (message));
@@ -352,6 +370,7 @@ notmuch_search_command (void
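The "two-element list" expectation quoted from the emacs comment can be
written down as a small checker. This is a sketch of that one sentence
only, not of notmuch's real JSON output: messages are assumed here to be
JSON objects (Python dicts), and the "id" values are made up.

```python
def is_forest(node):
    """A forest is a (possibly empty) list of trees."""
    return isinstance(node, list) and all(is_tree(t) for t in node)


def is_tree(node):
    """A tree is a two-element list: [message, forest-of-replies]."""
    return (
        isinstance(node, list)
        and len(node) == 2
        and isinstance(node[0], dict)  # assumption: a message is a JSON object
        and is_forest(node[1])
    )


# One message with one reply: satisfies the schema.
valid = [[{"id": "a"}, [[{"id": "b"}, []]]]]

# Duplicate placed inside the same array as the original message:
# the tree now has three elements, which is the breakage described above.
broken = [[{"id": "a"}, {"id": "a-dup"}, []]]

# Duplicate as a peer tree next to the original: still a valid forest.
peer = [[{"id": "a"}, []], [{"id": "a-dup"}, []]]

print(is_forest(valid), is_forest(broken), is_forest(peer))
```

A check like this could serve as a quick sanity test for any future JSON
layout of duplicates, since the peer form keeps existing consumers working.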
[PATCH 1/3] new: Do not defer maildir flag synchronization during the first run
* flags immediately, while the message is hot in
> +* disk cache. */
> + notmuch_message_maildir_flags_to_tags (message);
> + }
> + }
> break;
> case NOTMUCH_STATUS_FILE_NOT_EMAIL:
> fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
> --
> 1.7.2.3
Strange match to my query
Hi guys, What's up? ("Notmuch")

Apparently matching on email addresses doesn't work the way I hoped.

While debugging why my to:x@y.com search was matching far too many
entries, I whittled it down to this:

WORD1=hello
WORD2=goodbye
MSGID=junk$(date +%s)
TESTDIR=$(notmuch config get database.path)/.tmp/new
TESTMAIL=$TESTDIR/$MSGID:2,

mkdir -p $TESTDIR

echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com

echo "From: nobody@nobody.com
To: c@${WORD1}.com, K-R@${WORD2}.com
Date: Mon, 24 Jan 2011 23:41:34 -0600
Subject: Error
Message-ID: <$MSGID@junk.com>

Not empty body.=

" > $TESTMAIL

notmuch new
notmuch search --output=files to:$WORD1@$WORD2
notmuch search --output=files to:\"$WORD1@$WORD2\"

Why does that match, but this doesn't?

notmuch search --output=files to:\'$WORD1@$WORD2\'

Apparently single quotes are the only quote for Xapian's parser?

I guess this is a strong vote for the quick integration of the custom
parser with optimization passes that turn emails into phrases that can't
match across multiple emails.

This was just an egregious example of notmuch giving me notmuch of what
I wanted, or actually, far too much of what I didn't want.

Thanks,
-Mark
Strange match to my query
Well-constructed test message. Xapian's query parser is actually doing the
right thing [1] and this is a bug in the way notmuch indexes address list
headers. For each address, _notmuch_message_gen_terms resets the term
generator's term position, so your To header indexes with positions as

c:1 hello:2 com:3 K:1 R:2 world:3 com:4

Thus, the phrase query "hello world" matches hello in position 2 and world
in position 3. Probably the right thing for notmuch to do is to jump up
the term generator position between each address so phrase queries don't
cross them or span them.

[1] Your to:\'$WORD1@$WORD2\' query didn't work because Xapian doesn't
accept a single quote after a prefix.

On Tue, Jan 25, 2011 at 6:29 PM, Mark Anderson wrote:
> Hi guys, What's up? ("Notmuch")
>
> Apparently matching on email addresses doesn't work the way I hoped.
>
> While debugging why my to:x@y.com search was matching far too many
> entries, I whittled it down to this:
>
> WORD1=hello
> WORD2=goodbye
> MSGID=junk$(date +%s)
> TESTDIR=$(notmuch config get database.path)/.tmp/new
> TESTMAIL=$TESTDIR/$MSGID:2,
>
> mkdir -p $TESTDIR
>
> echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com
>
> echo "From: nobody@nobody.com
> To: c@${WORD1}.com, K-R@${WORD2}.com
> Date: Mon, 24 Jan 2011 23:41:34 -0600
> Subject: Error
> Message-ID: <$MSGID@junk.com>
>
> Not empty body.=
>
> " > $TESTMAIL
>
> notmuch new
> notmuch search --output=files to:$WORD1@$WORD2
> notmuch search --output=files to:\"$WORD1@$WORD2\"
>
> Why does that match, but this doesn't?
>
> notmuch search --output=files to:\'$WORD1@$WORD2\'
>
> Apparently single quotes are the only quote for Xapian's parser?
>
> I guess this is a strong vote for the quick integration of the custom
> parser with optimization passes that turn emails into phrases that can't
> match across multiple emails.
>
> This was just an egregious example of notmuch giving me notmuch of what
> I wanted, or actually, far too much of what I didn't want.
>
> Thanks,
> -Mark
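The position reset described above can be reproduced with a toy index. This
is a plain-Python model, not Xapian's term generator: the tokenizer (runs of
letters and digits), the function names and the gap of 100 positions are
assumptions made for illustration. It shows why a phrase query can match
across two addresses when positions restart for each one, and why leaving a
gap between addresses prevents that.

```python
import re
from collections import defaultdict


def index_addresses(addresses, reset_per_address, gap=100):
    """Return {term: set(positions)} for a list of address strings."""
    positions = defaultdict(set)
    pos = 0
    for address in addresses:
        if reset_per_address:
            pos = 0  # the behaviour described as the bug
        for term in re.findall(r"[A-Za-z0-9]+", address):
            pos += 1
            positions[term].add(pos)
        pos += gap  # only has an effect when positions are not reset
    return positions


def phrase_match(positions, words):
    """True if the words occur at consecutive positions, in order."""
    return any(
        all((start + i) in positions.get(word, ()) for i, word in enumerate(words))
        for start in positions.get(words[0], ())
    )


to_header = ["c@hello.com", "K-R@world.com"]

buggy = index_addresses(to_header, reset_per_address=True)
fixed = index_addresses(to_header, reset_per_address=False)

print(phrase_match(buggy, ["hello", "world"]))
print(phrase_match(fixed, ["hello", "world"]))
```

With the reset, hello lands at position 2 and world at position 3, so the
phrase matches even though the words come from different addresses; with a
gap between addresses, phrases inside one address still match while phrases
spanning two addresses do not.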