[notmuch] Rather simple optimization for notmuch tag
I was updating my poll script that tags messages, and a common idiom is
to put

    tag +mytag and not tag:mytag

I don't know anything about efficiency, but for the simple single-tag
case, couldn't we imply the "and not tag:mytag" from the +mytag action
list for the tag command?

The similar (dual?, rusty math terminology, beware of Math-tetanus) case
of "tag -mytag and tag:mytag" could be similarly optimized, since the
tag removal action ought to be a null action in the case that the search
terms matched on a thread or message, but the tag to be removed isn't
attached to the message/thread returned.

Any thoughts on the subject?

-Mark
[notmuch] automatically assigning tags to new messages?
Dear notmuch crowd,

I heard about notmuch mail a few days ago and I started playing with
it. So far, it makes me very happy, but there are some things that I
need to learn how to do. I'll start with the most important one:
tagging incoming messages automatically.

What is the recommended way of achieving this? There is a variety of
things that I would like to automatically do to incoming mail:

- if it comes from a mailing list (e.g. notmuch), tag it +notmuch and
  +unread, but not +inbox
- if it's one of the numerous pointless weekly newsletters that I'm
  getting from my university, tag it +unimelb, but not +unread or
  +inbox
- if it is coming from me, tag it +sent, but not +unread or +inbox

There's probably more, but this is a good start. Any advice?

[My current email setup is as follows: get messages from the Gmail
account using fetchmail; run procmail with a bunch of recipes that put
the messages into various maildirs; read the results with mutt. (Oh,
and send mail with msmtp from mutt.)]

Best,
Alex

--
Alex Ghitza -- Lecturer in Mathematics -- The University of Melbourne -- Australia -- http://www.ms.unimelb.edu.au/~aghitza/
[notmuch] automatically assigning tags to new messages?
On Fri, 18 Dec 2009 22:21:54 +1100, Alex Ghitza wrote:
> I heard about notmuch mail a few days ago and I started playing with
> it. So far, it makes me very happy, but there are some things that I
> need to learn how to do. I'll start with the most important one:
> tagging incoming messages automatically.

I've got a script somewhere which I invoke after each mail sync. I'll
give some examples from that for your list below.

> What is the recommended way of achieving this? There is a variety of
> things that I would like to automatically do to incoming mail:
> - if it comes from a mailing list (e.g. notmuch), tag it +notmuch and
> +unread, but not +inbox

    notmuch tag +list +notmuch -inbox to:notmuch at notmuchmail.org and not tag:notmuch and tag:inbox

> - if it's one of the numerous pointless weekly newsletters that I'm
> getting from my university, tag it +unimelb, but not +unread or
> +inbox

    notmuch tag +unimelb -unread -inbox to:foo and not tag:unimelb and tag:unread and tag:inbox

> - if it is coming from me, tag it +sent, but not +unread or +inbox

Not quite sure. Currently I'm not doing this, and I don't know if this
is possible within a single incantation of notmuch-tag. I think you
probably need a first search to get message ids, and then tag only those
message ids (doing it like the others above would tag all messages in
the thread with sent, which is probably not what you want).

--
- Marten
[notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
From: Scott Robinson

In the case of notmuch-show, "--output=json" also implies
"--entire-thread" as the thread structure is implicit in the emitted
document tree.

As a coincidence to the implementation, multipart message ID numbers are
now incremented with each part printed. This changes the previous
semantics, which were unclear and not necessarily related to the actual
ordering of the message parts.

Edited-By: David Bremner
Reviewed-By: David Bremner
---

It took me a little work to apply Scott's patch, so rather than asking
him to resend it from git-send-email, I am just sending. I hope no-one
is offended (much). Other than manually extracting the patch from the
output of notmuch show (for me the message arrived base64 encoded), I
deleted trailing whitespace on line 465.

It compiles, it doesn't seem to screw up the original output, and at
least in a few tests, it generates parseable json. Yay!

I'm thinking that the patch I sent out last night to only dump message
ids could be reworked to use the framework of this patch. I also
think it would be reasonably simple to add an --output=mbox option,
for archiving and so on.

 Makefile.local   |3 +-
 json.c           | 73 ++
 notmuch-client.h |3 +
 notmuch-search.c | 163 +---
 notmuch-show.c   | 275 ++
 notmuch.c        | 24 --
 show-message.c   |4 +-
 7 files changed, 481 insertions(+), 64 deletions(-)
 create mode 100644 json.c

diff --git a/Makefile.local b/Makefile.local
index 933ff4c..53b474b 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -18,7 +18,8 @@ notmuch_client_srcs = \
     notmuch-tag.c \
     notmuch-time.c \
     query-string.c \
-    show-message.c
+    show-message.c \
+    json.c

 notmuch_client_modules = $(notmuch_client_srcs:.c=.o)
 notmuch: $(notmuch_client_modules) lib/notmuch.a
diff --git a/json.c b/json.c
new file mode 100644
index 000..ee563d6
--- /dev/null
+++ b/json.c
@@ -0,0 +1,73 @@
+/* notmuch - Not much of an email program, (just index and search)
+ *
+ * Copyright © 2009 Carl Worth
+ * Copyright © 2009 Keith Packard
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see http://www.gnu.org/licenses/ .
+ *
+ * Authors: Carl Worth
+ *          Keith Packard
+ */
+
+#include "notmuch-client.h"
+
+/*
+ * json_quote_str derived from cJSON's print_string_ptr,
+ * Copyright (c) 2009 Dave Gamble
+ */
+
+char *
+json_quote_str(const void *ctx, const char *str)
+{
+    const char *ptr;
+    char *ptr2;
+    char *out;
+    int len = 0;
+
+    if (!str)
+        return NULL;
+
+    for (ptr = str; *ptr; len++, ptr++) {
+        if (*ptr < 32 || *ptr == '\"' || *ptr == '\\')
+            len++;
+    }
+
+    out = talloc_array (ctx, char, len + 3);
+
+    ptr = str;
+    ptr2 = out;
+
+    *ptr2++ = '\"';
+    while (*ptr) {
+        if (*ptr > 31 && *ptr != '\"' && *ptr != '\\') {
+            *ptr2++ = *ptr++;
+        } else {
+            *ptr2++ = '\\';
+            switch (*ptr++) {
+                case '\"': *ptr2++ = '\"'; break;
+                case '\\': *ptr2++ = '\\'; break;
+                case '\b': *ptr2++ = 'b'; break;
+                case '\f': *ptr2++ = 'f'; break;
+                case '\n': *ptr2++ = 'n'; break;
+                case '\r': *ptr2++ = 'r'; break;
+                case '\t': *ptr2++ = 't'; break;
+                default: ptr2--;break;
+            }
+        }
+    }
+    *ptr2++ = '\"';
+    *ptr2++ = '\0';
+
+    return out;
+}
diff --git a/notmuch-client.h b/notmuch-client.h
index 50a30fe..7b844b9 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -143,6 +143,9 @@ notmuch_status_t
 show_message_body (const char *filename,
                    void (*show_part) (GMimeObject *part,
                                       int *part_count));

+char *
+json_quote_str (const void *ctx, const char *str);
+
 /* notmuch-config.c */
 typedef struct _notmuch_config notmuch_config_t;
diff --git a/notmuch-search.c b/notmuch-search.c
index dc44eb6..e243747 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -20,8 +20,120 @@

 #include "notmuch-client.h"

+typedef struct search_format {
+    const char *results_start;
+    const char *thread_start;
+    void (*thread) (const void *ctx,
+                    const char *id,
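For readers following the quoting logic in json.c, here is a minimal standalone sketch of JSON string quoting in plain C. It is not the code from the patch: the name `json_quote` is hypothetical, it uses malloc rather than talloc, and (as an assumption about the desired behaviour) it emits `\u00XX` for control characters that have no short escape and passes bytes of 0x80 and above through unchanged, on the assumption that the input is already valid UTF-8.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of notmuch: returns a newly malloc'ed,
 * double-quoted JSON string, or NULL on NULL input or allocation failure.
 * The caller frees the result. */
char *
json_quote (const char *str)
{
    const unsigned char *p;
    char *out, *q;
    size_t len = 2;                     /* opening and closing quote */

    if (str == NULL)
        return NULL;

    /* First pass: compute the escaped length. */
    for (p = (const unsigned char *) str; *p; p++) {
        if (*p == '"' || *p == '\\' || *p == '\b' || *p == '\f' ||
            *p == '\n' || *p == '\r' || *p == '\t')
            len += 2;                   /* backslash plus one character */
        else if (*p < 0x20)
            len += 6;                   /* \u00XX */
        else
            len += 1;
    }

    out = malloc (len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, escaping as needed. */
    q = out;
    *q++ = '"';
    for (p = (const unsigned char *) str; *p; p++) {
        switch (*p) {
        case '"':  *q++ = '\\'; *q++ = '"';  break;
        case '\\': *q++ = '\\'; *q++ = '\\'; break;
        case '\b': *q++ = '\\'; *q++ = 'b';  break;
        case '\f': *q++ = '\\'; *q++ = 'f';  break;
        case '\n': *q++ = '\\'; *q++ = 'n';  break;
        case '\r': *q++ = '\\'; *q++ = 'r';  break;
        case '\t': *q++ = '\\'; *q++ = 't';  break;
        default:
            if (*p < 0x20)
                q += sprintf (q, "\\u%04x", (unsigned int) *p);
            else
                *q++ = (char) *p;       /* includes UTF-8 bytes >= 0x80 */
        }
    }
    *q++ = '"';
    *q = '\0';

    return out;
}
```

Working on `unsigned char` avoids treating non-ASCII bytes as negative values, which matters on platforms where plain `char` is signed.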
[notmuch] First attempt to add smart completion in notmuch-search
Hi!

Here is a first attempt to add "smart completion" to notmuch-search.
What it does:

- It tries to detect when you hit tab to complete a search term prefix
  (like to, from, subject...), and if so completes your prefix;
- If you try to complete a search term using a specific prefix (like
  to), it helps you complete it using values for that prefix found in
  the messages matching the current search string.

The patch has an Emacs Lisp part and a C part. I'm submitting the patch
to know what people think of it, but some things are still incomplete:

- I think headers should be parsed so that a better output is proposed,
  but I don't know which one yet;
- There may be performance issues (because fetching headers requires
  looking them up in each message file), but I'm not sure if this is
  really a problem (since we already fetch headers to display them in
  the summary buffer).
- I have a problem that notmuch-search-query-string is buffer-local and
  I can't access it from the minibuffer, which I don't know how to
  solve properly
- There are still some glitches with the "".
- There may be many other improvements that I haven't thought of.

Anyway, I hope it's useful; I'll continue working on it when I come
back from holidays.

Thanks for your input,
Matthieu
[notmuch] [PATCH] JSON output for notmuch-search and notmuch-show.
On Thu, 17 Dec 2009 21:33:54 -0800, Scott Robinson wrote:
> I took an earlier suggestion and didn't use cJSON, instead writing custom code
> for emitting the new format.

Nice! I have a few comments below.

> Added an "--output=(json|text|)" command-line option to both
> notmuch-search and notmuch-show.

I don't know why, but I think I'd prefer --format for the name here.

> In the case of notmuch-show, "--output=json" also implies
> "--entire-thread" as the thread structure is implicit in the emitted
> document tree.

It looks like the new documentation is missing that point, (and the man
page in notmuch.1 is missing an update as well).

> As a coincidence to the implementation, multipart message ID numbers are
> now incremented with each part printed. This changes the previous
> semantics, which were unclear and not necessarily related to the actual
> ordering of the message parts.

That's just fine. The old numbering semantics were quite bizarre and
nothing I wanted to set in stone.

-Carl
[notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
On Fri, 18 Dec 2009 08:59:55 -0400, david at tethera.net wrote:
> It took me a little work to apply Scott's patch, so rather than asking
> him to resend it from git-send-email, I am just sending. I hope no-one
> is offended (much).

I think that's great! Collaboration is what this is all about.

> I'm thinking that the patch I sent out last night to only dump message
> ids could be reworked to use the framework of this patch. I also
> think it would be reasonably simple to add an --output=mbox option,
> for archiving and so on.

I think that selecting *what* to emit is orthogonal from selecting *how*
to format that output. See some ideas in the TODO file, (where I
proposed --for and --format options for these). Having a way to do mbox
output for export would indeed be very nice.

-Carl
[notmuch] Rather simple optimization for notmuch tag
On Fri, 18 Dec 2009 00:49:00 -0700, Mark Anderson wrote:
> I was updating my poll script that tags messages, and a common idiom is
> to put
> tag +mytag and not tag:mytag
>
> I don't know anything about efficiency, but for the simple single-tag
> case, couldn't we imply the "and not tag:mytag" from the +mytag action
> list for the tag command?

On one level, it really shouldn't be a performance issue to tag messages
that already have a particular tag. (And in fact, the recently proposed
patches to fix Xapian defect 250 even address this I think.)

In the meantime, it is fairly annoying to have to type this, and yes,
the tag command could infer that and append it to the search string
automatically. That's a good idea, really.

> The similar (dual?, rusty math terminology, beware of Math-tetanus) case
> of "tag -mytag and tag:mytag" could be similarly optimized,
> since the tag removal action ought to be a null action in the case that
> the search terms matched on a thread or message, but the tag to be
> removed isn't attached to the message/thread returned.

Yes, that would work too.

One potential snag with both ideas is that the "notmuch tag"
command-line as currently implemented allows for multiple tag additions
and removals with a single search. So the optimization here couldn't be
used unless there was just a single tag action.

So that's another reason to really just want the lower-level
optimization to be in place.

-Carl
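To make the idea discussed in this message concrete, here is a rough sketch in C of how a tag command might append the implied term to the search string. It is only an illustration: `augment_tag_query` is a hypothetical name, not a function in notmuch, and it follows the restriction raised in the message by doing nothing unless there is exactly one tag action. Wrapping the user's terms in parentheses is an assumption made so that the appended term applies to the whole search.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given tag actions such as "+mytag" or "-mytag"
 * and the user's search terms, return a newly malloc'ed query string.
 * With exactly one well-formed action, a filter is appended so that
 * messages which would not change are skipped; otherwise the terms are
 * returned unchanged. The caller frees the result. */
char *
augment_tag_query (const char *const *actions, size_t n_actions,
                   const char *terms)
{
    const char *filter = NULL;
    char *out;
    int len;

    if (n_actions == 1) {
        if (actions[0][0] == '+' && actions[0][1] != '\0')
            filter = "and not tag:";    /* adding: skip already-tagged */
        else if (actions[0][0] == '-' && actions[0][1] != '\0')
            filter = "and tag:";        /* removing: only tagged ones */
    }

    if (filter == NULL) {
        out = malloc (strlen (terms) + 1);
        if (out != NULL)
            strcpy (out, terms);
        return out;
    }

    /* Measure first, then format into an exactly sized buffer. */
    len = snprintf (NULL, 0, "( %s ) %s%s", terms, filter, actions[0] + 1);
    if (len < 0)
        return NULL;
    out = malloc ((size_t) len + 1);
    if (out != NULL)
        snprintf (out, (size_t) len + 1, "( %s ) %s%s",
                  terms, filter, actions[0] + 1);
    return out;
}
```

For example, a single `+mytag` action with the terms `from:alice` would yield `( from:alice ) and not tag:mytag`, while two or more actions leave the terms as typed.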
[notmuch] automatically assigning tags to new messages?
On Fri, 18 Dec 2009 12:54:39 +0100, Marten Veldthuis wrote:
> > - if it is coming from me, tag it +sent, but not +unread or +inbox
>
> Not quite sure. Currently I'm not doing this, don't know if this is
> possible within a single incantation of notmuch-tag. I think you
> probably need a first search to get message ids, and then tag only those
> message ids (doing it like the others above would tag all messages in
> the thread with sent, which is probably not what you want).

Hi Marten,

I'm not sure what's different about this case. A command like those you
provided earlier should work fine. The "notmuch tag" command only tags
individual messages explicitly matched by the search terms. It never
expands the tagging to unmatched messages in the same thread.

-Carl

PS. I've talked before about allowing for the configuration file to do
automatic tagging of messages. I've also talked about making something
like "virtual tags" where any automatically-applied tags would act
somehow differently than standard flags.

More recently, my thinking is taking me away from both of those ideas. I
think now that what I want in the configuration file is simply a set of
saved search strings. Something like:

    [search]
    interesting = to:notmuchmail.org and not from:cworth

and then this could be used within a search string such as:

    notmuch show search:interesting

This would make it very clear that "saved searches" are separate from
tags, and you might very well want to combine them in a single search:

    notmuch show search:interesting or tag:interesting

As I think about this, I think these saved searches could displace much
of my use of tags, (at least all of the tags which I'm automatically
applying in the script I run after "notmuch new").

The big difference would be that the UI wouldn't provide an indication
of a message matching particular saved searches the way it does for
tags. But I might actually prefer that, (since currently, I have so many
automatically-applied tags on every message that the display is often
just a lot of noise).

Anyway, that's something I plan to experiment with.
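As a sketch of how the saved-search idea in this PS could work, the C fragment below expands `search:NAME` tokens in a query string from a small table. Everything here is an assumption for illustration: the type `saved_search_t`, the function `expand_saved_searches`, splitting tokens on spaces, and wrapping each expansion in parentheses are not taken from notmuch, which had no such feature at the time of this message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: one line from a [search] config section. */
typedef struct {
    const char *name;
    const char *query;
} saved_search_t;

typedef struct {
    char *data;
    size_t len, cap;
} strbuf_t;

/* Append n bytes of s, growing the buffer; returns 0 on success. */
static int
strbuf_append (strbuf_t *b, const char *s, size_t n)
{
    if (b->len + n + 1 > b->cap) {
        size_t cap = (b->len + n + 1) * 2;
        char *d = realloc (b->data, cap);
        if (d == NULL)
            return -1;
        b->data = d;
        b->cap = cap;
    }
    memcpy (b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Replace each space-delimited "search:NAME" token whose NAME is in the
 * table with "( saved query )". Unknown names are left untouched.
 * Returns a malloc'ed string (caller frees), or NULL on failure. */
char *
expand_saved_searches (const char *query,
                       const saved_search_t *saved, size_t n_saved)
{
    static const char prefix[] = "search:";
    const size_t prefix_len = sizeof (prefix) - 1;
    strbuf_t b = { NULL, 0, 0 };
    const char *p = query;
    int err = strbuf_append (&b, "", 0);   /* ensure an empty string */

    while (!err && *p) {
        size_t tok_len = strcspn (p, " ");
        const char *expansion = NULL;
        size_t i;

        if (tok_len == 0) {                /* copy a separating space */
            err = strbuf_append (&b, p++, 1);
            continue;
        }
        if (tok_len > prefix_len && strncmp (p, prefix, prefix_len) == 0)
            for (i = 0; i < n_saved && expansion == NULL; i++)
                if (strlen (saved[i].name) == tok_len - prefix_len &&
                    strncmp (saved[i].name, p + prefix_len,
                             tok_len - prefix_len) == 0)
                    expansion = saved[i].query;

        if (expansion != NULL) {
            err = strbuf_append (&b, "( ", 2)
                || strbuf_append (&b, expansion, strlen (expansion))
                || strbuf_append (&b, " )", 2);
        } else {
            err = strbuf_append (&b, p, tok_len);
        }
        p += tok_len;
    }

    if (err) {
        free (b.data);
        return NULL;
    }
    return b.data;
}
```

With the `interesting` entry from the message, `search:interesting or tag:interesting` would expand to `( to:notmuchmail.org and not from:cworth ) or tag:interesting`, which keeps saved searches and tags visibly separate.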
[notmuch] First attempt to add smart completion in notmuch-search
On Fri, 18 Dec 2009 16:00:49 +0100, racin at free.fr wrote:
> Here is a first attempt to add "smart completion" to notmuch-search.

Hi Matthieu,

This all sounds quite interesting! I look forward to actually seeing the
patch. ;-)

-Carl
[notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
Excerpts from Carl Worth's message of Fri Dec 18 09:33:43 -0800 2009:
> On Fri, 18 Dec 2009 08:59:55 -0400, david at tethera.net wrote:
> > It took me a little work to apply Scott's patch, so rather than asking
> > him to resend it from git-send-email, I am just sending. I hope no-one
> > is offended (much).
>
> I think that's great! Collaboration is what this is all about.

Me too! I've never used git-send-email. I'll give it a whirl on my next
patch.

> > I'm thinking that the patch I sent out last night to only dump message
> > ids could be reworked to use the framework of this patch. I also
> > think it would be reasonably simple to add an --output=mbox option,
> > for archiving and so on.
>
> I think that selecting *what* to emit is orthogonal from selecting *how*
> to format that output. See some ideas in the TODO file, (where I
> proposed --for and --format options for these). Having a way to do mbox
> output for export would indeed be very nice.

Haha! I originally used "--format" and changed for some reason that
escapes me now.

Implementing an "mbox" formatted output in the current logic wouldn't be
archive perfect. The message body is emitted on a per-part basis.

What I would do is change the semantics of format->body to be called
from show_message. Then the text and json parts would point at the
original implementation passing off their per-part function pointers.
And, a new mbox implementation would just dump the full message body.

--
Scott Robinson | http://quadhome.com/

Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/
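The restructuring described in this message can be pictured roughly as follows. This is a sketch under assumptions, not the patch's actual code: the types `message_t` and `show_format_t`, the output-buffer interface, and the `[part N]` text layout are all invented for illustration. The only point carried over from the message is that each format supplies a `body` callback invoked from `show_message`, with text-like formats delegating to a per-part callback and an mbox-like format dumping the raw message. A real mbox writer would also need a "From " separator line, which is omitted here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical message: the raw file content plus its decoded parts. */
typedef struct {
    const char *raw;
    const char *const *parts;
    size_t n_parts;
} message_t;

typedef struct show_format {
    const char *name;
    /* Called once per message from show_message. */
    void (*body) (const struct show_format *format,
                  const message_t *message, char *out, size_t out_size);
    /* Called once per part by per-part formats; NULL otherwise. */
    void (*part) (const char *part, size_t index,
                  char *out, size_t out_size);
} show_format_t;

/* Append s to the NUL-terminated buffer out, truncating if needed. */
static void
append (char *out, size_t out_size, const char *s)
{
    size_t used = strlen (out);
    if (used + 1 < out_size)
        snprintf (out + used, out_size - used, "%s", s);
}

static void
text_part (const char *part, size_t index, char *out, size_t out_size)
{
    char header[32];
    snprintf (header, sizeof (header), "[part %lu]\n", (unsigned long) index);
    append (out, out_size, header);
    append (out, out_size, part);
    append (out, out_size, "\n");
}

/* Body implementation shared by formats that work part by part. */
static void
body_per_part (const show_format_t *format, const message_t *message,
               char *out, size_t out_size)
{
    size_t i;
    for (i = 0; i < message->n_parts; i++)
        format->part (message->parts[i], i + 1, out, out_size);
}

/* Body implementation for an mbox-like format: dump the raw message. */
static void
body_raw (const show_format_t *format, const message_t *message,
          char *out, size_t out_size)
{
    (void) format;
    append (out, out_size, message->raw);
}

static const show_format_t format_text = { "text", body_per_part, text_part };
static const show_format_t format_mbox = { "mbox", body_raw, NULL };

void
show_message (const show_format_t *format, const message_t *message,
              char *out, size_t out_size)
{
    if (out_size == 0)
        return;
    out[0] = '\0';
    format->body (format, message, out, out_size);
}
```

The design choice is that `show_message` no longer knows about parts at all, so a format that wants the untouched message can bypass part decoding entirely.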
[notmuch] [PATCH] JSON output for notmuch-search and notmuch-show.
Excerpts from Carl Worth's message of Fri Dec 18 09:31:39 -0800 2009:
> [...]
> I don't know why, but I think I'd prefer --format for the name here.

ACK

> [...]
> It looks like the new documentation is missing that point, (and the man
> page in notmuch.1 is missing an update as well).

ACK

> [...]
> That's just fine. The old numbering semantics were quite bizarre and
> nothing I wanted to set in stone.

Cool. :-)

Resubmit a full patch, or submit another one on top of it?

--
Scott Robinson | http://quadhome.com/

Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/
[notmuch] Missing messages breaking threads
Hi,

I like the architecture of notmuch, and have just switched
to using it as my primary client, so thanks.

I have, however, discovered one issue that is a pain. I use
a bugtracker a lot which has workflows that mean that you
don't always get the initial messages from a bug. To put it
in more common terms, imagine this:

* Alice sends a mail to a large group of your friends, but
  not you.
* Each of these friends replies, and puts you in Cc for the
  reply.

This will mean that you get several messages that all have
References and In-Reply-To set to ids that aren't known to
notmuch. This means that it doesn't thread them, and so they
aren't grouped in the UI.

This becomes painful when you are dealing with several bugs
like this. It's almost like being back in the days of a
message oriented client, and we know that's not the way to
do it.

Therefore I'd like to fix this. The obvious way is to
introduce documents into the db for each id we see, and
threading should then naturally work better.

The only issue I see with doing this is with mail delays.
Once we do this we will sometimes receive a message that
already has a dummy document. What happens currently with
message-id collisions?

Therefore I would propose this:

* When doing a thread resolution and we have ids that we
  don't know, add a document to the db that is something like

      {id: thread: dummy: True content: "Messages missing" }

* When we get a message-id conflict check for dummy:True
  and replace the document if it is there.

How does this sound?

There could be an issue with synthesising too many threads
and then ending up having to try and put a message in two
threads? I see there is code for merging threads, would that
handle this?

Thanks,

James
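The two proposed steps can be modelled with a toy in-memory store, shown below as a sketch in C. None of this is notmuch code: `store_t`, `store_add_reference`, and `store_add_message` are hypothetical names, the store is a fixed-size array rather than a Xapian database, and thread IDs are left out so that only the dummy-document handling is visible.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DOCS 64
#define MAX_ID   128    /* longer IDs are truncated in this toy model */

typedef struct {
    char id[MAX_ID];
    int is_dummy;       /* 1 if we have only seen this ID referenced */
} doc_t;

typedef struct {
    doc_t docs[MAX_DOCS];
    size_t count;
} store_t;

typedef enum { ADDED, REPLACED_DUMMY, DUPLICATE, STORE_FULL } add_result_t;

static doc_t *
store_find (store_t *store, const char *id)
{
    size_t i;
    for (i = 0; i < store->count; i++)
        if (strcmp (store->docs[i].id, id) == 0)
            return &store->docs[i];
    return NULL;
}

static add_result_t
store_insert (store_t *store, const char *id, int is_dummy)
{
    doc_t *doc;
    if (store->count == MAX_DOCS)
        return STORE_FULL;
    doc = &store->docs[store->count++];
    strncpy (doc->id, id, MAX_ID - 1);
    doc->id[MAX_ID - 1] = '\0';
    doc->is_dummy = is_dummy;
    return ADDED;
}

/* Step 1: an ID seen only in References or In-Reply-To gets a dummy
 * document, unless some document for it already exists. */
add_result_t
store_add_reference (store_t *store, const char *id)
{
    if (store_find (store, id) != NULL)
        return DUPLICATE;
    return store_insert (store, id, 1);
}

/* Step 2: a real message replaces a dummy with the same ID; a second
 * real message with that ID is reported as a duplicate. */
add_result_t
store_add_message (store_t *store, const char *id)
{
    doc_t *doc = store_find (store, id);
    if (doc == NULL)
        return store_insert (store, id, 0);
    if (doc->is_dummy) {
        doc->is_dummy = 0;
        return REPLACED_DUMMY;
    }
    return DUPLICATE;
}
```

In this model a late-arriving original message simply upgrades the placeholder, so replies that were already grouped under the dummy stay attached to the same document.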
[notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 19:02:21 +, James Westby wrote:
> I like the architecture of notmuch, and have just switched
> to using it as my primary client, so thanks.

You're quite welcome, James. Welcome to notmuch!

> Therefore I'd like to fix this. The obvious way is to
> introduce documents in to the db for each id we see, and
> threading should then naturally work better.

That sounds like a fine idea.

> The only issue I see with doing this is with mail delays.
> Once we do this we will sometimes receive a message that
> already has a dummy document. What happens currently with
> message-id collisions?

The current message-ID collision logic is pretty brain-dead. It just
says "Oh, I've seen a file with this message before, so I'll skip this
additional file".

But I'm just putting the finishing touches on a patch that instead does:

    Oh, and here's an additional filename for that message ID. Add
    that too, please.

Beyond that, all we would need to do as well is to also index the new
content. I don't want to do useless re-indexing when files just get
renamed. So maybe all we need to do is to save the filesize of the
last-indexed file for a document and then when we encounter a file with
the same message ID and a larger file size, then index it as well?

That would even take care of providing the opportunity to index
additional mailing-list-added content for messages also sent directly
via CC.

The file-size heuristic wouldn't be perfect for these other cases. I
guess we save a list of sha-1 sums for indexed files or so, (assuming
that's cheaper than just re-indexing---before the Xapian Defect 250 fix
I'm sure it is, but after I'm not sure---we maybe should just always
re-index---but I think I have seen the TermGenerator appear in profiles
of indexing runs.)

> * When we get a message-id conflict check for dummy:True
> and replace the document if it is there.
>
> How does this sound?

That sounds fine. It's the same as what I propose above with
"filesize:0" instead of "dummy:true".

> There could be an issue with synthesising too many threads
> and then ending up having to try and put a message in two
> threads? I see there is code for merging threads, would that
> handle this?

It should, yes.

The current logic is that a message can only appear in a single
thread. So if a message has children or parents with distinct thread IDs
then those threads are merged.

I can imagine some strange cross-posting scenario where one could argue
that the merging shouldn't happen, but I'm not sure we want to try to
respect that.

-Carl
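The file-size heuristic from this message boils down to a small decision, sketched here in C. The enum and the function `decide_index_action` are hypothetical and only restate the rule as described: a placeholder document is treated as having stored size 0, so any real file for that message ID is larger and gets indexed.

```c
#include <sys/types.h>

typedef enum {
    INDEX_NEW,          /* no document yet: create one and index the file */
    INDEX_AGAIN,        /* same message ID, larger file: index it as well */
    ADD_FILENAME_ONLY   /* same message ID, nothing new worth indexing */
} index_action_t;

/* stored_size is the size recorded for the last-indexed file of this
 * message ID (0 for a placeholder); new_size is the size of the file
 * just encountered. */
index_action_t
decide_index_action (int have_document, off_t stored_size, off_t new_size)
{
    if (! have_document)
        return INDEX_NEW;
    if (new_size > stored_size)
        return INDEX_AGAIN;
    return ADD_FILENAME_ONLY;
}
```

As the message notes, this is imperfect: two copies of equal size but different content would not be re-indexed, which is the case the sha-1 idea is meant to cover.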
[notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 11:41:18 -0800, Carl Worth wrote:
> On Fri, 18 Dec 2009 19:02:21 +, James Westby wrote:
> > Therefore I'd like to fix this. The obvious way is to
> > introduce documents in to the db for each id we see, and
> > threading should then naturally work better.
>
> That sounds like a fine idea.

Good, at least I'm not totally off the map.

> > The only issue I see with doing this is with mail delays.
> > Once we do this we will sometimes receive a message that
> > already has a dummy document. What happens currently with
> > message-id collisions?
>
> The current message-ID collision logic is pretty brain-dead. It just
> says "Oh, I've seen a file with this message before, so I'll skip this
> additional file".
>
> But I'm just putting the finishing touches on a patch that instead does:
>
> Oh, and here's an additional filename for that message ID. Add
> that too, please.
>
> Beyond that, all we would need to do as well is to also index the new
> content. I don't want to do useless re-indexing when files just get
> renamed. So maybe all we need to do is to save the filesize of the
> last-indexed file for a document and then when we encounter a file with
> the same message ID and a larger file size, then index it as well?

I would say different file size, but I imagine larger is the majority
of interesting cases.

> That would even take care of providing the opportunity to index
> additional mailing-list-added content for messages also sent directly
> via CC.
>
> The file-size heuristic wouldn't be perfect for these other cases. I
> guess we save a list of sha-1 sums for indexed files or so, (assuming
> that's cheaper than just re-indexing---before the Xapian Defect 250 fix
> I'm sure it is, but after I'm not sure---we maybe should just always
> re-index---but I think I have seen the TermGenerator appear in profiles
> of indexing runs.)

I'm not sure this is needed too much, but would obviously be correct.

On Xapian 250, I have a very slow spinning disk, and it was hitting
me hard, making processing my inbox far too slow. I built Xapian SVN
with the patch from the bug and it is now lightning fast, so consider
this another endorsement. I also tried the supplemental patch and it
showed no further improvement for notmuch tag.

> > * When we get a message-id conflict check for dummy:True
> > and replace the document if it is there.
> >
> > How does this sound?
>
> That sounds fine. It's the same as what I propose above with
> "filesize:0" instead of "dummy:true".

That works. However, we would want the old content to go away in
these cases, wouldn't we?

Or do we not index whatever dummy text we add? Or do we not
even put it in? Or not even show it at all? I was just thinking
of having "Missing messages..." showing up as the start of
the thread, but maybe it's not needed.

> > There could be an issue with synthesising too many threads
> > and then ending up having to try and put a message in two
> > threads? I see there is code for merging threads, would that
> > handle this?
>
> It should, yes.
>
> The current logic is that a message can only appear in a single
> thread. So if a message has children or parents with distinct thread IDs
> then those threads are merged.
>
> I can imagine some strange cross-posting scenario where one could argue
> that the merging shouldn't happen, but I'm not sure we want to try to
> respect that.

Fair enough.

So, to summarise, I should first look at storing filesizes, then
the collision code to make it index further when the filesize grows,
and then finally the code to add documents for missing messages?

The only thing I am unclear on is how to handle existing databases?
Do we have any concept of versioning? Or should I just assume that
filesize: may not be in the document and act appropriately?

Thanks,

James
[notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 19:53:13 +, James Westby wrote:
> Or do we not index whatever dummy text we add? Or do we not
> even put it in? Or not even show it at all? I was just thinking
> of having "Missing messages..." showing up as the start of
> the thread, but maybe it's not needed.

Oh, I was assuming you wouldn't index any text. The UI can add "missing
message" for a document with no filename, for example.

> So, to summarise, I should first look at storing filesizes, then
> the collision code to make it index further when the filesize grows,
> and then finally the code to add documents for missing messages?

Some of the code areas to be touched will be changing soon, (at least as
far as when filenames appear and disappear). Hopefully I'll have
something posted for that sooner rather than later to avoid having to
redo too much work.

> The only thing I am unclear on is how to handle existing databases?
> Do we have any concept of versioning? Or should I just assume that
> filesize: may not be in the document and act appropriately?

My current, outstanding patch is going to be the first trigger for a
"flag day" where we'll all need to rewrite our databases.

We don't have any concept of versioning yet, but it would obviously be
easy to have a new version document with an increasing integer.

But even with my current patch I'm considering doing a graceful upgrade
of the database in-place rather than making the user do something like a
dump, delete, rebuild, restore. That would give a much better experience
than "Your database is out-of-date, please rebuild it", so we'll see if
I pursue that in the end.

-Carl
[notmuch] [PATCH] Store the size of the file for each message
When indexing a message store the filesize along with it so that
when we store all the filenames for a message-id we can know if
any of them have different content cheaply.

The value stored is defined to be the largest filesize of any
of the files for that message.

This changes the API for efficiency reasons. The size is often
known to the caller, and so we save a second stat by asking them
to provide it. If they don't know it they can pass -1 and the
stat will be done for them.

We store the filesize such that we can query a range. Thus it
would be possible to query "filesize:0..100" if you somehow
knew the raw message was less than 100 bytes.
---

Here's the first part, storing the filesize. I'm using
add_value so that we can make it sortable, is that valid
for retrieving it as well?

The only thing I'm not sure about is if it works. Is there
a way to inspect a document to see the values that are
stored? Doing a search isn't working, so I imagine I made
a mistake.

Thanks,

James

 lib/database.cc       | 17 +
 lib/message.cc        | 25 +
 lib/notmuch-private.h |8 +++-
 lib/notmuch.h         |5 +
 notmuch-new.c         |2 +-
 5 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index b6c4d07..0ec77cd 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -454,6 +454,17 @@ notmuch_database_create (const char *path)
     return notmuch;
 }

+struct FilesizeValueRangeProcessor : public Xapian::ValueRangeProcessor {
+    FilesizeValueRangeProcessor() {}
+
+    Xapian::valueno operator()(std::string &begin, std::string &) {
+        if (begin.substr(0, 9) != "filesize:")
+            return Xapian::BAD_VALUENO;
+        begin.erase(0, 9);
+        return NOTMUCH_VALUE_FILESIZE;
+    }
+};
+
 notmuch_database_t *
 notmuch_database_open (const char *path,
                        notmuch_database_mode_t mode)
@@ -463,6 +474,7 @@ notmuch_database_open (const char *path,
     struct stat st;
     int err;
     unsigned int i;
+    FilesizeValueRangeProcessor filesize_proc;

     if (asprintf (&notmuch_path, "%s/%s", path, ".notmuch") == -1) {
         notmuch_path = NULL;
@@ -508,6 +520,7 @@ notmuch_database_open (const char *path,
         notmuch->query_parser->set_stemmer (Xapian::Stem ("english"));
         notmuch->query_parser->set_stemming_strategy (Xapian::QueryParser::STEM_SOME);
         notmuch->query_parser->add_valuerangeprocessor (notmuch->value_range_processor);
+        notmuch->query_parser->add_valuerangeprocessor (&filesize_proc);

         for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) {
             prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i];
@@ -889,6 +902,7 @@ _notmuch_database_link_message (notmuch_database_t *notmuch,
 notmuch_status_t
 notmuch_database_add_message (notmuch_database_t *notmuch,
                               const char *filename,
+                              const off_t size,
                               notmuch_message_t **message_ret)
 {
     notmuch_message_file_t *message_file;
@@ -992,6 +1006,9 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
         if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
             _notmuch_message_set_filename (message, filename);
             _notmuch_message_add_term (message, "type", "mail");
+            ret = _notmuch_message_set_filesize (message, filename, size);
+            if (ret)
+                goto DONE;
         } else {
             ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID;
             goto DONE;
diff --git a/lib/message.cc b/lib/message.cc
index 49519f1..2bfc5ed 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -426,6 +426,31 @@ _notmuch_message_set_filename (notmuch_message_t *message,
     message->doc.set_data (s);
 }

+notmuch_status_t
+_notmuch_message_set_filesize (notmuch_message_t *message,
+                               const char *filename,
+                               const off_t size)
+{
+    struct stat st;
+    off_t realsize = size;
+    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
+
+    if (realsize < 0) {
+        if (stat (filename, &st)) {
+            ret = NOTMUCH_STATUS_FILE_ERROR;
+            goto DONE;
+        } else {
+            realsize = st.st_size;
+        }
+    }
+
+    message->doc.add_value (NOTMUCH_VALUE_FILESIZE,
+                            Xapian::sortable_serialise (realsize));
+
+  DONE:
+    return ret;
+}
+
 const char *
 notmuch_message_get_filename (notmuch_message_t *message)
 {
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 116f63d..1ba3055 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -100,7 +100,8 @@ _internal_error (const char *format, ...) PRINTF_ATTRIBUTE (1, 2);

 typedef enum {
     NOTMUCH_VALUE_TIMESTAMP = 0,
-    NOTMUCH_VALUE_MESSAGE_ID
+    NOTMUCH_VALUE_MESSAGE_ID,
+    NOTMUCH_VALUE_FILESIZE
 } notmuch_value_t;

 /* Xapian (with flint backend) complains if we provi
[notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 12:52:58 -0800, Carl Worth wrote:
> On Fri, 18 Dec 2009 19:53:13 +, James Westby wrote:
> Oh, I was assuming you wouldn't index any text. The UI can add "missing
> message" for a document with no filename, for example.

Works for me.

> > So, to summarise, I should first look at storing filesizes, then
> > the collision code to make it index further when the filesize grows,
> > and then finally the code to add documents for missing messages?
>
> Some of the code areas to be touched will be changing soon, (at least as
> far as when filenames appear and disappear). Hopefully I'll have
> something posted for that sooner rather than later to avoid having to
> redo too much work.

That would be great. I'm learning all the code anyway, so there's not a whole lot of knowledge being thrown away. I've just sent an initial cut at the first step.

> > The only thing I am unclear on is how to handle existing databases?
> > Do we have any concept of versioning? Or should I just assume that
> > filesize: may not be in the document and act appropriately?
>
> My current, outstanding patch is going to be the first trigger for a
> "flag day" where we'll all need to rewrite our databases.
>
> We don't have any concept of versioning yet, but it would obviously be
> easy to have a new version document with an increasing integer.
>
> But even with my current patch I'm considering doing a graceful upgrade
> of the database in-place rather than making the user do something like a
> dump, delete, rebuild, restore. That would give a much better experience
> than "Your database is out-of-date, please rebuild it", so we'll see if
> I pursue that in the end.

That sounds nice, I'd certainly prefer this sort of thing as it evolves.

Thanks,

James
[notmuch] [PATCH] Store the size of the file for each message
On Fri, 18 Dec 2009 21:21:03 +, James Westby wrote:
> Here's the first part, storing the filesize. I'm using
> add_value so that we can make it sortable, is that valid
> for retrieving it as well?

Yes, a value makes sense here and should make the value easy to retrieve.

> The only thing I'm not sure about is if it works. Is there
> a way to inspect a document to see the values that are
> stored?

I usually use a little tool I wrote called xapian-dump. It currently exists only in the git history of notmuch. Look at commit:

22691064666c03c5e76bc787395bfe586929f4cc

or so.

> Doing a search isn't working, so I imagine I made a mistake.

Let's see... (just reviewing here, not testing)..

> +struct FilesizeValueRangeProcessor : public Xapian::ValueRangeProcessor {
> +    FilesizeValueRangeProcessor() {}
> +
> +    Xapian::valueno operator()(std::string &begin, std::string &) {
> +        if (begin.substr(0, 9) != "filesize:")
> +            return Xapian::BAD_VALUENO;
> +        begin.erase(0, 9);
> +        return NOTMUCH_VALUE_FILESIZE;
> +    }
> +};

If the file size is just an integer, then you shouldn't need a custom ValueRangeProcessor. One of the existing processors in Xapian should work fine.

Having not ever written a custom processor, I can't say whether the one above is correct or not.

-Carl
[notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
On Fri, 18 Dec 2009 09:33:43 -0800, Carl Worth wrote:
> I think that selecting *what* to emit is orthogonal from selecting *how*
> to format that output.

I can see that point of view.

> See some ideas in the TODO file, (where I proposed --for and --format
> options for these).

It's a detail, but could you choose two names that are not substrings of each other? Eventually we do want tab completion on the command line to work :).

Also, "search --for tags foo" suggests to me searching for tags matching foo. What about using --output for that?

One thing that is not completely clear to me at this point is what the difference is between

notmuch search --for messages search-terms

and

notmuch show search-terms

David
[notmuch] wish: more informative citations
Wouldn't it be nice if citations showed the first line or so of the text being cited? Stealing text from another thread,

> In case of a citation following immediately new contents. When the citation
> was collapsed:
>
> [1-line citation. Click/Enter to show.]
> Lorem ipsum dolor sit amet, consectetur adipisicin

Would be displayed as something like:

> In case of a citation following [ Click/Enter to show 3 more lines ]

Actually I'm not too sure about the format, but I thought I'd throw that out there.

Happy hacking,

David
[notmuch] [PATCH] Store the size of the file for each message
On Sat, 19 Dec 2009 00:08:24 +, James Westby wrote:
> Thanks, I found delve, which at least showed that something was
> being stored. It's in the xapian-tools package, and
>
>     delve -V2
>
> prints out the filesize value for each document.

Ah, right. I had forgotten about that.

> It would be great if we could specify an alternative configuration
> file for testing so that I can set up a small maildir and test
> against that.

You can, actually. Just set the NOTMUCH_CONFIG environment variable to your alternate configuration file. (And yes, we're missing any mention of this in our documentation.)

> Correct, I hadn't read the documentation closely enough. After fixing
> that and doing some testing I have this working now. Patch incoming.

Cool!

-Carl
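As an illustration of the NOTMUCH_CONFIG override mentioned in this message, here is a minimal C sketch of how a client might choose its configuration file. The helper name resolve_config_path and the fallback to $HOME/.notmuch-config are assumptions made for the example; this is not code taken from notmuch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of notmuch): write the configuration
 * path into buf. Returns 0 on success, -1 if no path can be determined
 * or it does not fit into buf. */
int
resolve_config_path (char *buf, size_t size)
{
    const char *env = getenv ("NOTMUCH_CONFIG");
    const char *home;
    int n;

    assert (buf != NULL && size > 0);

    if (env && *env) {
        /* An explicitly set, non-empty NOTMUCH_CONFIG wins. */
        n = snprintf (buf, size, "%s", env);
    } else if ((home = getenv ("HOME")) != NULL) {
        /* Assumed default location, used only for this sketch. */
        n = snprintf (buf, size, "%s/.notmuch-config", home);
    } else {
        return -1;
    }

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

For testing against a small maildir, the variable would be set for a single invocation, for example NOTMUCH_CONFIG=/path/to/test-config notmuch new, where the path is a placeholder.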
[notmuch] wish: more informative citations
On Fri, 18 Dec 2009 20:47:20 -0400, David Bremner wrote:
> Would be displayed as something like:
>
> > In case of a citation following [ Click/Enter to show 3 more lines ]
>
> Actually I'm not too sure about the format, but I thought I'd throw
> that out there.

That's a fine idea. Along with this would be getting rid of the stupidity of displaying [1 line citation] rather than just displaying the citation itself!

And I really want my keybinding for displaying all the citations in the current message. And code to recognize top-posted copies as citations and hiding that. And, and...

-Carl

...and more time to do all this stuff. I've got an ever-growing TODO list and a backlog of patches to be reviewed that goes back several weeks now. I'm hoping that I'll be able to sneak some time over the upcoming holidays...
[notmuch] [PATCH] Store the size of the file for each message
On Sat, 19 Dec 2009 01:35:46 +, James Westby wrote:
> On Fri, 18 Dec 2009 16:57:16 -0800, Carl Worth wrote:
> > You can, actually. Just set the NOTMUCH_CONFIG environment variable to
> > your alternate configuration file. (And yes, we're missing any mention
> > of this in our documentation.)
>
> Sweet. Where would be the best place to document it? Just in the
> man page?

Currently we're replicating all of our documentation both in the man page and in the output from "notmuch help". It's annoying to have to add everything in two places, but I don't have a good idea for making that sharable yet.

Anyone have a solution here?

-Carl
[notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
From: Scott Robinson

In the case of notmuch-show, "--output=json" also implies "--entire-thread" as the thread structure is implicit in the emitted document tree.

As a coincidence to the implementation, multipart message ID numbers are now incremented with each part printed. This changes the previous semantics, which were unclear and not necessarily related to the actual ordering of the message parts.

Edited-By: David Bremner
Reviewed-By: David Bremner
---
It took me a little work to apply Scott's patch, so rather than asking him to resend it from git-send-email, I am just sending. I hope no-one is offended (much).

Other than manually extracting the patch from the output of notmuch show (for me the message arrived base64 encoded), I deleted trailing whitespace on line 465.

It compiles, it doesn't seem to screw up the original output, and at least in a few tests, it generates parseable json. Yay!

I'm thinking that the patch I sent out last night to only dump message ids could be reworked to use the framework of this patch. I also think it would be reasonably simple to add an --output=mbox option, for archiving and so on.
 Makefile.local   |   3 +-
 json.c           |  73 ++
 notmuch-client.h |   3 +
 notmuch-search.c | 163 +---
 notmuch-show.c   | 275 ++
 notmuch.c        |  24 --
 show-message.c   |   4 +-
 7 files changed, 481 insertions(+), 64 deletions(-)
 create mode 100644 json.c

diff --git a/Makefile.local b/Makefile.local
index 933ff4c..53b474b 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -18,7 +18,8 @@ notmuch_client_srcs = \
         notmuch-tag.c \
         notmuch-time.c \
         query-string.c \
-        show-message.c
+        show-message.c \
+        json.c
 
 notmuch_client_modules = $(notmuch_client_srcs:.c=.o)
 notmuch: $(notmuch_client_modules) lib/notmuch.a
diff --git a/json.c b/json.c
new file mode 100644
index 000..ee563d6
--- /dev/null
+++ b/json.c
@@ -0,0 +1,73 @@
+/* notmuch - Not much of an email program, (just index and search)
+ *
+ * Copyright © 2009 Carl Worth
+ * Copyright © 2009 Keith Packard
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see http://www.gnu.org/licenses/ .
+ *
+ * Authors: Carl Worth
+ *          Keith Packard
+ */
+
+#include "notmuch-client.h"
+
+/*
+ * json_quote_str derived from cJSON's print_string_ptr,
+ * Copyright (c) 2009 Dave Gamble
+ */
+
+char *
+json_quote_str(const void *ctx, const char *str)
+{
+    const char *ptr;
+    char *ptr2;
+    char *out;
+    int len = 0;
+
+    if (!str)
+        return NULL;
+
+    for (ptr = str; *ptr; len++, ptr++) {
+        if (*ptr < 32 || *ptr == '\"' || *ptr == '\\')
+            len++;
+    }
+
+    out = talloc_array (ctx, char, len + 3);
+
+    ptr = str;
+    ptr2 = out;
+
+    *ptr2++ = '\"';
+    while (*ptr) {
+        if (*ptr > 31 && *ptr != '\"' && *ptr != '\\') {
+            *ptr2++ = *ptr++;
+        } else {
+            *ptr2++ = '\\';
+            switch (*ptr++) {
+            case '\"': *ptr2++ = '\"'; break;
+            case '\\': *ptr2++ = '\\'; break;
+            case '\b': *ptr2++ = 'b'; break;
+            case '\f': *ptr2++ = 'f'; break;
+            case '\n': *ptr2++ = 'n'; break;
+            case '\r': *ptr2++ = 'r'; break;
+            case '\t': *ptr2++ = 't'; break;
+            default: ptr2--;break;
+            }
+        }
+    }
+    *ptr2++ = '\"';
+    *ptr2++ = '\0';
+
+    return out;
+}
diff --git a/notmuch-client.h b/notmuch-client.h
index 50a30fe..7b844b9 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -143,6 +143,9 @@ notmuch_status_t
 show_message_body (const char *filename,
                    void (*show_part) (GMimeObject *part, int *part_count));
 
+char *
+json_quote_str (const void *ctx, const char *str);
+
 /* notmuch-config.c */
 
 typedef struct _notmuch_config notmuch_config_t;
diff --git a/notmuch-search.c b/notmuch-search.c
index dc44eb6..e243747 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -20,8 +20,120 @@
 
 #include "notmuch-client.h"
 
+typedef struct search_format {
+    const char *results_start;
+    const char *thread_start;
+    void (*thread) (const void *ctx,
+                    const char
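A note on the json_quote_str routine in the patch above: it compares plain char values against 32, so on platforms where char is signed, bytes of 0x80 and above (non-ASCII UTF-8) appear to fall into the default branch and be dropped, as are control characters without a short escape. The following is a minimal sketch of one way to avoid both, not the patch author's code. The name json_quote_sketch is made up, and malloc is used in place of talloc to keep the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: quote str as a JSON string in malloc'd memory (caller
 * frees). Bytes are read as unsigned char, so UTF-8 bytes >= 0x80 pass
 * through unchanged, and control characters without a short escape are
 * written as \u00XX. Returns NULL for NULL input or allocation failure. */
char *
json_quote_sketch (const char *str)
{
    static const char hex[] = "0123456789abcdef";
    const unsigned char *p;
    char *out, *q;
    size_t len = 2;     /* opening and closing quote */

    if (! str)
        return NULL;

    /* First pass: compute the exact output length. */
    for (p = (const unsigned char *) str; *p; p++) {
        if (*p == '"' || *p == '\\' || *p == '\b' || *p == '\f' ||
            *p == '\n' || *p == '\r' || *p == '\t')
            len += 2;
        else if (*p < 0x20)
            len += 6;
        else
            len += 1;
    }

    out = malloc (len + 1);
    if (! out)
        return NULL;

    /* Second pass: emit the quoted string. */
    q = out;
    *q++ = '"';
    for (p = (const unsigned char *) str; *p; p++) {
        switch (*p) {
        case '"':  *q++ = '\\'; *q++ = '"';  break;
        case '\\': *q++ = '\\'; *q++ = '\\'; break;
        case '\b': *q++ = '\\'; *q++ = 'b';  break;
        case '\f': *q++ = '\\'; *q++ = 'f';  break;
        case '\n': *q++ = '\\'; *q++ = 'n';  break;
        case '\r': *q++ = '\\'; *q++ = 'r';  break;
        case '\t': *q++ = '\\'; *q++ = 't';  break;
        default:
            if (*p < 0x20) {
                memcpy (q, "\\u00", 4);
                q += 4;
                *q++ = hex[*p >> 4];
                *q++ = hex[*p & 0x0f];
            } else {
                *q++ = (char) *p;
            }
        }
    }
    *q++ = '"';
    *q = '\0';
    assert ((size_t) (q - out) == len);

    return out;
}
```

Computing the length in a first pass keeps the allocation exact, the same two-pass shape the patch uses.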
[notmuch] First attempt to add smart completion in notmuch-search
Hi!

Here is a first attempt to add "smart completion" to notmuch-search. What it does is:

- It tries to detect when you hit tab to complete a search term prefix (like to, from, subject...), and if so completes your prefix;
- If you try to complete a search term using a specific prefix (like to), it helps you complete it using the values for that prefix found in the messages matching the current search string.

The patch has an Emacs-lisp part and a C part. I'm submitting the patch to know what people think of it, but some things are still incomplete:

- I think headers should be parsed so that a better output is proposed, but I don't know which one yet;
- There may be performance issues (because fetching headers requires looking them up in each message file), but I'm not sure if this is really a problem (since we already fetch headers to display them in the summary buffer).
- I have a problem that notmuch-search-query-string is buffer-local and I can't access it from the minibuffer, which I don't know how to solve properly
- There are still some glitches with the "".
- There may be many other improvements that I haven't thought of.

Anyway, I hope it's useful; I'll continue working on it when I come back from Holidays.

Thanks for your input,

Matthieu
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch
Re: [notmuch] [PATCH] JSON output for notmuch-search and notmuch-show.
On Thu, 17 Dec 2009 21:33:54 -0800, Scott Robinson wrote:
> I took an earlier suggestion and didn't use cJSON, instead writing custom code
> for emitting the new format.

Nice! I have a few comments below.

> Added an "--output=(json|text|)" command-line option to both
> notmuch-search and notmuch-show.

I don't know why, but I think I'd prefer --format for the name here.

> In the case of notmuch-show, "--output=json" also implies
> "--entire-thread" as the thread structure is implicit in the emitted
> document tree.

It looks like the new documentation is missing that point, (and the man page in notmuch.1 is missing an update as well).

> As a coincidence to the implementation, multipart message ID numbers are
> now incremented with each part printed. This changes the previous
> semantics, which were unclear and not necessarily related to the actual
> ordering of the message parts.

That's just fine. The old numbering semantics were quite bizarre and nothing I wanted to set in stone.

-Carl
Re: [notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
On Fri, 18 Dec 2009 08:59:55 -0400, da...@tethera.net wrote:
> It took me a little work to apply Scott's patch, so rather than asking
> him to resend it from git-send-email, I am just sending. I hope no-one
> is offended (much).

I think that's great! Collaboration is what this is all about.

> I'm thinking that the patch I sent out last night to only dump message
> ids could be reworked to use the framework of this patch. I also
> think it would be reasonably simple to add an --output=mbox option,
> for archiving and so on.

I think that selecting *what* to emit is orthogonal from selecting *how* to format that output. See some ideas in the TODO file, (where I proposed --for and --format options for these). Having a way to do mbox output for export would indeed be very nice.

-Carl
Re: [notmuch] Rather simple optimization for notmuch tag
On Fri, 18 Dec 2009 00:49:00 -0700, Mark Anderson wrote:
> I was updating my poll script that tags messages, and a common idiom is
> to put
>     tag +mytag and not tag:mytag
>
> I don't know anything about efficiency, but for the simple single-tag
> case, couldn't we imply the "and not tag:mytag" from the +mytag action
> list for the tag command?

On one level, it really shouldn't be a performance issue to tag messages that already have a particular tag. (And in fact, the recently proposed patches to fix Xapian defect 250 even address this I think.)

In the meantime, it is fairly annoying to have to type this, and yes, the tag command could infer that and append it to the search string automatically. That's a good idea, really.

> The similar (dual?, rusty math terminology, beware of Math-tetanus) case
> of "tag -mytag and tag:mytag" could be similarly optimized,
> since the tag removal action ought to be a null action in the case that
> the search terms matched on a thread or message, but the tag to be
> removed isn't attached to the message/thread returned.

Yes, that would work too.

One potential snag with both ideas is that the "notmuch tag" command-line as currently implemented allows for multiple tag additions and removals with a single search. So the optimization here couldn't be used unless there was just a single tag action.

So that's another reason to really just want the lower-level optimization to be in place.

-Carl
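The inference discussed in this message can be sketched in C roughly as follows, under the single-action restriction Carl describes. The function name narrow_tag_query, its signature, and the choice to wrap the original search in parentheses are assumptions for illustration; none of this is taken from the notmuch sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: narrow `query` according to a list of tag
 * actions such as "+inbox" or "-unread". With exactly one action, the
 * search is restricted to messages the action would actually change;
 * with any other number the search is copied unchanged. Returns 0 on
 * success, -1 on a malformed action or if buf is too small. */
int
narrow_tag_query (const char *const *actions, size_t n_actions,
                  const char *query, char *buf, size_t size)
{
    int n;

    assert (buf != NULL && size > 0);

    if (n_actions != 1) {
        /* Several (or no) actions: leave the search untouched. */
        n = snprintf (buf, size, "%s", query);
    } else if (actions[0][0] == '+' && actions[0][1] != '\0') {
        /* Adding a tag is a no-op for messages that already have it. */
        n = snprintf (buf, size, "( %s ) and not tag:%s",
                      query, actions[0] + 1);
    } else if (actions[0][0] == '-' && actions[0][1] != '\0') {
        /* Removing a tag is a no-op for messages that lack it. */
        n = snprintf (buf, size, "( %s ) and tag:%s",
                      query, actions[0] + 1);
    } else {
        return -1;
    }

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With the single action "+mytag" and the search from:alice, the sketch produces "( from:alice ) and not tag:mytag", which is the idiom Mark was typing by hand.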
Re: [notmuch] automatically assigning tags to new messages?
On Fri, 18 Dec 2009 12:54:39 +0100, Marten Veldthuis wrote:
> > - if it is coming from me, tag it +sent, but not +unread or +inbox
>
> Not quite sure. Currently I'm not doing this, don't know if this is
> possible within a single incantation of notmuch-tag. I think you
> probably need a first search to get message ids, and then tag only those
> message ids (doing it like the others above would tag all messages in
> the thread with sent, which is probably not what you want).

Hi Marten,

I'm not sure what's different about this case. A command like those you provided earlier should work fine.

The "notmuch tag" command only tags individual messages explicitly matched by the search terms. It never expands the tagging to unmatched messages in the same thread.

-Carl

PS. I've talked before about allowing for the configuration file to do automatic tagging of messages. I've also talked about making something like "virtual tags" where any automatically-applied tags would act somehow differently than standard flags.

More recently, my thinking is taking me away from both of those ideas. I think now that what I want in the configuration file is simply a set of saved search strings. Something like:

    [search]
    interesting = to:notmuchmail.org and not from:cworth

and then this could be used within a search string such as:

    notmuch show search:interesting

This would make it very clear that "saved searches" are separate from tags, and you might very well want to combine them in a single search:

    notmuch show search:interesting or tag:interesting

As I think about this, I think these saved searches could displace much of my use of tags, (at least all of the tags which I'm automatically applying in the script I run after "notmuch new").

The big difference would be that the UI wouldn't provide an indication of a message matching particular saved searches the way it does for tags. But I might actually prefer that, (since currently, I have so many automatically-applied tags on every message that the display is often just a lot of noise).

Anyway, that's something I plan to experiment with.
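To make the saved-search idea from this postscript concrete, here is a rough C sketch of how a search:NAME term could be expanded before the query reaches the parser. Everything in it is hypothetical: the saved_search_t type, the function name, and the decision to expand by plain text substitution of space-delimited words are assumptions, since the message only describes the feature as something to experiment with.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical saved-search entry, as it might be read from a
 * [search] section of the configuration file. */
typedef struct {
    const char *name;
    const char *terms;
} saved_search_t;

/* Copy `query` into buf, replacing each space-delimited word of the
 * form search:NAME with the parenthesized terms saved under NAME.
 * Unknown names are copied through unchanged. Returns 0 on success,
 * -1 if buf is too small. */
int
expand_saved_searches (const char *query,
                       const saved_search_t *saved, size_t n_saved,
                       char *buf, size_t size)
{
    size_t used = 0;

    assert (buf != NULL && size > 0);
    buf[0] = '\0';

    while (*query) {
        size_t word_len = strcspn (query, " ");
        const char *terms = NULL;
        size_t i;
        int n;

        if (word_len == 0) {
            /* A single space: copied below as-is. */
            word_len = 1;
        } else if (word_len > 7 && strncmp (query, "search:", 7) == 0) {
            for (i = 0; i < n_saved; i++) {
                if (strlen (saved[i].name) == word_len - 7 &&
                    strncmp (saved[i].name, query + 7, word_len - 7) == 0)
                    terms = saved[i].terms;
            }
        }

        if (terms)
            n = snprintf (buf + used, size - used, "(%s)", terms);
        else
            n = snprintf (buf + used, size - used, "%.*s",
                          (int) word_len, query);

        if (n < 0 || (size_t) n >= size - used)
            return -1;

        used += (size_t) n;
        query += word_len;
    }

    return 0;
}
```

Given the [search] entry from the message, "search:interesting or tag:interesting" would become "(to:notmuchmail.org and not from:cworth) or tag:interesting".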
Re: [notmuch] First attempt to add smart completion in notmuch-search
On Fri, 18 Dec 2009 16:00:49 +0100, ra...@free.fr wrote:
> Here is a first attempt to add "smart completion" to notmuch-search.

Hi Matthieu,

This all sounds quite interesting!

I look forward to actually seeing the patch. ;-)

-Carl
Re: [notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
Excerpts from Carl Worth's message of Fri Dec 18 09:33:43 -0800 2009:
> On Fri, 18 Dec 2009 08:59:55 -0400, da...@tethera.net wrote:
> > It took me a little work to apply Scott's patch, so rather than asking
> > him to resend it from git-send-email, I am just sending. I hope no-one
> > is offended (much).
>
> I think that's great! Collaboration is what this is all about.

Me too! I've never used git-send-email. I'll give it a whirl on my next patch.

> > I'm thinking that the patch I sent out last night to only dump message
> > ids could be reworked to use the framework of this patch. I also
> > think it would be reasonably simple to add an --output=mbox option,
> > for archiving and so on.
>
> I think that selecting *what* to emit is orthogonal from selecting *how*
> to format that output. See some ideas in the TODO file, (where I
> proposed --for and --format options for these). Having a way to do mbox
> output for export would indeed be very nice.

Haha! I originally used "--format" and changed for some reason that escapes me now.

Implementing an "mbox" formatted output in the current logic wouldn't be archive perfect. The message body is emitted on a per-part basis.

What I would do is change the semantics of format->body to be called from show_message. Then the text and json parts would point at the original implementation passing off their per-part function pointers. And, a new mbox implementation would just dump the full message body.

--
Scott Robinson | http://quadhome.com/
Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/
Re: [notmuch] [PATCH] JSON output for notmuch-search and notmuch-show.
Excerpts from Carl Worth's message of Fri Dec 18 09:31:39 -0800 2009:
> [...]
> I don't know why, but I think I'd prefer --format for the name here.

ACK

> [...]
> It looks like the new documentation is missing that point, (and the man
> page in notmuch.1 is missing an update as well).

ACK

> [...]
> That's just fine. The old numbering semantics were quite bizarre and
> nothing I wanted to set in stone.

Cool. :-)

Resubmit a full patch, or submit another one on top of it?

--
Scott Robinson | http://quadhome.com/
Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/
[notmuch] Missing messages breaking threads
Hi,

I like the architecture of notmuch, and have just switched to using it as my primary client, so thanks.

I have, however, discovered one issue that is a pain. I use a bugtracker a lot which has workflows that mean that you don't always get the initial messages from a bug. To put it in more common terms, imagine this:

* Alice sends a mail to a large group of your friends, but not you.
* Each of these friends replies, and puts you in Cc for the reply.

This will mean that you get several messages that all have References and In-Reply-To set to ids that aren't known to notmuch. This means that it doesn't thread them, and so they aren't grouped in the UI.

This becomes painful when you are dealing with several bugs like this. It's almost like being back in the days of a message oriented client, and we know that's not the way to do it.

Therefore I'd like to fix this. The obvious way is to introduce documents in to the db for each id we see, and threading should then naturally work better.

The only issue I see with doing this is with mail delays. Once we do this we will sometimes receive a message that already has a dummy document. What happens currently with message-id collisions?

Therefore I would propose this:

* When doing a thread resolution and we have ids that we don't know, add a document to the db that is something like

  {id: thread: dummy: True content: "Messages missing" }

* When we get a message-id conflict check for dummy:True and replace the document if it is there.

How does this sound?

There could be an issue with synthesising too many threads and then ending up having to try and put a message in two threads? I see there is code for merging threads, would that handle this?

Thanks,

James
Re: [notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 19:02:21 +, James Westby wrote:
> I like the architecture of notmuch, and have just switched
> to using it as my primary client, so thanks.

You're quite welcome, James. Welcome to notmuch!

> Therefore I'd like to fix this. The obvious way is to
> introduce documents in to the db for each id we see, and
> threading should then naturally work better.

That sounds like a fine idea.

> The only issue I see with doing this is with mail delays.
> Once we do this we will sometimes receive a message that
> already has a dummy document. What happens currently with
> message-id collisions?

The current message-ID collision logic is pretty brain-dead. It just says "Oh, I've seen a file with this message before, so I'll skip this additional file".

But I'm just putting the finishing touches on a patch that instead does:

    Oh, and here's an additional filename for that message ID. Add that too, please.

Beyond that, all we would need to do as well is to also index the new content. I don't want to do useless re-indexing when files just get renamed. So maybe all we need to do is to save the filesize of the last-indexed file for a document and then when we encounter a file with the same message ID and a larger file size, then index it as well?

That would even take care of providing the opportunity to index additional mailing-list-added content for messages also sent directly via CC.

The file-size heuristic wouldn't be perfect for these other cases. I guess we save a list of sha-1 sums for indexed files or so, (assuming that's cheaper than just re-indexing---before the Xapian Defect 250 fix I'm sure it is, but after I'm not sure---we maybe should just always re-index---but I think I have seen the TermGenerator appear in profiles of indexing runs.)

> * When we get a message-id conflict check for dummy:True
> and replace the document if it is there.
>
> How does this sound?

That sounds fine. It's the same as what I propose above with "filesize:0" instead of "dummy:true".

> There could be an issue with synthesising too many threads
> and then ending up having to try and put a message in two
> threads? I see there is code for merging threads, would that
> handle this?

It should, yes.

The current logic is that a message can only appear in a single thread. So if a message has children or parents with distinct thread IDs then those threads are merged.

I can imagine some strange cross-posting scenario where one could argue that the merging shouldn't happen, but I'm not sure we want to try to respect that.

-Carl
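The file-size heuristic proposed in this message, together with the "filesize:0" placeholder for missing messages, could be sketched in C as below. The enum, the function name, and the convention that -1 means "no document yet" are assumptions chosen for the example and do not describe notmuch's actual implementation.

```c
#include <assert.h>
#include <sys/types.h>

/* Possible outcomes when a file for some message ID is encountered. */
typedef enum {
    INDEX_NEW_MESSAGE,      /* no document yet: index everything */
    INDEX_ADDITIONAL_FILE,  /* known message ID, larger file: index it too */
    RECORD_FILENAME_ONLY    /* known message ID, nothing new expected */
} index_action_t;

/* Hypothetical decision helper.
 *
 * stored_size: filesize saved with the existing document; 0 for a
 *              placeholder created for a missing message; -1 if there
 *              is no document for this message ID (assumed convention).
 * new_size:    size of the file just encountered. */
index_action_t
decide_index_action (off_t stored_size, off_t new_size)
{
    assert (new_size >= 0);

    if (stored_size < 0)
        return INDEX_NEW_MESSAGE;

    /* A placeholder has size 0, so any non-empty file replaces it. */
    if (new_size > stored_size)
        return INDEX_ADDITIONAL_FILE;

    /* Same or smaller size: treat it as a rename or plain copy. */
    return RECORD_FILENAME_ONLY;
}
```

As the message itself notes, size alone cannot detect a file of equal or smaller size with different content. Comparing checksums would be the stricter alternative.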
Re: [notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 11:41:18 -0800, Carl Worth wrote: > On Fri, 18 Dec 2009 19:02:21 +, James Westby > wrote: > > Therefore I'd like to fix this. The obvious way is to > > introduce documents in to the db for each id we see, and > > threading should then naturally work better. > > That sounds like a fine idea. Good, at least I'm not totally off the map. > > The only issue I see with doing this is with mail delays. > > Once we do this we will sometimes receive a message that > > already has a dummy document. What happens currently with > > message-id collisions? > > The current message-ID collision logic is pretty brain-dead. It just > says "Oh, I've seen a file with this message before, so I'll skip this > additional file". > > But I'm just putting the finishing touches on a patch that instead does: > > Oh, and here's an additional filename for that message ID. Add > that too, please. > > Beyond that, all we would need to do as well is to also index the new > content. I don't want to do useless re-indexing when files just get > renamed. So maybe all we need to do is to save the filesize of the > last-indexed file for a document and then when we encounter a file with > the same message ID and a larger file size, then index it as well? I would say different file size, but I imagine larger is the majority of interesting cases. > That would even take care of providing the opportunity to index > additional mailing-list-added content for messages also sent directly > via CC. > > The file-size heuristic wouldn't be perfect for these other cases. I > guess we save a list of sha-1 sums for indexed files or so, (assuming > that's cheaper than just re-indexing---before the Xapian Defect 250 fix > I'm sure it is, but after I'm not sure---we maybe should just always > re-index---but I think I have seen the TermGenerator appear in profiles > of indexing runs.) I'm not sure this is needed too much, but would obviously be correct. 
On Xapian 250, I have a very slow spinning disk, and it was hitting me hard, making processing my inbox far too slow. I built Xapian SVN with the patch from the bug and it is now lightning fast, so consider this another endorsement. I also tried the supplemental patch and it showed no further improvement for notmuch tag. > > * When we get a message-id conflict check for dummy:True > > and replace the document if it is there. > > > > How does this sound? > > That sounds fine. It's the same as what I propose above with > "filesize:0" instead of "dummy:true". That works. However, we would want the old content to go away in these cases, wouldn't we? Or do we not index whatever dummy text we add? Or do we not even put it in? Or not even show it at all? I was just thinking of having "Missing messages..." showing up as the start of the thread, but maybe it's not needed. > > There could be an issue with synthesising too many threads > > and then ending up having to try and put a message in two > > threads? I see there is code for merging threads, would that > > handle this? > > It should, yes. > > The current logic is that a message can only appear in a single > thread. So if a message has children or parents with distinct thread IDs > then those threads are merged. > > I can imagine some strange cross-posting scenario where one could argue > that the merging shouldn't happen, but I'm not sure we want to try to > respect that. Fair enough. So, to summarise, I should first look at storing filesizes, then the collision code to make it index further when the filesize grows, and then finally the code to add documents for missing messages? The only thing I am unclear on is how to handle existing databases? Do we have any concept of versioning? Or should I just assume that filesize: may not be in the document and act appropriately? Thanks, James
Re: [notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 19:53:13 +, James Westby wrote: > Or do we not index whatever dummy text we add? Or do we not > even put it in? Or not even show it at all? I was just thinking > of having "Missing messages..." showing up as the start of > the thread, but maybe it's not needed. Oh, I was assuming you wouldn't index any text. The UI can add "missing message" for a document with no filename, for example. > So, to summarise, I should first look at storing filesizes, then > the collision code to make it index further when the filesize grows, > and then finally the code to add documents for missing messages? Some of the code areas to be touched will be changing soon, (at least as far as when filenames appear and disappear). Hopefully I'll have something posted for that sooner rather than later to avoid having to redo too much work. > The only thing I am unclear on is how to handle existing databases? > Do we have any concept of versioning? Or should I just assume that > filesize: may not be in the document and act appropriately? My current, outstanding patch is going to be the first trigger for a "flag day" where we'll all need to rewrite our databases. We don't have any concept of versioning yet, but it would obviously be easy to have a new version document with an increasing integer. But even with my current patch I'm considering doing a graceful upgrade of the database in-place rather than making the user do something like a dump, delete, rebuild, restore. That would give a much better experience than "Your database is out-of-date, please rebuild it", so we'll see if I pursue that in the end. -Carl
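As a rough illustration of the "increasing integer" idea, here is a sketch in C of an in-place, step-by-step upgrade. Everything in it is assumed for the sake of the example: the db_t structure, the version numbers, and the two upgrade steps are hypothetical and do not describe what notmuch does or will do. A database without a version document is treated as version 0.

```c
#include <assert.h>

/* Hypothetical: the newest schema version this code understands. */
#define CURRENT_DB_VERSION 2

/* Hypothetical in-memory stand-in for a database and its version document. */
typedef struct {
    int version;            /* 0 = created before versioning existed */
    int has_all_filenames;  /* every filename stored per message ID  */
    int has_filesize;       /* filesize value stored per message     */
} db_t;

/* Each step upgrades exactly one version, so old databases can be
 * walked forward one step at a time. */
static void upgrade_0_to_1 (db_t *db) { db->has_all_filenames = 1; db->version = 1; }
static void upgrade_1_to_2 (db_t *db) { db->has_filesize = 1; db->version = 2; }

/* Upgrade in place.  Returns the number of steps applied, or -1 if the
 * version is invalid or newer than this code understands. */
static int
upgrade_database (db_t *db)
{
    int steps = 0;

    if (db->version < 0 || db->version > CURRENT_DB_VERSION)
        return -1;

    while (db->version < CURRENT_DB_VERSION) {
        switch (db->version) {
        case 0: upgrade_0_to_1 (db); break;
        case 1: upgrade_1_to_2 (db); break;
        default: return -1;
        }
        steps++;
    }
    return steps;
}
```

The point of the per-step functions is that a user never sees "please rebuild": an old database is simply brought forward when it is opened, and a database written by a newer notmuch is refused rather than silently misread.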
[notmuch] [PATCH] Store the size of the file for each message
When indexing a message, store the filesize along with it so that when we store all the filenames for a message-id we can know if any of them have different content cheaply. The value stored is defined to be the largest filesize of any of the files for that message. This changes the API for efficiency reasons. The size is often known to the caller, and so we save a second stat by asking them to provide it. If they don't know it they can pass -1 and the stat will be done for them. We store the filesize such that we can query a range. Thus it would be possible to query "filesize:0..100" if you somehow knew the raw message was less than 100 bytes. --- Here's the first part, storing the filesize. I'm using add_value so that we can make it sortable, is that valid for retrieving it as well? The only thing I'm not sure about is if it works. Is there a way to inspect a document to see the values that are stored? Doing a search isn't working, so I imagine I made a mistake. Thanks, James lib/database.cc | 17 + lib/message.cc| 25 + lib/notmuch-private.h |8 +++- lib/notmuch.h |5 + notmuch-new.c |2 +- 5 files changed, 55 insertions(+), 2 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index b6c4d07..0ec77cd 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -454,6 +454,17 @@ notmuch_database_create (const char *path) return notmuch; } +struct FilesizeValueRangeProcessor : public Xapian::ValueRangeProcessor { +FilesizeValueRangeProcessor() {} + +Xapian::valueno operator()(std::string &begin, std::string &) { +if (begin.substr(0, 9) != "filesize:") +return Xapian::BAD_VALUENO; +begin.erase(0, 9); +return NOTMUCH_VALUE_FILESIZE; +} +}; + notmuch_database_t * notmuch_database_open (const char *path, notmuch_database_mode_t mode) @@ -463,6 +474,7 @@ notmuch_database_open (const char *path, struct stat st; int err; unsigned int i; +FilesizeValueRangeProcessor filesize_proc; if (asprintf (&notmuch_path, "%s/%s", path, ".notmuch") == -1) { notmuch_path = NULL; @@ -508,6
+520,7 @@ notmuch_database_open (const char *path, notmuch->query_parser->set_stemmer (Xapian::Stem ("english")); notmuch->query_parser->set_stemming_strategy (Xapian::QueryParser::STEM_SOME); notmuch->query_parser->add_valuerangeprocessor (notmuch->value_range_processor); + notmuch->query_parser->add_valuerangeprocessor (&filesize_proc); for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) { prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i]; @@ -889,6 +902,7 @@ _notmuch_database_link_message (notmuch_database_t *notmuch, notmuch_status_t notmuch_database_add_message (notmuch_database_t *notmuch, const char *filename, + const off_t size, notmuch_message_t **message_ret) { notmuch_message_file_t *message_file; @@ -992,6 +1006,9 @@ notmuch_database_add_message (notmuch_database_t *notmuch, if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { _notmuch_message_set_filename (message, filename); _notmuch_message_add_term (message, "type", "mail"); + ret = _notmuch_message_set_filesize (message, filename, size); + if (ret) + goto DONE; } else { ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; goto DONE; diff --git a/lib/message.cc b/lib/message.cc index 49519f1..2bfc5ed 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -426,6 +426,31 @@ _notmuch_message_set_filename (notmuch_message_t *message, message->doc.set_data (s); } +notmuch_status_t +_notmuch_message_set_filesize (notmuch_message_t *message, + const char *filename, + const off_t size) +{ +struct stat st; +off_t realsize = size; +notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS; + +if (realsize < 0) { + if (stat (filename, &st)) { + ret = NOTMUCH_STATUS_FILE_ERROR; + goto DONE; + } else { + realsize = st.st_size; + } +} + +message->doc.add_value (NOTMUCH_VALUE_FILESIZE, +Xapian::sortable_serialise (realsize)); + + DONE: +return ret; +} + const char * notmuch_message_get_filename (notmuch_message_t *message) { diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index 116f63d..1ba3055 100644 --- 
a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -100,7 +100,8 @@ _internal_error (const char *format, ...) PRINTF_ATTRIBUTE (1, 2); typedef enum { NOTMUCH_VALUE_TIMESTAMP = 0, -NOTMUCH_VALUE_MESSAGE_ID +NOTMUCH_VALUE_MESSAGE_ID, +NOTMUCH_VALUE_FILESIZE } notmuch_value_t; /* Xapian (with flint backend) complains if we
Re: [notmuch] Missing messages breaking threads
On Fri, 18 Dec 2009 12:52:58 -0800, Carl Worth wrote: > On Fri, 18 Dec 2009 19:53:13 +, James Westby > wrote: > Oh, I was assuming you wouldn't index any text. The UI can add "missing > message" for a document with no filename, for example. Works for me. > > So, to summarise, I should first look at storing filesizes, then > > the collision code to make it index further when the filesize grows, > > and then finally the code to add documents for missing messages? > > Some of the code areas to be touched will be changing soon, (at least as > far as when filenames appear and disappear). Hopefully I'll have > something posted for that sooner rather than later to avoid having to > redo too much work. That would be great. I'm learning all the code anyway, so there's not a whole lot of knowledge being thrown away. I've just sent an initial cut at the first step. > > The only thing I am unclear on is how to handle existing databases? > > Do we have any concept of versioning? Or should I just assume that > > filesize: may not be in the document and act appropriately? > > My current, outstanding patch is going to be the first trigger for a > "flag day" where we'll all need to rewrite our databases. > > We don't have any concept of versioning yet, but it would obviously be > easy to have a new version document with an increasing integer. > > But even with my current patch I'm considering doing a graceful upgrade > of the database in-place rather than making the user do something like a > dump, delete, rebuild, restore. That would give a much better experience > than "Your database is out-of-date, please rebuild it", so we'll see if > I pursue that in the end. That sounds nice, I'd certainly prefer this sort of thing as it evolves. Thanks, James
Re: [notmuch] [PATCH] Store the size of the file for each message
On Fri, 18 Dec 2009 21:21:03 +, James Westby wrote: > Here's the first part, storing the filesize. I'm using > add_value so that we can make it sortable, is that valid > for retrieving it as well? Yes, a value makes sense here and should make the value easy to retrieve. > The only thing I'm not sure about is if it works. Is there > a way to inspect a document to see the values that are > stored? I usually use a little tool I wrote called xapian-dump. It currently exists only in the git history of notmuch. Look at commit: 22691064666c03c5e76bc787395bfe586929f4cc or so. > Doing a search isn't working, so I imagine I made a mistake. Let's see... (just reviewing here, not testing).. > +struct FilesizeValueRangeProcessor : public Xapian::ValueRangeProcessor { > +FilesizeValueRangeProcessor() {} > + > +Xapian::valueno operator()(std::string &begin, std::string &) { > +if (begin.substr(0, 9) != "filesize:") > +return Xapian::BAD_VALUENO; > +begin.erase(0, 9); > +return NOTMUCH_VALUE_FILESIZE; > +} > +}; If the file size is just an integer, then you shouldn't need a custom ValueRangeProcessor. One of the existing processors in Xapian should work fine. Having not ever written a custom processor, I can't say whether the one above is correct or not. -Carl
Re: [notmuch] [PATCH] Store the size of the file for each message
On Fri, 18 Dec 2009 14:29:21 -0800, Carl Worth wrote: > On Fri, 18 Dec 2009 21:21:03 +, James Westby > wrote: > Yes, a value makes sense here and should make the value easy to > retrieve. Excellent. > I usually use a little tool I wrote called xapian-dump. It currently > exists only in the git history of notmuch. Look at commit: > > 22691064666c03c5e76bc787395bfe586929f4cc > > or so. Thanks, I found delve, which at least showed that something was being stored. It's in the xapian-tools package, and delve -V2 prints out the filesize value for each document. It would be great if we could specify an alternative configuration file for testing so that I can set up a small maildir and test against that. > If the file size is just an integer, then you shouldn't need a custom > ValueRangeProcessor. One of the existing processors in Xapian should > work fine. Correct, I hadn't read the documentation closely enough. After fixing that and doing some testing I have this working now. Patch incoming. Thanks, James
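For readers unfamiliar with prefixed range terms such as "filesize:0..100", the following standalone C sketch shows the kind of parsing such a term implies: recognise the prefix, then read a numeric lower and upper bound separated by "..". It is not Xapian's implementation and uses none of its API; the function name parse_filesize_range is made up for this example, and details such as rejecting negative bounds are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse a term of the form "filesize:LOW..HIGH".
 * Returns 1 and fills *low / *high on success, 0 otherwise.
 * On failure the output values are unspecified. */
static int
parse_filesize_range (const char *term, long *low, long *high)
{
    static const char prefix[] = "filesize:";
    const size_t prefix_len = sizeof (prefix) - 1;
    const char *p;
    char *end;

    assert (term != NULL && low != NULL && high != NULL);

    /* Not our prefix: leave the term for some other handler. */
    if (strncmp (term, prefix, prefix_len) != 0)
        return 0;
    p = term + prefix_len;

    /* Lower bound, which must be followed by "..". */
    *low = strtol (p, &end, 10);
    if (end == p || strncmp (end, "..", 2) != 0)
        return 0;
    p = end + 2;

    /* Upper bound, which must run to the end of the term. */
    *high = strtol (p, &end, 10);
    if (end == p || *end != '\0')
        return 0;

    /* Assumption: sizes are non-negative and the range is ordered. */
    return *low >= 0 && *low <= *high;
}
```

A term without the prefix returns 0 so that another handler can claim it, which mirrors the role of returning BAD_VALUENO in the custom processor quoted earlier in the thread.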
[notmuch] [PATCH v2] Store the size of the file for each message
When indexing a message, store the filesize along with it so that when we store all the filenames for a message-id we can know if any of them have different content cheaply. The value stored is defined to be the largest filesize of any of the files for that message. This changes the API for efficiency reasons. The size is often known to the caller, and so we save a second stat by asking them to provide it. If they don't know it they can pass -1 and the stat will be done for them. We store the filesize such that we can query a range. Thus it would be possible to query "filesize:0..100" if you somehow knew the raw message was less than 100 bytes. --- With new, improved, working, filesize:.. search. lib/database.cc |7 +++ lib/message.cc| 25 + lib/notmuch-private.h |8 +++- lib/notmuch.h |5 + notmuch-new.c |2 +- 5 files changed, 45 insertions(+), 2 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index b6c4d07..d834d94 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -463,6 +463,8 @@ notmuch_database_open (const char *path, struct stat st; int err; unsigned int i; +Xapian::NumberValueRangeProcessor *filesize_proc = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_FILESIZE, +"filesize:", true); if (asprintf (&notmuch_path, "%s/%s", path, ".notmuch") == -1) { notmuch_path = NULL; @@ -508,6 +510,7 @@ notmuch_database_open (const char *path, notmuch->query_parser->set_stemmer (Xapian::Stem ("english")); notmuch->query_parser->set_stemming_strategy (Xapian::QueryParser::STEM_SOME); notmuch->query_parser->add_valuerangeprocessor (notmuch->value_range_processor); + notmuch->query_parser->add_valuerangeprocessor (filesize_proc); for (i = 0; i < ARRAY_SIZE (BOOLEAN_PREFIX_EXTERNAL); i++) { prefix_t *prefix = &BOOLEAN_PREFIX_EXTERNAL[i]; @@ -889,6 +892,7 @@ _notmuch_database_link_message (notmuch_database_t *notmuch, notmuch_status_t notmuch_database_add_message (notmuch_database_t *notmuch, const char *filename, + const off_t size, notmuch_message_t
**message_ret) { notmuch_message_file_t *message_file; @@ -992,6 +996,9 @@ notmuch_database_add_message (notmuch_database_t *notmuch, if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { _notmuch_message_set_filename (message, filename); _notmuch_message_add_term (message, "type", "mail"); + ret = _notmuch_message_set_filesize (message, filename, size); + if (ret) + goto DONE; } else { ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; goto DONE; diff --git a/lib/message.cc b/lib/message.cc index 49519f1..2bfc5ed 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -426,6 +426,31 @@ _notmuch_message_set_filename (notmuch_message_t *message, message->doc.set_data (s); } +notmuch_status_t +_notmuch_message_set_filesize (notmuch_message_t *message, + const char *filename, + const off_t size) +{ +struct stat st; +off_t realsize = size; +notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS; + +if (realsize < 0) { + if (stat (filename, &st)) { + ret = NOTMUCH_STATUS_FILE_ERROR; + goto DONE; + } else { + realsize = st.st_size; + } +} + +message->doc.add_value (NOTMUCH_VALUE_FILESIZE, +Xapian::sortable_serialise (realsize)); + + DONE: +return ret; +} + const char * notmuch_message_get_filename (notmuch_message_t *message) { diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index 116f63d..1ba3055 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -100,7 +100,8 @@ _internal_error (const char *format, ...) 
PRINTF_ATTRIBUTE (1, 2); typedef enum { NOTMUCH_VALUE_TIMESTAMP = 0, -NOTMUCH_VALUE_MESSAGE_ID +NOTMUCH_VALUE_MESSAGE_ID, +NOTMUCH_VALUE_FILESIZE } notmuch_value_t; /* Xapian (with flint backend) complains if we provide a term longer @@ -193,6 +194,11 @@ void _notmuch_message_set_filename (notmuch_message_t *message, const char *filename); +notmuch_status_t +_notmuch_message_set_filesize (notmuch_message_t *message, + const char *filename, + const off_t size); + void _notmuch_message_ensure_thread_id (notmuch_message_t *message); diff --git a/lib/notmuch.h b/lib/notmuch.h index 60834fb..5d0d224 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -32,6 +32,7 @@ NOTMUCH_BEGIN_DECLS #include +#include #ifndef FALSE #define FALSE 0 @@ -241,6 +242,9 @@ notmuch_database_get_timestamp (notmuch_database_t *database, * notmuch database will reference the filename
Re: [notmuch] [PATCH] Add an "--output=(json|text|)" command-line option to both notmuch-search and notmuch-show.
On Fri, 18 Dec 2009 09:33:43 -0800, Carl Worth wrote: > I think that selecting *what* to emit is orthogonal from selecting *how* > to format that output. I can see that point of view. > See some ideas in the TODO file, (where I proposed --for and --format > options for these). It's a detail, but could you choose two names that are not substrings of each other? Eventually we do want tab completion on the command line to work :). Also, "search --for tags foo" suggests to me searching for tags matching foo. What about using --output for that? One thing that is not completely clear to me at this point is what the difference is between notmuch search --for messages search-terms and notmuch show search-terms David
[notmuch] wish: more informative citations
Wouldn't it be nice if citations showed the first line or so of the text being cited? Stealing text from another thread, > In case of a citation following immediately new contents. When the citation > was collapsed: > > [1-line citation. Click/Enter to show.] > Lorem ipsum dolor sit amet, consectetur adipisicin Would be displayed as something like: > In case of a citation following [ Click/Enter to show 3 more lines ] Actually I'm not too sure about the format, but I thought I'd throw that out there. Happy hacking, David
Re: [notmuch] [PATCH] Store the size of the file for each message
On Sat, 19 Dec 2009 00:08:24 +, James Westby wrote: > Thanks, I found delve, which at least showed that something was > being stored. It's in the xapian-tools package, and > >delve -V2 > > prints out the filesize value for each document. Ah, right. I had forgotten about that. > It would be great if we could specify an alternative configuration > file for testing so that I can set up a small maildir and test > against that. You can, actually. Just set the NOTMUCH_CONFIG environment variable to your alternate configuration file. (And yes, we're missing any mention of this in our documentation.) > Correct, I hadn't read the documentation closely enough. After fixing > that and doing some testing I have this working now. Patch incoming. Cool! -Carl
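The NOTMUCH_CONFIG override amounts to the usual "environment variable first, default path second" lookup. The C sketch below shows that pattern in isolation; it is not taken from the notmuch source. The function name config_path is hypothetical, and the fallback of $HOME/.notmuch-config is an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write the path of the configuration file to use into buf.
 * Returns 0 on success, -1 if no path can be determined or if
 * buf is too small to hold it. */
static int
config_path (char *buf, size_t len)
{
    const char *override = getenv ("NOTMUCH_CONFIG");
    const char *home;
    int n;

    assert (buf != NULL && len > 0);

    if (override != NULL && strlen (override) > 0) {
        /* An explicit override wins, e.g. for a small test maildir. */
        n = snprintf (buf, len, "%s", override);
    } else {
        /* Assumed default location for this sketch. */
        home = getenv ("HOME");
        if (home == NULL)
            return -1;
        n = snprintf (buf, len, "%s/.notmuch-config", home);
    }

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

For testing, this means a separate configuration file pointing at a small maildir can be selected per invocation by setting the variable in the environment of that one command, leaving the regular setup untouched.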
Re: [notmuch] wish: more informative citations
On Fri, 18 Dec 2009 20:47:20 -0400, David Bremner wrote: > Would be displayed as something like: > > > In case of a citation following [ Click/Enter to show 3 more lines ] > > Actually I'm not too sure about the format, but I thought I'd throw > that out there. That's a fine idea. Along with this would be getting rid of the stupidity of displaying [1 line citation] rather than just displaying the citation itself! And I really want my keybinding for displaying all the citations in the current message. And code to recognize top-posted copies as citations and hiding that. And, and... -Carl ...and more time to do all this stuff. I've got an ever-growing TODO list and a backlog of patches to be reviewed that goes back several weeks now. I'm hoping that I'll be able to sneak some time over the upcoming holidays...
[notmuch] [PATCH] Reindex larger files that duplicate ids we have
When we see a message where we already have the file id stored, check if the size is larger. If it is then re-index and set the file size and name to be the new message. --- Here's the (quite simple) patch to implement indexing the largest copy of each mail that we have. Does the re-indexing replace the old terms? In the case where you had a collision with different text this could make a search return mails that don't contain that text. I don't think it's a big issue though, even if that is the case. Thanks, James lib/database.cc |4 +++- lib/index.cc | 27 +++ lib/message.cc| 31 ++- lib/notmuch-private.h | 13 + lib/notmuch.h |5 +++-- 5 files changed, 72 insertions(+), 8 deletions(-) diff --git a/lib/database.cc b/lib/database.cc index d834d94..64f29b9 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -1000,7 +1000,9 @@ notmuch_database_add_message (notmuch_database_t *notmuch, if (ret) goto DONE; } else { - ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; + ret = _notmuch_message_possibly_reindex (message, filename, size); + if (!ret) + ret = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; goto DONE; } diff --git a/lib/index.cc b/lib/index.cc index 125fa6c..14c3268 100644 --- a/lib/index.cc +++ b/lib/index.cc @@ -312,3 +312,30 @@ _notmuch_message_index_file (notmuch_message_t *message, return ret; } + +notmuch_status_t +_notmuch_message_possibly_reindex (notmuch_message_t *message, +const char *filename, +const off_t size) +{ +off_t realsize = size; +off_t stored_size; +notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS; + +ret = _notmuch_message_size_on_disk (message, filename, &realsize); +if (ret) +goto DONE; +stored_size = _notmuch_message_get_filesize (message); +if (realsize > stored_size) { + ret = _notmuch_message_index_file (message, filename); + if (ret) + goto DONE; + ret = _notmuch_message_set_filesize (message, filename, realsize); + _notmuch_message_set_filename (message, filename); + _notmuch_message_sync (message); +} + + DONE: +return ret; + +} diff --git 
a/lib/message.cc b/lib/message.cc index 2bfc5ed..cc32741 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -427,23 +427,38 @@ _notmuch_message_set_filename (notmuch_message_t *message, } notmuch_status_t -_notmuch_message_set_filesize (notmuch_message_t *message, +_notmuch_message_size_on_disk (notmuch_message_t *message, const char *filename, - const off_t size) + off_t *size) { struct stat st; -off_t realsize = size; notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS; -if (realsize < 0) { +if (*size < 0) { if (stat (filename, &st)) { ret = NOTMUCH_STATUS_FILE_ERROR; goto DONE; } else { - realsize = st.st_size; + *size = st.st_size; } } + DONE: +return ret; +} + +notmuch_status_t +_notmuch_message_set_filesize (notmuch_message_t *message, + const char *filename, + const off_t size) +{ +off_t realsize = size; +notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS; + +ret = _notmuch_message_size_on_disk (message, filename, &realsize); +if (ret) +goto DONE; + message->doc.add_value (NOTMUCH_VALUE_FILESIZE, Xapian::sortable_serialise (realsize)); @@ -451,6 +466,12 @@ _notmuch_message_set_filesize (notmuch_message_t *message, return ret; } +off_t +_notmuch_message_get_filesize (notmuch_message_t *message) +{ +return Xapian::sortable_unserialise (message->doc.get_value (NOTMUCH_VALUE_FILESIZE)); +} + const char * notmuch_message_get_filename (notmuch_message_t *message) { diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index 1ba3055..cf65fd9 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -199,6 +199,14 @@ _notmuch_message_set_filesize (notmuch_message_t *message, const char *filename, const off_t size); +off_t +_notmuch_message_get_filesize (notmuch_message_t *message); + +notmuch_status_t +_notmuch_message_size_on_disk (notmuch_message_t *message, + const char *filename, + off_t *size); + void _notmuch_message_ensure_thread_id (notmuch_message_t *message); @@ -218,6 +226,11 @@ notmuch_status_t _notmuch_message_index_file (notmuch_message_t *message, 
const char *filename); +notmuch_status_t +_notmuch_message_possibly_reindex (notmuch_message_t *message, +c
Re: [notmuch] [PATCH] Store the size of the file for each message
On Fri, 18 Dec 2009 16:57:16 -0800, Carl Worth wrote: > You can, actually. Just set the NOTMUCH_CONFIG environment variable to > your alternate configuration file. (And yes, we're missing any mention > of this in our documentation.) Sweet. Where would be the best place to document it? Just in the man page? Thanks, James
[notmuch] [PATCH] Fix-up some outdated comments.
--- lib/message.cc |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index cc32741..7129d59 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -391,7 +391,7 @@ notmuch_message_get_replies (notmuch_message_t *message) * multiple filenames for email messages with identical message IDs. * * This change will not be reflected in the database until the next - * call to _notmuch_message_set_sync. */ + * call to _notmuch_message_sync. */ void _notmuch_message_set_filename (notmuch_message_t *message, const char *filename) @@ -622,7 +622,7 @@ _notmuch_message_close (notmuch_message_t *message) * names to prefix values. * * This change will not be reflected in the database until the next - * call to _notmuch_message_set_sync. */ + * call to _notmuch_message_sync. */ notmuch_private_status_t _notmuch_message_add_term (notmuch_message_t *message, const char *prefix_name, @@ -679,7 +679,7 @@ _notmuch_message_gen_terms (notmuch_message_t *message, * names to prefix values. * * This change will not be reflected in the database until the next - * call to _notmuch_message_set_sync. */ + * call to _notmuch_message_sync. */ notmuch_private_status_t _notmuch_message_remove_term (notmuch_message_t *message, const char *prefix_name, -- 1.6.3.3
[notmuch] keeping a copy of sent mail locally
Hello, Many thanks to Marten and Carl for the advice on using scripts for assigning tags automatically. It works like a charm. The next hurdle seems to be dealing with sent mail. I would like each message that I send to be saved in my local mail folder and treated the same as all my other messages -- so it will get indexed and put in the right thread, etc. (For example, right now the thread that started with my question about automatic tags only has the two replies in it, and its subject is "Re: [notmuch] automatically...") Bcc-ing myself on every sent message is suboptimal for a number of reasons: (1) gmail throws away the bcc-ed copy since it has the same message id as the one sitting in the gmail sent mail, and so the bcc-ed copy never makes it back to my local mail; (2) even if this were working, it would be an unnecessary waste of bandwidth. After looking around for a little bit, the only other option I could see was to use the FCC mail header. Unfortunately this wants a filename to save to (rather than just a directory); so I have to manually add the FCC: header, put in a filename that doesn't yet exist, and type 'y' to confirm that I want the file to be created. It would be great if I could just set the directory where sent mail should go as a global option, and then everything would happen automatically without any more effort from me. I realise that this is more of an emacs question than a notmuch question, but I'm hoping that somebody on this list has an elegant solution to this. Best, Alex -- Alex Ghitza -- Lecturer in Mathematics -- The University of Melbourne -- Australia -- http://www.ms.unimelb.edu.au/~aghitza/
Re: [notmuch] [PATCH] Store the size of the file for each message
On Sat, 19 Dec 2009 01:35:46 +, James Westby wrote: > On Fri, 18 Dec 2009 16:57:16 -0800, Carl Worth wrote: > > You can, actually. Just set the NOTMUCH_CONFIG environment variable to > > your alternate configuration file. (And yes, we're missing any mention > > of this in our documentation.) > > Sweet. Where would be the best place to document it? Just in the > man page? Currently we're replicating all of our documentation both in the man page and in the output from "notmuch help". It's annoying to have to add everything in two places, but I don't have a good idea for making that sharable yet. Anyone have a solution here? -Carl