[notmuch] Idea for storing tags

2010-01-11 Thread Scott Morrison

Thought you would be interested in my experiences and thoughts from actually 
doing this kind of stuff.  

With my software MailTags (www.indev.ca/MailTags.html), I have looked at all 
these options and decided to go with storing tags in headers (as JSON-formatted 
data in the X-MailTags header).
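
As a rough illustration of the tags-in-headers approach, here is a minimal 
Python sketch that writes and reads a JSON payload in an X-MailTags header. 
The payload layout (a single "tags" list) and the helper names set_tags and 
get_tags are assumptions made for this example; the actual MailTags schema is 
not described in this post.

```python
import json
from email import message_from_string
from email.message import Message

HEADER = "X-MailTags"  # header name from the post; the payload layout is assumed


def set_tags(msg: Message, tags: list[str]) -> None:
    """Replace any existing tag header with a JSON-encoded tag list."""
    del msg[HEADER]  # deleting a header that is absent is a no-op
    msg[HEADER] = json.dumps({"tags": sorted(set(tags))})


def get_tags(msg: Message) -> list[str]:
    """Return the tags stored in the header, or an empty list if absent."""
    raw = msg.get(HEADER)
    if raw is None:
        return []
    return json.loads(raw).get("tags", [])


raw_mail = "From: a@example.org\nSubject: hello\nMessage-ID: <1@example.org>\n\nbody\n"
msg = message_from_string(raw_mail)
set_tags(msg, ["work", "todo"])
```

The tags travel with the message itself, which is the property argued for 
here, but every tag change means the stored message has to be rewritten.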

I have thought seriously about using pseudo emails stored in a specially named 
directory but feel there are a couple of issues with this.
1.  synchronization of tag data with emails -- if they are in a 
subfolder then it presents the issue of maintaining this subfolder when 
managing emails (moving, deleting, duplicating etc.) and any clients unaware of 
the .tag folder are likely to break the tagdata/message association.  One way 
of handling this is to have a global .tag folder.

2. what happens if that message is archived or moved to an exclusively 
local cache -- e.g. Mail.app on OS X can easily move IMAP messages to a folder 
resident on the computer?
3. what happens with duplicates of emails -- I would assume that the 
message id would be the key to match the tag data to the message.  In this 
system a duplicate of a message could not have a different set of tags from the 
original (not that this would necessarily be desirable.)

As I mentioned, I went with tags in headers -- though this has its own 
issues.  Your mention of potential leakage (aka inadvertent disclosure of tag 
data) is real -- but only if the client used to bounce/forward is not the one 
used to tag the message (one would assume that if a client can tag, it can know 
to exclude the tags in a bounce.)   Mail.app -- which I am plugging into -- does 
not forward headers -- though it will include all headers in a bounce -- but 
chances are you aren't tagging messages you are bouncing. :)
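
To make the "a tagging client can know to exclude the tags" point concrete, 
here is a small Python sketch of stripping the tag header from a copy of a 
message before it is bounced or forwarded. The function name and the 
PRIVATE_HEADERS tuple are hypothetical; this is not how Mail.app or MailTags 
is known to do it.

```python
import copy
from email import message_from_string
from email.message import Message

# Headers that should never leave the local machine (assumed list).
PRIVATE_HEADERS = ("X-MailTags",)


def strip_private_headers(msg: Message) -> Message:
    """Return a copy of msg without tag headers; the original is left intact."""
    clean = copy.deepcopy(msg)
    for name in PRIVATE_HEADERS:
        del clean[name]  # no error if the header is not present
    return clean


tagged = message_from_string(
    'Subject: report\nX-MailTags: {"tags": ["from-boss"]}\n\nbody\n'
)
outgoing = strip_private_headers(tagged)
```

This only helps when the bouncing client is tag-aware, which is exactly the 
limitation noted above.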

The performance issue is very real -- because it means that somehow 
messages have to be rewritten to the IMAP server -- IMAP doesn't have a 
mechanism AFAIK for updates.  Additionally, IMAP doesn't have a mechanism for 
simply replacing one message's data with another -- a new message must be 
written and the old message must be deleted, so the message's IMAP UID will 
change, and the client will have to deal with this, especially if it is caching 
the messages.
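
The append-then-delete sequence described above might look roughly like this 
in Python, using the standard imaplib module. This is a sketch under stated 
assumptions: replace_message is a hypothetical helper, conn is assumed to be 
an authenticated imaplib.IMAP4 (or compatible) object with the mailbox already 
selected read-write, and error handling is minimal.

```python
import imaplib
import time
from email import message_from_bytes


def replace_message(conn, mailbox: str, uid: str, header: str, value: str) -> None:
    """Change one header by appending a modified copy and deleting the old one."""
    typ, data = conn.uid("FETCH", uid, "(RFC822)")
    if typ != "OK" or not data or data[0] is None:
        raise RuntimeError(f"could not fetch UID {uid}")
    msg = message_from_bytes(data[0][1])

    del msg[header]
    msg[header] = value

    # There is no in-place update, so the modified copy is stored as a new
    # message; the server assigns it a new UID.
    stamp = imaplib.Time2Internaldate(time.time())
    typ, _ = conn.append(mailbox, "", stamp, msg.as_bytes())
    if typ != "OK":
        raise RuntimeError("APPEND failed; old message left untouched")

    # Only after the append succeeded is the old copy flagged and expunged.
    # Note that EXPUNGE removes every message flagged \Deleted in the mailbox.
    conn.uid("STORE", uid, "+FLAGS", r"(\Deleted)")
    conn.expunge()
```

As written, the sketch drops the original flags and internal date, and any 
client caching by UID will see one message vanish and a different one appear.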

Also Gmail IMAP is an issue -- Gmail IMAP is not IMAP -- it simply 
doesn't work like a true IMAP server -- writes to folders in Gmail IMAP are 
translated to database updates where it is attributing a single record of the 
message with the folder it was "written" to.   Changing headers on a Gmail IMAP 
message simply will not work because it will reject the message as an update 
of the single record (and not actually write the new data).

Still, tags in headers meant that I didn't have to worry about making sure that 
the .tags folder is maintained appropriately (throughout moves and deletions), 
and that the data is stored much closer to the message for data recovery if it 
is ever needed and for archiving tags. In any case, this is what I have 
working -- though I am open to considering new approaches.


also see my post to the mailtags-list from a few years back

On 2010-01-11, at 5:19 PM, martin f krafft wrote:

> Folks, over in #notmuch, we just floated an idea that I'd like to
> get out to you. We've been debating storing tags for messages.
> Therefore I am cross-posting. Please forgive me.
> So far, there are two approaches:
> 1. External database, which has the downside of not being
>  synchronisable with standard IMAP, like the rest of your mail
>  (assuming you use IMAP). Also, it's possible for mailstore and
>  database to get out of sync.
> 2. In-headers, which has the downside of leaking (e.g. when
>  bouncing), and incurs the risks associated with message rewrites
>  (which I think is pretty much ignorable, but it's still there).
>  Also, there's a performance issue, but in the context of an
>  indexer like notmuch, this is negligible.
>  The leakage is real, though and I think it makes in-headers
>  unusable. After all, I don't ever want anyone else to know that
>  I tag e-mails from my boss as "from-idiots", and I forward and
>  bounce mail on a regular basis. I could tell my MTA to remove
>  those headers, but I might forget to do that on a new system.
> We also previously determined that IMAP keywords are pretty much
> useless as they are stored per mailbox, not per message, not
> standardised, and limited in their length anyway [0]. This also
> means that we don't really need to investigate sensibly storing tags
> in Maildir (e.g. with xattrs), because IMAP cannot transport them.
> 0. http://lists.madduck.net/pipermail/mailtags/2007-August/msg00016.html
> Seriously, who implemented IMAPv4rev1 and what sort of crack were
> they smoking??
> I remember there was some KDE groupware contacts manager that used
> IMAP to synchronise 

[notmuch] Idea for storing tags

2010-01-11 Thread Scott Robinson
I wrote a script to store and sync my tags.

  * One filename per message-ID.
  * Line-feed separated tags in each file.

Then the whole structure is controlled via git. Conflict-resolution and sync
comes for free.
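
The script itself is not shown in this message. A minimal Python sketch of the 
layout described above (one file per message-ID, one tag per line) could look 
like the following; the function names and the percent-encoding of message-IDs 
into filenames are assumptions for illustration.

```python
from pathlib import Path
from urllib.parse import quote


def tag_file(root: Path, message_id: str) -> Path:
    """Map a message-ID to a filename; percent-encoding is an assumed scheme."""
    # safe="" also encodes "/", which may appear in a message-ID.
    return root / quote(message_id, safe="")


def write_tags(root: Path, message_id: str, tags: set[str]) -> None:
    """Store the tags for one message, one per line."""
    root.mkdir(parents=True, exist_ok=True)
    # Sorted output keeps diffs and merges small.
    tag_file(root, message_id).write_text("".join(t + "\n" for t in sorted(tags)))


def read_tags(root: Path, message_id: str) -> set[str]:
    """Return the stored tags, or an empty set for an unknown message."""
    path = tag_file(root, message_id)
    if not path.exists():
        return set()
    return set(path.read_text().splitlines())
```

With one tag per line, two machines adding different tags to the same message 
usually produce changes that git can merge line by line.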

It isn't clear what use-case the earlier e-mail is aiming to satisfy. This is
how I solved my tag sync issues, though.

[notmuch] Xapian::DatabaseError on notmuch new upgrade

2010-01-11 Thread Jed Brown
I rebuilt notmuch and got a Xapian exception while notmuch new was
upgrading my database.  This seemed like a decent time to try the latest
Xapian, so I built a copy and the crash remained (and maintainer-mode
still optimizes so I built Xapian again so I could get decent
debugging).  Here is a trace from

  XAPIAN_PREFER_CHERT=1 notmuch new

I got the same error without setting XAPIAN_PREFER_CHERT.  It's not
clear to me whether this is a usage issue or Xapian problem.  The core
is 245 MB, I can provide it if someone wants to try debugging.


(gdb) bt
#0  0x7fa687d9e035 in raise () from /lib/libc.so.6
#1  0x7fa687d9f460 in abort () from /lib/libc.so.6
#2  0x7fa688622925 in __gnu_cxx::__verbose_terminate_handler() () from 
#3  0x7fa688620d56 in __cxxabiv1::__terminate(void (*)()) () from 
#4  0x7fa688620d83 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x7fa688620e7e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x7fa688a221bc in FlintTable::read_block (this=0x16395c8, n=2569, 
p=0xe3fd8e0 "") at backends/flint/flint_table.cc:243
#7  0x7fa688a22bda in FlintTable::block_to_cursor (this=0x16395c8, 
C_=0xe4074e0, j=0, n=2569) at backends/flint/flint_table.cc:393
#8  0x7fa688a299fc in FlintTable::next_default (this=0x16395c8, 
C_=0xe4074e0, j=1) at backends/flint/flint_table.cc:2219
#9  0x7fa688a29983 in FlintTable::next_default (this=0x16395c8, 
C_=0xe4074e0, j=0) at backends/flint/flint_table.cc:2214
#10 0x7fa688a022c5 in FlintTable::next (this=0x16395c8, C_=0xe4074e0, j=0) 
at backends/flint/flint_table.h:692
#11 0x7fa688a020d5 in FlintCursor::read_tag (this=0xe43ece0, 
keep_compressed=false) at backends/flint/flint_cursor.cc:264
#12 0x7fa688a1a6d8 in FlintPostListTable::get_chunk (this=0x16395c8, 
tname=..., did=82407, adding=false, from=0x7fff849be358, to=0x7fff849be350) at 
#13 0x7fa688a1b5e3 in FlintPostListTable::merge_changes (this=0x16395c8, 
mod_plists=..., doclens=..., freq_deltas=...) at 
#14 0x7fa688a08d89 in FlintWritableDatabase::flush_postlist_changes 
(this=0x1639590) at backends/flint/flint_database.cc:1053
#15 0x7fa688a0bb5f in FlintWritableDatabase::replace_document 
(this=0x1639590, did=45341, document=...) at 
#16 0x7fa688975b7d in Xapian::WritableDatabase::replace_document 
(this=0x16393a0, did=45341, document=...) at api/omdatabase.cc:817
#17 0x00416c6f in _notmuch_message_sync (message=0xe327130) at 
#18 0x00410ca5 in notmuch_database_upgrade (notmuch=0x1639460, 
progress_notify=0x40a890 , closure=0x7fff849bef20) at 
#19 0x0040ad4a in notmuch_new_command (ctx=0x162b120, argc=0, 
argv=0x7fff849bf0f8) at notmuch-new.c:746
#20 0x0040862e in main (argc=2, argv=0x7fff849bf0e8) at notmuch.c:449
(gdb) f 17
#17 0x00416c6f in _notmuch_message_sync (message=0xe327130) at 
609 db->replace_document (message->doc_id, message->doc);
Current language:  auto
The current source language is "auto; currently c++".
(gdb) p *message
$1 = {notmuch = 0x1639460, doc_id = 45341, frozen = 0, message_id = 0x0, 
thread_id = 0x0, in_reply_to = 0x0, filename = 0x0, message_file = 0x0, replies 
= 0xbaea960, flags = 0, doc = {internal = {dest = 0xe3816c0}}}

[notmuch] Coming to LCA? Come to the notmuch BOF!

2010-01-11 Thread Carl Worth
For anyone who is planning to be in Wellington next week, I've just
scheduled a BOF for notmuch. The current plan is for Thursday in the
slot just before lunch.

I'm flexible to change that if anyone has any scheduling concerns.

See the LCA wiki page for the BOF for any changes:


(And feel free to edit that page if you'd like to indicate you'll be
there or if you have specific things you'd like to discuss).



[notmuch] Bug with commit 2e96464f9705be4ec772280cad71a6c9d5831e6f

2010-01-11 Thread ra...@free.fr

I just updated notmuch and now notmuch new cannot update my mail anymore... It 
tells me that 700 files were found, but also that there is no new mail.

I did a git bisect, which tells me the first bad commit is commit 

I did not try to use the new xapian database or to update xapian; maybe this is 
the problem.

I tested with several tools to get mail in the maildir format, including mb2md 
and getmail, and I always get the problem.

I will try to investigate a bit more.