Tag timestamps and synchronization

2011-01-25 Thread Michal Sojka
On Mon, 24 Jan 2011, dm-list-email-notmuch at scs.stanford.edu wrote:
> One of the features I would like to see from notmuch is an easier
> ability to synchronize tags across machines.  At the very least, I
> would need either incremental dump and restore, or some way to
> communicate arbitrary tags to a local imap server that shares
> notmuch's maildir (much as notmuch currently syncs the standard tags),
> so that I synchronize two maildirs with a tool like offlineimap.

[...]

> In the case of dovecot, the arbitrary tag format is very simple.  Each
> maildir has a file called dovecot-keywords mapping numbers 0, 1,
> ... to keywords.  Then mail file names contain lower-case letters for
> the flags they are marked with--0 => a, 1 => b, etc.--allowing up to
> 26 arbitrary tags for each maildir.  One could probably sync to
> dovecot's maildir format relatively easily in a script given
> incremental dump and restore of tags.  Or possibly notmuch could
> natively support dovecot as one of multiple back-end tag storage
> schemes.

Hi David,

here is my idea of solving the problem of synchronizing tags and all
message metadata. The problem, it seems, is that every program uses a
different format for message metadata. Maybe, it would be useful to
define a simple metadata format that could be used by multiple programs
(at least by notmuch, dovecot and maybe mutt) and base the
synchronization on this format. Currently, I'm thinking about a separate
file with the same base name as the message storing message metadata in
the same format as message headers so it could look like:

tag: inbox
tag: notmuch
timestamp: 2011-01-25 10:48:00 GMT
spam: no
...

Then, any program could do whatever it wants with the metadata, e.g.
index them in a database etc.

In the ideal it would work like this: Dovecot would store the metadata
in a file like described above. IMAP protocol would be extended to be
able to send such metadata corresponding to a particular UID.
offlineimap would be able to retrieve (and synchronize) the metadata
files with the IMAP server and notmuch would index the metadata
similarly as it index messages and would modify them when it change
tags.

What do you (and others) think? Is this too wild? Too longterm?

Cheers
Michal



Tag timestamps and synchronization

2011-01-25 Thread Tim Stoakes
dm-list-email-notmuch at scs.stanford.edu(dm-list-email-notmuch at 
scs.stanford.edu)@240111-11:10:
> One of the features I would like to see from notmuch is an easier
> ability to synchronize tags across machines.  At the very least, I
> would need either incremental dump and restore, or some way to
> communicate arbitrary tags to a local imap server that shares
> notmuch's maildir (much as notmuch currently syncs the standard tags),
> so that I synchronize two maildirs with a tool like offlineimap.

David,

I do something like this by using some shell scripts with formail, to
'store' notmuch tags into the X-Label headers of the individual mails.
Offlineimap then syncs these headers. If I need the tags to become
notmuch-ified on the target, I just scan all the mail's X-Label headers.

(Actually it's better than this, since I use maildrop to set notmuch
tags with notmuch-deliver, *and* set X-Label headers to the same thing,
at mail delivery time. Then I use keybindings and shell scripts in mutt
such that whenever I retag a message, it is pushed to both notmuch and
X-Label.)

I'm happy to share this hack glue if it would help.

This is not great for a few reasons - there are a ton of moving parts,
and some double-work. If notmuch could index X-Label headers (a coming
feature I hear) then this would be much cleaner.

This is just one way of doing it, that works for me...

Tim

> As Carl pointed out to me in private email, there has been some
> previous discussion in the following thread:
> 
>   notmuch show id:87hbfnmiux.fsf at yoom.home.cworth.org
> 
> Based on that thread, there seems to be some desire for notmuch to
> keep track of a per-message timestamp when the flags were last
> updated.  This would allow much easier expiration for people who want
> the deleted tag.  It would also allow incremental dump and restore of
> tags, which is exactly what I need to sync tags across servers with
> reasonable amounts of bandwidth.
> 
> Metadata timestamps are one of those things that probably have a lot
> of different applications, so since Carl is considering a new database
> format for the next release anyway, perhaps it also makes sense to add
> a metadata change time for each messages.
> 
> The timestamp would be included in "dump" output, and you could
> request a dump of changes since a particular time.  On restore, you
> might have several options:
> 
>   - overwrite: always set the new tags and timestamp in the database
> to the value in the restore data.
> 
>   - update: always set the tags, but update the to the current time.
> 
>   - conditional T: update only if the message metadata has not been
> updated since time T.
> 
> To sync flags, then you just need to keep track of the last time you
> synced with a particular server--call this time T.  Do a dump since
> time T, upload to server, do a conditional restore for time T on
> server.  Finally do a partial dump from time T on the server and an
> overwrite import on the client.  (This policy makes changes on the
> server always override conflicting ones on the client--perhaps people
> want other policies, like union of the tags, etc.)
> 
> 
> Second, there seems to be some desire in that thread to sync with IMAP
> flags.  This would be particularly great, but the easies way to do it
> is probably *not* to try to implement IMAP, but rather to use an
> existing IMAP server and just modify the maildir so that the IMAP
> server will pick up the flags.
> 
> In the case of dovecot, the arbitrary tag format is very simple.  Each
> maildir has a file called dovecot-keywords mapping numbers 0, 1,
> ... to keywords.  Then mail file names contain lower-case letters for
> the flags they are marked with--0 => a, 1 => b, etc.--allowing up to
> 26 arbitrary tags for each maildir.  One could probably sync to
> dovecot's maildir format relatively easily in a script given
> incremental dump and restore of tags.  Or possibly notmuch could
> natively support dovecot as one of multiple back-end tag storage
> schemes.
> 
> Having a static tag mapping in the .notmuch-config file would be much
> better than hard-coding flag2tag.  However, I'm not sure it's
> sufficient.  The reason is that if you ever completely delete a tag
> (e.g., you have "todo", and "meeting" tags and periodically have no
> messages in either categories in a given mail folder), then an IMAP
> server like dovecot might end up re-allocating the letters
> corresponding to those tags in a different order.  Also, at least for
> dovecot, the flag mappings are per-folder, which you kind of want
> since you are limited to 26 non-standard tags, so global values might
> not work.
> 
> I'm curious to hear people's thoughts/reactions?
> 
> David

-- 
Tim Stoakes


Re: Tag timestamps and synchronization

2011-01-25 Thread Michal Sojka
On Mon, 24 Jan 2011, dm-list-email-notm...@scs.stanford.edu wrote:
 One of the features I would like to see from notmuch is an easier
 ability to synchronize tags across machines.  At the very least, I
 would need either incremental dump and restore, or some way to
 communicate arbitrary tags to a local imap server that shares
 notmuch's maildir (much as notmuch currently syncs the standard tags),
 so that I synchronize two maildirs with a tool like offlineimap.

[...]

 In the case of dovecot, the arbitrary tag format is very simple.  Each
 maildir has a file called dovecot-keywords mapping numbers 0, 1,
 ... to keywords.  Then mail file names contain lower-case letters for
 the flags they are marked with--0 = a, 1 = b, etc.--allowing up to
 26 arbitrary tags for each maildir.  One could probably sync to
 dovecot's maildir format relatively easily in a script given
 incremental dump and restore of tags.  Or possibly notmuch could
 natively support dovecot as one of multiple back-end tag storage
 schemes.

Hi David,

here is my idea of solving the problem of synchronizing tags and all
message metadata. The problem, it seems, is that every program uses a
different format for message metadata. Maybe, it would be useful to
define a simple metadata format that could be used by multiple programs
(at least by notmuch, dovecot and maybe mutt) and base the
synchronization on this format. Currently, I'm thinking about a separate
file with the same base name as the message storing message metadata in
the same format as message headers so it could look like:

tag: inbox
tag: notmuch
timestamp: 2011-01-25 10:48:00 GMT
spam: no
...

Then, any program could do whatever it wants with the metadata, e.g.
index them in a database etc.

In the ideal it would work like this: Dovecot would store the metadata
in a file like described above. IMAP protocol would be extended to be
able to send such metadata corresponding to a particular UID.
offlineimap would be able to retrieve (and synchronize) the metadata
files with the IMAP server and notmuch would index the metadata
similarly as it index messages and would modify them when it change
tags.

What do you (and others) think? Is this too wild? Too longterm?

Cheers
Michal
 
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Tag timestamps and synchronization

2011-01-24 Thread dm-list-email-notm...@scs.stanford.edu
At Tue, 25 Jan 2011 10:08:12 +1030,
Tim Stoakes wrote:
> 
> I do something like this by using some shell scripts with formail, to
> 'store' notmuch tags into the X-Label headers of the individual mails.
> Offlineimap then syncs these headers. If I need the tags to become
> notmuch-ified on the target, I just scan all the mail's X-Label headers.

How well does offlineimap work when you modify the contents of
messages?  This doesn't change the message UIDs, does it?  Are you
syncing between two imapd instances, or one imapd and a maildir?  (I
currently run a local imap server as well, because it seems to be a
lot faster.)

How does the imap server even detect that the message contents has
been modified?  Does it have to stat 300,000 files every time you
check your email?

In my setup, I regularly check email from three or four different
machines, so the syncing is not just for backup purposes.  When I
switch between computers all my label changes need to be visible.

> I'm happy to share this hack glue if it would help.

Yes, I'd be glad to take a look at your scripts.

Thanks,
David


Tag timestamps and synchronization

2011-01-24 Thread dm-list-email-notm...@scs.stanford.edu
One of the features I would like to see from notmuch is an easier
ability to synchronize tags across machines.  At the very least, I
would need either incremental dump and restore, or some way to
communicate arbitrary tags to a local imap server that shares
notmuch's maildir (much as notmuch currently syncs the standard tags),
so that I synchronize two maildirs with a tool like offlineimap.

As Carl pointed out to me in private email, there has been some
previous discussion in the following thread:

notmuch show id:87hbfnmiux.fsf at yoom.home.cworth.org

Based on that thread, there seems to be some desire for notmuch to
keep track of a per-message timestamp when the flags were last
updated.  This would allow much easier expiration for people who want
the deleted tag.  It would also allow incremental dump and restore of
tags, which is exactly what I need to sync tags across servers with
reasonable amounts of bandwidth.

Metadata timestamps are one of those things that probably have a lot
of different applications, so since Carl is considering a new database
format for the next release anyway, perhaps it also makes sense to add
a metadata change time for each messages.

The timestamp would be included in "dump" output, and you could
request a dump of changes since a particular time.  On restore, you
might have several options:

  - overwrite: always set the new tags and timestamp in the database
to the value in the restore data.

  - update: always set the tags, but update the to the current time.

  - conditional T: update only if the message metadata has not been
updated since time T.

To sync flags, then you just need to keep track of the last time you
synced with a particular server--call this time T.  Do a dump since
time T, upload to server, do a conditional restore for time T on
server.  Finally do a partial dump from time T on the server and an
overwrite import on the client.  (This policy makes changes on the
server always override conflicting ones on the client--perhaps people
want other policies, like union of the tags, etc.)


Second, there seems to be some desire in that thread to sync with IMAP
flags.  This would be particularly great, but the easies way to do it
is probably *not* to try to implement IMAP, but rather to use an
existing IMAP server and just modify the maildir so that the IMAP
server will pick up the flags.

In the case of dovecot, the arbitrary tag format is very simple.  Each
maildir has a file called dovecot-keywords mapping numbers 0, 1,
... to keywords.  Then mail file names contain lower-case letters for
the flags they are marked with--0 => a, 1 => b, etc.--allowing up to
26 arbitrary tags for each maildir.  One could probably sync to
dovecot's maildir format relatively easily in a script given
incremental dump and restore of tags.  Or possibly notmuch could
natively support dovecot as one of multiple back-end tag storage
schemes.

Having a static tag mapping in the .notmuch-config file would be much
better than hard-coding flag2tag.  However, I'm not sure it's
sufficient.  The reason is that if you ever completely delete a tag
(e.g., you have "todo", and "meeting" tags and periodically have no
messages in either categories in a given mail folder), then an IMAP
server like dovecot might end up re-allocating the letters
corresponding to those tags in a different order.  Also, at least for
dovecot, the flag mappings are per-folder, which you kind of want
since you are limited to 26 non-standard tags, so global values might
not work.

I'm curious to hear people's thoughts/reactions?

David


Tag timestamps and synchronization

2011-01-24 Thread dm-list-email-notmuch
One of the features I would like to see from notmuch is an easier
ability to synchronize tags across machines.  At the very least, I
would need either incremental dump and restore, or some way to
communicate arbitrary tags to a local imap server that shares
notmuch's maildir (much as notmuch currently syncs the standard tags),
so that I synchronize two maildirs with a tool like offlineimap.

As Carl pointed out to me in private email, there has been some
previous discussion in the following thread:

notmuch show id:87hbfnmiux@yoom.home.cworth.org

Based on that thread, there seems to be some desire for notmuch to
keep track of a per-message timestamp when the flags were last
updated.  This would allow much easier expiration for people who want
the deleted tag.  It would also allow incremental dump and restore of
tags, which is exactly what I need to sync tags across servers with
reasonable amounts of bandwidth.

Metadata timestamps are one of those things that probably have a lot
of different applications, so since Carl is considering a new database
format for the next release anyway, perhaps it also makes sense to add
a metadata change time for each messages.

The timestamp would be included in dump output, and you could
request a dump of changes since a particular time.  On restore, you
might have several options:

  - overwrite: always set the new tags and timestamp in the database
to the value in the restore data.

  - update: always set the tags, but update the to the current time.

  - conditional T: update only if the message metadata has not been
updated since time T.

To sync flags, then you just need to keep track of the last time you
synced with a particular server--call this time T.  Do a dump since
time T, upload to server, do a conditional restore for time T on
server.  Finally do a partial dump from time T on the server and an
overwrite import on the client.  (This policy makes changes on the
server always override conflicting ones on the client--perhaps people
want other policies, like union of the tags, etc.)


Second, there seems to be some desire in that thread to sync with IMAP
flags.  This would be particularly great, but the easies way to do it
is probably *not* to try to implement IMAP, but rather to use an
existing IMAP server and just modify the maildir so that the IMAP
server will pick up the flags.

In the case of dovecot, the arbitrary tag format is very simple.  Each
maildir has a file called dovecot-keywords mapping numbers 0, 1,
... to keywords.  Then mail file names contain lower-case letters for
the flags they are marked with--0 = a, 1 = b, etc.--allowing up to
26 arbitrary tags for each maildir.  One could probably sync to
dovecot's maildir format relatively easily in a script given
incremental dump and restore of tags.  Or possibly notmuch could
natively support dovecot as one of multiple back-end tag storage
schemes.

Having a static tag mapping in the .notmuch-config file would be much
better than hard-coding flag2tag.  However, I'm not sure it's
sufficient.  The reason is that if you ever completely delete a tag
(e.g., you have todo, and meeting tags and periodically have no
messages in either categories in a given mail folder), then an IMAP
server like dovecot might end up re-allocating the letters
corresponding to those tags in a different order.  Also, at least for
dovecot, the flag mappings are per-folder, which you kind of want
since you are limited to 26 non-standard tags, so global values might
not work.

I'm curious to hear people's thoughts/reactions?

David
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch