[notmuch] Asynchronous tagging

2009-11-21 Thread Jed Brown
On Sat, 21 Nov 2009 22:00:13 +0100, Carl Worth  wrote:
> Ah, OK. So you made a change on the Gmail side and that caused a file to
> be renamed locally.

yes

> Or did you mean you removed the tag from within emacs? In that case, the
> search term used to find the message is the message id itself. (Try
> running "M-x visible-mode" from a notmuch-search view in emacs to see
> what those look like.)

Exactly, that's what I meant by manually.  Those messages don't match a
nice generic pattern.

> Meanwhile, just archiving the message won't make things perfect for
> you. The document in the database pointing to the broken file is still
> there. And it should still have all of its terms, so will likely show up
> if you do more searches. (The "(null)" stuff you're seeing isn't because
> the message is NULL---for example, notmuch was able to find the date,
> etc. It's just that notmuch couldn't find the subject and authors when
> it went to look for the file.)

Yeah.

> So if GMail+offlineimap continues to shuffle your files around, you're
> going to keep seeing more and more confusion like this build up.
> 
> So we really just need to teach notmuch how to handle an unstable file
> store in order to be able to use it in this kind of setup.

This seems unavoidable with maildir in the presence of any
synchronization, or use of a different client.

An ugly but possible solution would be to mirror the entire maildir via
hard links with whatever naming scheme you like.  You then have a stable
link to the file and can resolve changing names in the real maildir.
This eats up a lot of inodes.
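
A minimal sketch of the hard-link mirror, in Python for illustration
only. Naming each link by a SHA-1 of the message bytes is an assumption
(any stable scheme would do), and stable_name and mirror_maildir are
hypothetical helpers, not part of notmuch:

```python
import hashlib
import os


def stable_name(path):
    """Return a name derived from the message bytes, not the maildir filename."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def mirror_maildir(maildir, mirror):
    """Hard-link every message in cur/ and new/ into `mirror` under a stable name.

    Returns a dict mapping stable name -> current path in the real maildir.
    Both directories must live on the same filesystem for os.link to work.
    """
    os.makedirs(mirror, exist_ok=True)
    current = {}
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for entry in os.listdir(folder):
            src = os.path.join(folder, entry)
            if not os.path.isfile(src):
                continue
            name = stable_name(src)
            dst = os.path.join(mirror, name)
            if not os.path.exists(dst):
                os.link(src, dst)  # second name for the same inode
            current[name] = src
    return current
```

Rerunning mirror_maildir after a sync returns the new real path for each
stable name, while the link in the mirror stays readable across renames.
Hashing every file on each run is expensive; the sketch ignores that, and
it never removes links for messages that were deleted.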

Jed


[notmuch] Asynchronous tagging

2009-11-21 Thread Jan Janak
On Sat, Nov 21, 2009 at 10:04 PM, Jed Brown  wrote:
>>  3) Message flags are updated on the IMAP server (for example when you
>> mark a message as read in gmail). Offlineimap keeps message flags
>> synchronized.  If you mark a local message as read then the change is
>> propagated to the IMAP server and vice versa.
>
> Do you know if Offlineimap (or some similar tool) can be told not to
> bother keeping flags synchronized?

Try using the cmdline option -q, from offlineimap's help:

  -q  Run only quick synchronizations. Ignore any flag updates on IMAP servers.

This kinda works, but even with this option I am still seeing missing
files if I work with my inbox in gmail. AFAIK there is currently no
easy way to prevent that.

  -- Jan


[notmuch] Asynchronous tagging

2009-11-21 Thread Jed Brown
On Sat, 21 Nov 2009 21:50:10 +0100, Jan Janak  wrote:
> I get errors about missing files too. There are several reasons why
> that can happen:
> 
>  1) A message is moved from one folder to another in other mail
> clients that work with the Maildir spool.

Not a problem in my case because I currently have everything in one big
maildir (100k in one directory is a lot, but not too painful at 0.3s for
ls and 2s to stat everything).

>  2) A client changes the flags on a message, for example, when you
> read a message or mark it as deleted. Maildir stores flags in
> filenames.

This seems like a problem.  I'm not familiar with Xapian; is it
necessarily an expensive operation to correct these inconsistencies?
Matching by thread id ought to be cheap.

>  3) Message flags are updated on the IMAP server (for example when you
> mark a message as read in gmail). Offlineimap keeps message flags
> synchronized.  If you mark a local message as read then the change is
> propagated to the IMAP server and vice versa.

Do you know if Offlineimap (or some similar tool) can be told not to
bother keeping flags synchronized?

Jed


[notmuch] Asynchronous tagging

2009-11-21 Thread Carl Worth
On Sat, 21 Nov 2009 20:45:30 +0100, Jed Brown  wrote:
> Actually, this popped up again.  I have a workaround, but here's the
> story if you are interested.

Hmmm... we definitely want to fix this, so let's figure this out.

> After changing a flag in Gmail and syncing with offlineimap, I get this
> in my inbox
> 
>  Today 19:18 [1/2] (null)   (null) (inbox unread)
> 
> And when I try to open it, the buffer is full of stderr.
> 
>   Error opening /home/jed/.mail-archive/gmail-all/new/1258826583_1.20705.kunyang,U=174235,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory

Ah, OK. So you made a change on the Gmail side and that caused a file to
be renamed locally.

And yes, this currently makes notmuch very confused. That's a known
issue that needs to be documented better. Better still, it needs to be
fixed (I just added a note for this to TODO).

> Explicitly archiving the null message removes it from these queries so
> the clutter is gone now, but it has to be done manually because the null
> message doesn't match any search terms.

Manually? All tag manipulation is done by search terms, so there's no
other way to remove a tag.

Or did you mean you removed the tag from within emacs? In that case, the
search term used to find the message is the message id itself. (Try
running "M-x visible-mode" from a notmuch-search view in emacs to see
what those look like.)

Meanwhile, just archiving the message won't make things perfect for
you. The document in the database pointing to the broken file is still
there. And it should still have all of its terms, so will likely show up
if you do more searches. (The "(null)" stuff you're seeing isn't because
the message is NULL---for example, notmuch was able to find the date,
etc. It's just that notmuch couldn't find the subject and authors when
it went to look for the file.)

So if GMail+offlineimap continues to shuffle your files around, you're
going to keep seeing more and more confusion like this build up.

So we really just need to teach notmuch how to handle an unstable file
store in order to be able to use it in this kind of setup.

-Carl


[notmuch] Asynchronous tagging

2009-11-21 Thread Jan Janak
On Sat, Nov 21, 2009 at 9:01 PM, Carl Worth  wrote:
>> 3. I had initially put 'notmuch new' in a cron job (instead of
>> offlineimap postsync hook) and new/search would sometimes complain about
>> missing files in the maildir.  The first time this happened, it did not
>> correct itself and I ended up reimporting the database (I had moved some
>> things around so I could have been at fault).  Since then I have seen
>> these errors on a couple occasions, but they always go away upon
>> rerunning 'notmuch new'.  My guess is that it has to do with offlineimap
>> changing flags so I moved 'notmuch new' to the postsync hook and have
>> not seen the errors since.  If it is important that notmuch never runs
>> concurrently with an offlineimap sync, it should eventually go in the
>> docs.
>
> Thanks for the pointer.
>
> Does offlineimap use tmp while it's delivering a message and then move
> things to new? If so, then maybe all we need to do is fix notmuch to not
> look into tmp directories?

Yes, it does. I think all delivery agents work this way. IIRC, the
reason messages are first written in tmp and then moved to new is to
make sure that clients do not see partially written messages. Maildir
was designed to be lock-less, so this is needed.

I get errors about missing files too. There are several reasons why
that can happen:

 1) A message is moved from one folder to another in other mail
clients that work with the Maildir spool.

 2) A client changes the flags on a message, for example, when you
read a message or mark it as deleted. Maildir stores flags in
filenames.

 3) Message flags are updated on the IMAP server (for example when you
mark a message as read in gmail). Offlineimap keeps message flags
synchronized.  If you mark a local message as read then the change is
propagated to the IMAP server and vice versa.
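
Cases 2 and 3 change only the part of the filename after ":2,", so a
stale path can in principle be re-resolved by its unique part. A rough
Python sketch follows; maildir_key and find_renamed are hypothetical
helpers (notmuch has nothing like them today), and the sketch assumes
the sync tool leaves everything before ":2," untouched, which I have not
verified for offlineimap:

```python
import os


def maildir_key(filename):
    """Strip the ':2,<flags>' info suffix, leaving the unique part of the name."""
    return filename.split(":2,", 1)[0]


def find_renamed(maildir, stale_path):
    """Look in cur/ and new/ for a file whose unique part matches `stale_path`.

    Returns the current path, or None if no such file is in this maildir.
    """
    key = maildir_key(os.path.basename(stale_path))
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for entry in os.listdir(folder):
            if maildir_key(entry) == key:
                return os.path.join(folder, entry)
    return None
```

This does nothing for case 1, where the message moves to a different
folder, and the linear scan of the directory would need an index to be
practical on a large maildir.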

 -- Jan


[notmuch] Asynchronous tagging

2009-11-21 Thread Jed Brown
On Sat, 21 Nov 2009 21:01:20 +0100, Carl Worth  wrote:
> Yes, this is a known bug in Xapian (it rewrites all of the indexed terms
> for the email message even though you're just trying to add/remove one
> term). The Xapian ticket for this is here:
> 
>   replace_document should make minimal changes to database file
>   http://trac.xapian.org/ticket/250

This bug report is concerned that it could require an API change; it
sounds like you think this is unnecessary.  Thanks for the detailed
explanation.

> Chris Wilson just contributed a patch to enable read-only usage of
> notmuch while another notmuch process holds the write lock.

I'm running it.

> Does offlineimap use tmp while it's delivering a message and then move
> things to new? If so, then maybe all we need to do is fix notmuch to not
> look into tmp directories?

Yes, that's how maildir is supposed to work.  Deliver to tmp, hard link
from new, then unlink in tmp.  The client should never look in tmp.
Should be very quick to fix in notmuch.
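
For reference, the delivery sequence looks roughly like this. It is a
Python sketch, not what any particular delivery agent does; the deliver
function and the exact unique-name format are assumptions made for
illustration:

```python
import itertools
import os
import socket
import time

_sequence = itertools.count()  # distinguishes deliveries within one process


def deliver(maildir, data):
    """Write `data` (bytes) to tmp/, hard-link it into new/, then unlink the tmp name."""
    # Per the maildir convention, '/' and ':' may not appear in the unique name.
    host = socket.gethostname().replace("/", "\\057").replace(":", "\\072")
    unique = "%d.%d_%d.%s" % (time.time(), os.getpid(), next(_sequence), host)
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    with open(tmp_path, "xb") as f:  # 'x' fails if the name already exists
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.link(tmp_path, new_path)  # the complete message appears in new/ at once
    os.unlink(tmp_path)
    return new_path
```

A reader that only looks at new/ and cur/ never sees a half-written
file, which is why no locking is needed.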

Jed


[notmuch] Asynchronous tagging

2009-11-21 Thread Karl Wiberg
On Sat, Nov 21, 2009 at 9:01 PM, Carl Worth  wrote:

> Does offlineimap use tmp while it's delivering a message and then move
> things to new? If so, then maybe all we need to do is fix notmuch to not
> look into tmp directories?

That's probably the right thing to do regardless---IIRC, the tmp
directory exists so that processes can put messages there while they
are writing them, and then do an atomic rename to the new (or cur)
directory.
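
On the scanning side, skipping tmp could look something like the Python
sketch below. iter_messages is a hypothetical name, and restricting the
walk to files inside cur/ and new/ is my assumption about what a scanner
would want, not a description of how notmuch walks the mail store:

```python
import os


def iter_messages(mail_root):
    """Yield paths of message files under `mail_root`, never descending into tmp/."""
    for dirpath, dirnames, filenames in os.walk(mail_root):
        # Pruning the list in place stops os.walk from entering tmp/.
        dirnames[:] = [d for d in dirnames if d != "tmp"]
        if os.path.basename(dirpath) in ("cur", "new"):
            for name in sorted(filenames):
                yield os.path.join(dirpath, name)
```

One side effect of matching on the directory name alone is that a mail
folder the user happened to call "tmp" would be skipped as well.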

-- 
Karl Wiberg, kha at treskal.com
   subrabbit.wordpress.com
   www.treskal.com/kalle


[notmuch] Asynchronous tagging

2009-11-21 Thread Carl Worth
On Sat, 21 Nov 2009 19:35:39 +0100, Jed Brown  wrote:
> I'm really enjoying notmuch, thanks!  I have a minor issue and a couple
> observations worth noting.

Thanks, Jed! And welcome to notmuch.

> 1. Changing tags (like removing inbox/unread) has really high latency.

Yes, this is a known bug in Xapian (it rewrites all of the indexed terms
for the email message even though you're just trying to add/remove one
term). The Xapian ticket for this is here:

replace_document should make minimal changes to database file
http://trac.xapian.org/ticket/250

I've looked at the code, and it looks like it's going to be easy to
fix. If anyone wants to try, here's the file to change:

xapian-core/backends/flint/flint_database.cc

And look for:

// FIXME - in the case where there is overlap between the new
// termlist and the old termlist, it would be better to compare the
// two lists, and make the minimum set of modifications required.
// This would lead to smaller changesets for replication, and
// probably be faster overall

So I think this might be as easy as just walking over two sorted lists
looking for differences.
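
To illustrate what I mean by walking two sorted lists, here is the idea
in Python (the real fix would be C++ inside Xapian, and
diff_sorted_terms is just a made-up name for this sketch):

```python
def diff_sorted_terms(old_terms, new_terms):
    """Return (to_remove, to_add) for two sorted, duplicate-free term lists."""
    to_remove, to_add = [], []
    i = j = 0
    while i < len(old_terms) and j < len(new_terms):
        if old_terms[i] == new_terms[j]:
            # Term is in both lists: nothing to write.
            i += 1
            j += 1
        elif old_terms[i] < new_terms[j]:
            # Only in the old list: it was removed.
            to_remove.append(old_terms[i])
            i += 1
        else:
            # Only in the new list: it was added.
            to_add.append(new_terms[j])
            j += 1
    to_remove.extend(old_terms[i:])
    to_add.extend(new_terms[j:])
    return to_remove, to_add
```

Removing a single tag from a message with thousands of indexed terms
would then produce one removal and no additions, instead of a rewrite of
the whole term list.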

Note that this is in the currently default "flint" backend, but the
Xapian folks are probably more interested in fixing the in-development
"chert" backend. So the patch to get upstreamed there will probably also
fix:

xapian-core/backends/chert/chert_database.cc

(I'm hoping the fix will be the same---an identical comment exists
there.)

Also, if you want to experiment with the chert backend, compile current
Xapian source and run notmuch with XAPIAN_PREFER_CHERT=1. I haven't
tried that yet, but there are claims that a chert database can be 40%
smaller than an equivalent flint database.

> 2. I have 'notmuch new' in an offlineimap postsync hook, but
> notmuch-search-refresh-view occasionally complains that another process
> has the lock (since I might press '=' when 'notmuch new' is running).
> Waiting a moment and trying again works fine, but it would be nice to
> clean this up eventually.

Chris Wilson just contributed a patch to enable read-only usage of
notmuch while another notmuch process holds the write lock. This should
be very nice (and means that new users will be able to start playing
with notmuch even while the initial index creation is happening).

> 3. I had initially put 'notmuch new' in a cron job (instead of
> offlineimap postsync hook) and new/search would sometimes complain about
> missing files in the maildir.  The first time this happened, it did not
> correct itself and I ended up reimporting the database (I had moved some
> things around so I could have been at fault).  Since then I have seen
> these errors on a couple occasions, but they always go away upon
> rerunning 'notmuch new'.  My guess is that it has to do with offlineimap
> changing flags so I moved 'notmuch new' to the postsync hook and have
> not seen the errors since.  If it is important that notmuch never runs
> concurrently with an offlineimap sync, it should eventually go in the
> docs.

Thanks for the pointer.

Does offlineimap use tmp while it's delivering a message and then move
things to new? If so, then maybe all we need to do is fix notmuch to not
look into tmp directories?

-Carl






[notmuch] Asynchronous tagging

2009-11-21 Thread Jed Brown
On Sat, 21 Nov 2009 19:35:39 +0100, Jed Brown  wrote:

[...]

> 3. I had initially put 'notmuch new' in a cron job (instead of
> offlineimap postsync hook) and new/search would sometimes complain about
> missing files in the maildir.  The first time this happened, it did not
> correct itself and I ended up reimporting the database (I had moved some
> things around so I could have been at fault).  Since then I have seen
> these errors on a couple occasions, but they always go away upon
> rerunning 'notmuch new'.  My guess is that it has to do with offlineimap
> changing flags so I moved 'notmuch new' to the postsync hook and have
> not seen the errors since.  If it is important that notmuch never runs
> concurrently with an offlineimap sync, it should eventually go in the
> docs.

Actually, this popped up again.  I have a workaround, but here's the
story if you are interested.

After changing a flag in Gmail and syncing with offlineimap, I get this
in my inbox

 Today 19:18 [1/2] (null)   (null) (inbox unread)

And when I try to open it, the buffer is full of stderr.

  Error opening /home/jed/.mail-archive/gmail-all/new/1258826583_1.20705.kunyang,U=174235,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory
  Error opening /home/jed/.mail-archive/gmail-all/new/1258827595_0.20705.kunyang,U=174288,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory
  [each of these two errors is repeated 10 times]

It is present in any searches that contain the problem files

  $ notmuch search tag:inbox | wc
  Error opening /home/jed/.mail-archive/gmail-all/new/1258826583_1.20705.kunyang,U=174235,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory
  Error opening /home/jed/.mail-archive/gmail-all/new/1258826583_1.20705.kunyang,U=174235,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory
  Error opening /home/jed/.mail-archive/gmail-all/new/1258827595_0.20705.kunyang,U=174288,FMD5=844bb96d088d057aa1b32ac1fbc67b56:2,: No such file or directory
  Error opening 

[notmuch] Asynchronous tagging

2009-11-21 Thread Keith Packard
On Sat, 21 Nov 2009 23:46:44 +0100, Carl Worth  wrote:

> So some investigation is needed to see how important that optimization
> is, and if it's important to see whether there's another way to keep the
> performance while being able to support renames. (Or alternately,
> allowing the user to configure an option saying, "I need to support
> renames even if that means that notmuch new is a bit slower.").

I'd suggest that the best way to make this more efficient would be to
capture directory contents (along with the directory mtime) and use that
to detect changes. If we assume that mail messages are never changed, we
could use that to avoid stat'ing files in directories too.
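
A rough Python sketch of that idea, purely for illustration. The
snapshot and changes_since names and the dict layout are made up here;
where notmuch would actually store the saved listing is left open:

```python
import os


def snapshot(directory):
    """Record the directory's mtime and the set of names it contains."""
    return {
        "mtime": os.stat(directory).st_mtime_ns,
        "names": frozenset(os.listdir(directory)),
    }


def changes_since(directory, previous):
    """Return (added, removed, new_snapshot) relative to `previous`.

    If the directory mtime is unchanged, the listing is skipped entirely
    and no file in the directory is stat'ed.
    """
    if previous is not None and os.stat(directory).st_mtime_ns == previous["mtime"]:
        return set(), set(), previous
    current = snapshot(directory)
    old_names = previous["names"] if previous is not None else frozenset()
    added = set(current["names"] - old_names)
    removed = set(old_names - current["names"])
    return added, removed, current
```

A rename shows up as one removed name plus one added name, which could
then be paired up by message. The usual caveat applies: on filesystems
with coarse timestamps, a change landing in the same tick as the
snapshot could be missed.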

-- 
keith.packard at intel.com