Re: [notmuch] Notmuch performance problems on OSX

2010-02-15 Thread Stewart Smith
On Fri, 15 Jan 2010 03:58:50 + (UTC), Olly Betts o...@survex.com wrote:
 One difference between OS X and other systems is that OS X supports the
 F_FULLFSYNC fcntl, and other systems don't (currently, at least AFAIK),
 and Xapian uses that if it is available to ensure that changes have
 actually made it to disk:
 
 http://trac.xapian.org/ticket/288
 
 On other systems, it uses fdatasync() or fsync(), which typically just
 ensure that the data has left the OS - it can sit in disk controller or
 drive caches for potentially seconds longer.  This call happens once
 per table for every (explicit or implicit) flush on a database.

At least if your OS and file system don't hate you (e.g. XFS on Linux),
then fsync() really does flush the drive cache.

Also keep in mind that the OSX file system (HFS+) was great for
1985. It's essentially single threaded :/

-- 
Stewart Smith
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [notmuch] Notmuch performance problems on OSX

2010-02-09 Thread Oliver Charles
On Tue, Feb 9, 2010 at 10:09 PM, Olly Betts o...@survex.com wrote:
 On 2010-02-09, Oliver Charles wrote:
 I just upgraded to xapian-core HEAD and notmuch master tip today, in
 desperation to get away from GMail. Sadly it's still taking at least
 0.7s to tag a single thread (with one message). I'm really eager to
 solve this, could anyone give me any pointers on how I could go about
 profiling it or finding the cause of this problem?

 The first thing to try is disabling use of F_FULLFSYNC.  You'll need to
 run this command in the xapian-core source tree to comment out the F_FULLFSYNC
 code:

 perl -pi -e 's/^#ifdef F_FULLFSYNC/#if 0/' backends/*/*_io.h

 Then run make and make install.
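To make the effect of that one-liner concrete, here is a small Python sketch of the same substitution applied to an in-memory string. The header fragment in `sample` is a made-up stand-in, not the real contents of Xapian's `*_io.h` files, and `disable_fullfsync` is a name invented for this illustration.

```python
import re

# Hypothetical fragment standing in for a Xapian *_io.h header.
sample = """\
#ifdef F_FULLFSYNC
    return fcntl(fd, F_FULLFSYNC, 0) == 0;
#else
    return fsync(fd) == 0;
#endif
"""


def disable_fullfsync(source):
    """Mirror the perl one-liner: rewrite the guard line so the branch is never compiled."""
    return re.sub(r"^#ifdef F_FULLFSYNC", "#if 0", source, flags=re.MULTILINE)


if __name__ == "__main__":
    print(disable_fullfsync(sample))
```

Only lines that begin with `#ifdef F_FULLFSYNC` are rewritten, so the `#else` branch (plain `fsync()` in this made-up fragment) is what would end up being compiled.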

$ time notmuch tag +inbox thread:6e66368b7887184c6d4c63653211b3f2

real    0m0.067s
user    0m0.036s
sys     0m0.028s

Now this looks a little bit more usable!

 Assuming that helps, then (a) you have a workaround, and (b) we'll know for
 sure it is F_FULLFSYNC to blame.

Looks like this is the case.

 I've created a ticket for a change to Xapian which should help here, but
 not had a chance to work on it yet:

 http://trac.xapian.org/ticket/426

I will add my info there and follow the ticket if I can.

--
   Oliver Charles / aCiD2

(Olly, sorry about the double email - in all my excitement I didn't
hit reply all :))


Re: [notmuch] Notmuch performance problems on OSX

2010-01-18 Thread Oliver Charles
On Thu, Jan 14, 2010 at 11:16 PM, Carl Worth cwo...@cworth.org wrote:
 Hi Oliver, welcome to notmuch!

 On Thu, 14 Jan 2010 15:30:48 +, Oliver Charles 
 oliver.g.char...@googlemail.com wrote:
 I've installed the latest notmuch from Git at this time of writing,
 along with Xapian from SVN head. However, just tagging a single thread
 with only one message seems to take too long:

 $ time notmuch tag +dissertation thread:7dc536441e6deade4256a46d46451221

 real  0m0.812s
 user  0m0.022s
 sys   0m0.037s

 Things work quite a bit faster than that on my machine:

 $ time notmuch tag +foo id:5641883d1001140730l22832715ld6bdc95c9938d...@mail.gmail.com

 real    0m0.024s
 user    0m0.012s
 sys     0m0.004s

 But that could just be system differences.

Possibly, though my machine is more than capable of this: it's a
Q6600 with 4 GB of RAM.

 And tagging all my messages is really horrible:

 $ time notmuch tag +foobar tag:inbox

 real  0m5.076s
 user  0m3.688s
 sys   0m0.105s

 For this operation, I can't really compare. How many messages are you
 tagging? Here's that operation for me with 525 messages in my inbox:

A few thousand (4k, I believe)

 That xapian-svn was built from svn HEAD right now, so I'm assuming it
 contains the #250 fix (http://trac.xapian.org/changeset/13808)

 Which I think means that things could have been even *much* slower
 before. ;-)

 The Xapian defect #250 was just one, initial (and obvious) performance
 problem. [Though, as I mentioned in a previous thread, if you're using a
 Xapian flint database, (look for .notmuch/xapian/iamflint), then you
 won't get the benefit of the Xapian fix until you rebuild your notmuch
 database from scratch with a current notmuch.]

I didn't know about this need to rebuild, but I tried that and didn't
have any more success sadly.

 Once you've verified that you've got the #250 fix functional, there
 could still be lots of performance bugs. And it would be time to start
 profiling.

 [...]

I'm pressed for time at the moment, but in a few weeks I might have
some time to investigate here...

-- 
Oliver Charles / aCiD2


Re: [notmuch] Notmuch performance problems on OSX

2010-01-14 Thread Carl Worth
Hi Oliver, welcome to notmuch!

On Thu, 14 Jan 2010 15:30:48 +, Oliver Charles 
oliver.g.char...@googlemail.com wrote:
 I've installed the latest notmuch from Git at this time of writing,
 along with Xapian from SVN head. However, just tagging a single thread
 with only one message seems to take too long:
 
 $ time notmuch tag +dissertation thread:7dc536441e6deade4256a46d46451221
 
 real  0m0.812s
 user  0m0.022s
 sys   0m0.037s

Things work quite a bit faster than that on my machine:

$ time notmuch tag +foo id:5641883d1001140730l22832715ld6bdc95c9938d...@mail.gmail.com

real    0m0.024s
user    0m0.012s
sys     0m0.004s

But that could just be system differences.

 And tagging all my messages is really horrible:
 
 $ time notmuch tag +foobar tag:inbox
 
 real  0m5.076s
 user  0m3.688s
 sys   0m0.105s

For this operation, I can't really compare. How many messages are you
tagging? Here's that operation for me with 525 messages in my inbox:

$ time notmuch tag +foobar tag:inbox

real    0m1.551s
user    0m1.504s
sys     0m0.016s

 That xapian-svn was built from svn HEAD right now, so I'm assuming it
 contains the #250 fix (http://trac.xapian.org/changeset/13808)

Which I think means that things could have been even *much* slower
before. ;-)

The Xapian defect #250 was just one, initial (and obvious) performance
problem. [Though, as I mentioned in a previous thread, if you're using a
Xapian flint database, (look for .notmuch/xapian/iamflint), then you
won't get the benefit of the Xapian fix until you rebuild your notmuch
database from scratch with a current notmuch.]

Once you've verified that you've got the #250 fix functional, there
could still be lots of performance bugs. And it would be time to start
profiling.

Perhaps the notmuch daemon idea (which we've proposed earlier for
other reasons) could help reduce overhead from reading the database and
writing it back out again. So that might be one avenue to explore for
fixing things.

I have no idea what OS X does, but Linux keeps my notmuch database in
its buffer cache so I can do these operations without even touching
disk (which is actually an SSD anyway, which also helps). I just
tried, and was able to get the single-message tag operation to be 3
times slower by dropping the cache:

$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time notmuch tag +foo id:5641883d1001140730l22832715ld6bdc95c9938d...@mail.gmail.com

real    0m0.062s
user    0m0.000s
sys     0m0.020s

But again, whatever the performance problem might be, the first step
would be to examine some profiles. (And I'm clueless myself as to what
profiling tools might be available for OS X.)
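Even before a profiler is involved, the `time` output quoted in this message gives a hint. The sketch below subtracts CPU time (user + sys) from wall-clock time (real) for each run; whatever is left was spent off the CPU, typically waiting on I/O. The figures are copied from this thread; the helper name `off_cpu_seconds` and the run labels are just for illustration.

```python
def off_cpu_seconds(real, user, system):
    """Wall-clock time that is not accounted for by CPU time."""
    return real - (user + system)


# (real, user, sys) in seconds, as reported by `time` in this thread.
runs = {
    "single tag, OS X (Oliver)": (0.812, 0.022, 0.037),
    "single tag, Linux (Carl)": (0.024, 0.012, 0.004),
    "bulk tag, OS X (Oliver)": (5.076, 3.688, 0.105),
    "bulk tag, Linux (Carl)": (1.551, 1.504, 0.016),
}

if __name__ == "__main__":
    for label, (real, user, system) in runs.items():
        waited = off_cpu_seconds(real, user, system)
        share = 100.0 * waited / real
        print("%-28s off-CPU %.3fs (%.0f%% of real)" % (label, waited, share))
```

For the single-message tag on OS X, roughly 0.75 s of the 0.81 s is off-CPU time, which suggests the process is mostly waiting rather than computing. This does not say what it is waiting for; a profile or a system-call trace would still be needed to confirm that.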

-Carl




Re: [notmuch] Notmuch performance problems on OSX

2010-01-14 Thread Olly Betts
On 2010-01-14, Oliver Charles wrote:
 I've installed the latest notmuch from Git at this time of writing,
 along with Xapian from SVN head. However, just tagging a single thread
 with only one message seems to take too long:

One difference between OS X and other systems is that OS X supports the
F_FULLFSYNC fcntl, and other systems don't (currently, at least AFAIK),
and Xapian uses that if it is available to ensure that changes have
actually made it to disk:

http://trac.xapian.org/ticket/288

On other systems, it uses fdatasync() or fsync(), which typically just
ensure that the data has left the OS - it can sit in disk controller or
drive caches for potentially seconds longer.  This call happens once
per table for every (explicit or implicit) flush on a database.
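As a rough illustration of the platform difference described above, here is a minimal Python sketch (not Xapian's code, which is C++). It assumes Python's `fcntl` module exposes the full-sync constant on OS X as `fcntl.F_FULLFSYNC`, and looks it up with `getattr` so the same code falls back to `fsync()` elsewhere. The helper name `sync_to_disk` is invented for this example.

```python
import fcntl
import os
import tempfile


def sync_to_disk(fd):
    """Flush fd as durably as the platform allows and report the call used."""
    full_sync = getattr(fcntl, "F_FULLFSYNC", None)  # only defined on some platforms
    if full_sync is not None:
        try:
            # Ask the drive itself to flush its cache, not just the OS.
            fcntl.fcntl(fd, full_sync)
            return "F_FULLFSYNC"
        except OSError:
            pass  # the file system may not support it; fall back below
    # Elsewhere: hand the data to the device; drive caches may still hold it.
    os.fsync(fd)
    return "fsync"


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"some data")
        f.flush()  # push Python's own buffer to the OS first
        print(sync_to_disk(f.fileno()))
```

The return value only reports which call was made; it says nothing about whether the hardware honoured it.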

I can see an issue here which is that currently Xapian writes the base
file for the table, then syncs it, then does the next table.  I bet it
would be more efficient to write them all and then sync them all,
especially with F_FULLFSYNC.
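The two orderings can be sketched in Python as follows. This is an illustration of the idea only, not Xapian's implementation: `os.fsync()` stands in for whichever sync call the platform uses, the file names are hypothetical placeholders for per-table base files, and the function names are invented here.

```python
import os
import tempfile
import time


def write_then_sync_each(paths, payload):
    """Ordering described as current: write one file, sync it, then the next."""
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())


def write_all_then_sync_all(paths, payload):
    """Proposed ordering: issue every write first, then sync each file."""
    handles = []
    try:
        for path in paths:
            f = open(path, "wb")
            handles.append(f)
            f.write(payload)
            f.flush()
        for f in handles:
            os.fsync(f.fileno())
    finally:
        for f in handles:
            f.close()


def timed(func, *args):
    """Return the wall-clock seconds taken by func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Hypothetical stand-ins for a database's per-table base files.
        paths = [os.path.join(d, "table%d.base" % i) for i in range(5)]
        payload = b"x" * 4096
        print("sync after each write:", timed(write_then_sync_each, paths, payload))
        print("sync after all writes:", timed(write_all_then_sync_all, paths, payload))
```

Whether the second ordering is actually faster depends on the OS, file system and drive, so the timings are something to measure rather than assume.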

I'll take a look at doing that, and have created a ticket for it:

http://trac.xapian.org/ticket/426

If after that this is still causing problems, it should probably be made
configurable what (if any) flushing is done.  If you're on a UPS-backed
server, you probably don't need such paranoia.

Cheers,
Olly
