Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Ralf S. Engelschall
On Wed, Oct 10, 2007, Justin Patrin wrote:

 On 10/10/07, Ben Walton [EMAIL PROTECTED] wrote:
  Indexes speed up read operations but slow down writes.  Which do we do
  more of?  I'd optimize for the case that would benefit most.  I
  suspect we do a fair amount of both.  In this case, maybe adding
  indexes for mostly-read tables would be the way to go.
 

 The size increase on the DB should also be investigated here. How much
 did your DB increase in size when the indexes were added?

With the database where all n.v.m.* revisions stay:

| $ sqlite mtn.db
| -- Loading resources from /u/rse/.sqliterc
| SQLite version 3.5.1
| Enter .help for instructions
| sqlite VACUUM;
| $ ls -l
| total 141344
| -rw-r--r--  1 rse  rse  144629760 Oct 11 07:52 mtn.db
| $ sqlite mtn.db
| -- Loading resources from /u/rse/.sqliterc
| SQLite version 3.5.1
| Enter .help for instructions
| sqlite CREATE INDEX revision_certs__id_name_value ON revision_certs (id, 
name, value);
| sqlite CREATE INDEX public_keys__id ON public_keys (id);
| sqlite VACUUM;
| $ ls -l
| total 147024
| -rw-r--r--  1 rse  rse  150446080 Oct 11 07:53 mtn.db
| $ bc
| bc 1.06
| Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
| This is free software with ABSOLUTELY NO WARRANTY.
| For details type `warranty'.
| scale = 2
| 150446080/144629760
| 1.04

So, the indices increased the database by just 4% and this IMHO is more
or less negligible...

   Ralf S. Engelschall
   [EMAIL PROTECTED]
   www.engelschall.com



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Ralf S. Engelschall
On Wed, Oct 10, 2007, Bruce Stephens wrote:

 Chad Walstrom [EMAIL PROTECTED] writes:

 [...]
  Would there be a way to tell sqlite to ignore indices for given
  operations, such as pulls?

 That strikes me as a low-level question that should be ignored (at
 least unless it causes some measurable problem).

 One would hope that SQLite will only (or mostly, anyway) use indexes
 when they'll be beneficial.  Index updates strike me as more likely to
 be a problem, but I doubt it makes sense to suggest sometimes not
 updating indexes.

SQLite has a sufficient optimizer and uses indices only where it really
makes sense. See http://www.sqlite.org/optoverview.html for some
details.
   Ralf S. Engelschall
   [EMAIL PROTECTED]
   www.engelschall.com



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Nathaniel Smith
On Wed, Oct 10, 2007 at 08:07:26PM +0200, Ralf S. Engelschall wrote:
 What do we think? Should we investigate further and especially add
 additional indices like the above to the Monotone database schema? Or is
 there consensus that this type of speed optimization is just the root of
 furthcoming evil and at least at this time should be still ignored at
 all...

No way, 5x speedups for 1 line of code = ++good.  And while once upon
a time the database upgrade machinery was not savvy to indexes, that
got fixed long ago (exactly to add some similar indexes, IIRC).  We
just don't revisit whether we have the right indexes very often :-).
(This isn't to override any more specific problems people might raise
in this thread, though I didn't see any obvious showstoppers so far.)

[Err... though... waittasec.  Shouldn't the database upgrade machinery
be causing mtn to bomb out on your modified database (unrecognized
schema version or the like)?  Did you disable that or something?]

I do wonder where the benefit is coming from in this particular case.
The index on public_keys is almost certainly just irrelevant (though
it doesn't hurt and adds scalability), since you probably don't have
more than, say, 100 keys in there, and the whole table is almost
certainly cached.

And on revision_certs we already have:

CREATE INDEX revision_certs__id ON revision_certs (id);
CREATE INDEX revision_certs__name_value ON revision_certs (name, value);

So if we add an index on (id, anything) we should remove the index
on simple (id).  But also that index on simple (id) should be making
that search fast already, because once you've located the certs
for a particular rev, then you only have to do a sequential scan over
4 of them (in most cases) to find any particular one.  Perhaps
sqlite's optimizer has gotten *too* smart and is picking the wrong
index, doing a lookup by (name, value) and then sequential scan to
match the id?

...It may not be worth answering these questions as opposed to just
adding the stupid index, though.

-- Nathaniel

-- 
Details are all that matters; God dwells there, and you never get to
see Him if you don't struggle to get them right. -- Stephen Jay Gould


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Ralf S. Engelschall
On Wed, Oct 10, 2007, Nathaniel Smith wrote:

 [...]
 [Err... though... waittasec.  Shouldn't the database upgrade machinery
 be causing mtn to bomb out on your modified database (unrecognized
 schema version or the like)?  Did you disable that or something?]

Monotone showed just _warning_ messages but operated just fine.

 I do wonder where the benefit is coming from in this particular case.
 The index on public_keys is almost certainly just irrelevant (though
 it doesn't hurt and adds scalability), since you probably don't have
 more than, say, 100 keys in there, and the whole table is almost
 certainly cached.

 And on revision_certs we already have:

 CREATE INDEX revision_certs__id ON revision_certs (id);
 CREATE INDEX revision_certs__name_value ON revision_certs (name, value);

 So if we add an index on (id, anything) we should remove the index
 on simple (id).

In general, yes. The (id, name, value) index should be consulted
by SQLite also in case only id has to be looked up for a query.

 But also that index on simple (id) should be making
 that search fast already, because once you've located the certs
 for a particular rev, then you only have to do a sequential scan over
 4 of them (in most cases) to find any particular one.  Perhaps
 sqlite's optimizer has gotten *too* smart and is picking the wrong
 index, doing a lookup by (name, value) and then sequential scan to
 match the id?
 [...]

We can check this with a manual EXPLAIN query, I think.

   Ralf S. Engelschall
   [EMAIL PROTECTED]
   www.engelschall.com



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Markus Schiltknecht

Hi,

Ralf S. Engelschall wrote:

Monotone showed just _warning_ messages but operated just fine.


Strange, the following error message should be returned:

mtn: misuse: /home/markus/.monotone.36.ind.db appears to be a monotone 
database, but this version of

mtn: misuse: monotone does not recognize its schema.
mtn: misuse: you probably need a newer version of monotone.


In general, yes. The (id, name, value) index should be consulted
by SQLite also in case only id has to be looked up for a query.


As a quick test, I did add the new, combined index and dropped the old 
one (on id only). Doing that, my db has grown by about 2.4 percent:


-rw-r--r-- 1 markus markus 617M 2007-10-11 10:33 .monotone.36.db
-rw-r--r-- 1 markus markus 632M 2007-10-11 10:38 .monotone.36.ind.db


But I doubt very much that there are any gains. The index on 
revision_certs(id) should be enough, as we have only few (in most cases 
four) revision_certs per revision id.


And AFAICT sqlite optimizes properly, at least the explain result looks 
good:


# sqlite3 .monotone.36.db EXPLAIN SELECT id, name, value FROM 
revision_certs WHERE id = 'abc' AND name = 'def' AND value = 'ghi';  
e1.txt
# sqlite3 .monotone.36.ind.db EXPLAIN SELECT id, name, value FROM 
revision_certs WHERE id = 'abc' AND name = 'def' AND value = 'ghi';  
e2.txt


The two execution plans differ only slightly:

*** e1.txt  2007-10-11 10:53:15.492029559 +0200
--- e2.txt  2007-10-11 10:53:23.128009459 +0200
***
*** 1,26 
  0|Goto|0|22|
  1|Integer|0|0|
! 2|OpenRead|1|28|keyinfo(5,BINARY,BINARY)
! 3|SetNumColumns|1|6|
! 4|String8|0|0|def
  5|IsNull|-1|20|
! 6|String8|0|0|abc
  7|IsNull|-2|20|
  8|String8|0|0|ghi
  9|IsNull|-3|20|
! 10|MakeRecord|3|0|b
  11|MemStore|0|0|
  12|MoveGe|1|20|
  13|MemLoad|0|0|
  14|IdxGE|1|20|+
! 15|Column|1|1|
! 16|Column|1|0|
  17|Column|1|2|
  18|Callback|3|0|
  19|Next|1|13|
  20|Close|1|0|
  21|Halt|0|0|
  22|Transaction|0|0|
! 23|VerifyCookie|0|26|
  24|Goto|0|1|
  25|Noop|0|0|
--- 1,26 
  0|Goto|0|22|
  1|Integer|0|0|
! 2|OpenRead|1|39|keyinfo(3,BINARY,BINARY)
! 3|SetNumColumns|1|4|
! 4|String8|0|0|abc
  5|IsNull|-1|20|
! 6|String8|0|0|def
  7|IsNull|-2|20|
  8|String8|0|0|ghi
  9|IsNull|-3|20|
! 10|MakeRecord|3|0|bbb
  11|MemStore|0|0|
  12|MoveGe|1|20|
  13|MemLoad|0|0|
  14|IdxGE|1|20|+
! 15|Column|1|0|
! 16|Column|1|1|
  17|Column|1|2|
  18|Callback|3|0|
  19|Next|1|13|
  20|Close|1|0|
  21|Halt|0|0|
  22|Transaction|0|0|
! 23|VerifyCookie|0|29|
  24|Goto|0|1|
  25|Noop|0|0|


I'm not an expert reading these plans, but for sure both variants use an 
index scan and not a sequential scan. Thus I don't think it's worth 
changing these revision_certs indices.


Regards

Markus



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-11 Thread Nathaniel Smith
On Thu, Oct 11, 2007 at 11:17:17AM +0200, Markus Schiltknecht wrote:
 But I doubt very much that there are any gains. The index on 
 revision_certs(id) should be enough, as we have only few (in most cases 
 four) revision_certs per revision id.

Measurement beats guessing :-).

 I'm not an expert reading these plans, but for sure both variants use an 
 index scan and not a sequential scan. Thus I don't think it's worth 
 changing these revision_certs indices.

My question was which index it was choosing to use; it is trying to
satisfy a constraint like A=1, B=2, C=3, and it can pick to either
use an index on A then sequential scan for (B, C), or it can use an
index on (B, C) and then sequential scan for A.  (We do have an index
on (name, value).)  A plausible heuristic would be to use the index
that lets you satisfy more parts of the constraint, but that heuristic
would be wrong in this case, since in fact there are many certs with
NAME=branch, VALUE=net.venge.monotone, but few certs with ID=da39...

-- Nathaniel

-- 
The best book on programming is still Strunk and White.


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-10 Thread Ben Walton
Indexes speed up read operations but slow down writes.  Which do we do
more of?  I'd optimize for the case that would benefit most.  I
suspect we do a fair amount of both.  In this case, maybe adding
indexes for mostly-read tables would be the way to go.

-Ben

On 10/10/07, Ralf S. Engelschall [EMAIL PROTECTED] wrote:
 Some Monotone operations really operate slower than what one would
 expect in the first spot. Hence, I've today looked at the run-time of
 a simple mtn update in a workspace which *is already* at h:n.v.m.
 This no-operation command internally performs a dozend times the
 following SQL queries:

   SELECT id, name, value, keypair, signature
  FROM revision_certs WHERE id = ? AND name = ? AND value = ?
   SELECT keydata FROM public_keys WHERE id = ?
   SELECT id FROM public_keys WHERE id = ?

 The problem is that revision_certs and public_keys have not the
 proper indices for those queries and hence full-table scans seem to
 be performed. I did a quick test and added the following to indices
 manually:

   CREATE INDEX revision_certs__id_name_value ON
revision_certs (id, name, value);
   CREATE INDEX public_keys__id ON
public_keys (id);

 This dropped down the total execution time of the mentioned mtn update
 command by over 80%! A time mtn update showed 0.450s on average before
 and 0.080s on average afterwards. And this was really not any type of
 in-depth analysis of the situation. I just created two obvious indices
 for the most prominent queries which mtn --debug update showed me.

 What do we think? Should we investigate further and especially add
 additional indices like the above to the Monotone database schema? Or is
 there consensus that this type of speed optimization is just the root of
 furthcoming evil and at least at this time should be still ignored at
 all...
Ralf S. Engelschall
[EMAIL PROTECTED]
www.engelschall.com



 ___
 Monotone-devel mailing list
 Monotone-devel@nongnu.org
 http://lists.nongnu.org/mailman/listinfo/monotone-devel



-- 
---
Ben Walton [EMAIL PROTECTED]

When one person suffers from a delusion, it is called insanity. When
many people suffer from a delusion it is called Religion.
Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance

---


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


[Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-10 Thread Ralf S. Engelschall
Some Monotone operations really operate slower than what one would
expect in the first spot. Hence, I've today looked at the run-time of
a simple mtn update in a workspace which *is already* at h:n.v.m.
This no-operation command internally performs a dozend times the
following SQL queries:

  SELECT id, name, value, keypair, signature
 FROM revision_certs WHERE id = ? AND name = ? AND value = ?
  SELECT keydata FROM public_keys WHERE id = ?
  SELECT id FROM public_keys WHERE id = ?

The problem is that revision_certs and public_keys have not the
proper indices for those queries and hence full-table scans seem to
be performed. I did a quick test and added the following to indices
manually:

  CREATE INDEX revision_certs__id_name_value ON
   revision_certs (id, name, value);
  CREATE INDEX public_keys__id ON
   public_keys (id);

This dropped down the total execution time of the mentioned mtn update
command by over 80%! A time mtn update showed 0.450s on average before
and 0.080s on average afterwards. And this was really not any type of
in-depth analysis of the situation. I just created two obvious indices
for the most prominent queries which mtn --debug update showed me.

What do we think? Should we investigate further and especially add
additional indices like the above to the Monotone database schema? Or is
there consensus that this type of speed optimization is just the root of
furthcoming evil and at least at this time should be still ignored at
all...
   Ralf S. Engelschall
   [EMAIL PROTECTED]
   www.engelschall.com



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-10 Thread Justin Patrin
On 10/10/07, Ben Walton [EMAIL PROTECTED] wrote:
 Indexes speed up read operations but slow down writes.  Which do we do
 more of?  I'd optimize for the case that would benefit most.  I
 suspect we do a fair amount of both.  In this case, maybe adding
 indexes for mostly-read tables would be the way to go.


The size increase on the DB should also be investigated here. How much
did your DB increase in size when the indexes were added?

 -Ben

 On 10/10/07, Ralf S. Engelschall [EMAIL PROTECTED] wrote:
  Some Monotone operations really operate slower than what one would
  expect in the first spot. Hence, I've today looked at the run-time of
  a simple mtn update in a workspace which *is already* at h:n.v.m.
  This no-operation command internally performs a dozend times the
  following SQL queries:
 
SELECT id, name, value, keypair, signature
   FROM revision_certs WHERE id = ? AND name = ? AND value = ?
SELECT keydata FROM public_keys WHERE id = ?
SELECT id FROM public_keys WHERE id = ?
 
  The problem is that revision_certs and public_keys have not the
  proper indices for those queries and hence full-table scans seem to
  be performed. I did a quick test and added the following to indices
  manually:
 
CREATE INDEX revision_certs__id_name_value ON
 revision_certs (id, name, value);
CREATE INDEX public_keys__id ON
 public_keys (id);
 
  This dropped down the total execution time of the mentioned mtn update
  command by over 80%! A time mtn update showed 0.450s on average before
  and 0.080s on average afterwards. And this was really not any type of
  in-depth analysis of the situation. I just created two obvious indices
  for the most prominent queries which mtn --debug update showed me.
 
  What do we think? Should we investigate further and especially add
  additional indices like the above to the Monotone database schema? Or is
  there consensus that this type of speed optimization is just the root of
  furthcoming evil and at least at this time should be still ignored at
  all...
 Ralf S. Engelschall
 [EMAIL PROTECTED]
 www.engelschall.com
 
 
 
  ___
  Monotone-devel mailing list
  Monotone-devel@nongnu.org
  http://lists.nongnu.org/mailman/listinfo/monotone-devel
 


 --
 ---
 Ben Walton [EMAIL PROTECTED]

 When one person suffers from a delusion, it is called insanity. When
 many people suffer from a delusion it is called Religion.
 Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance

 ---


 ___
 Monotone-devel mailing list
 Monotone-devel@nongnu.org
 http://lists.nongnu.org/mailman/listinfo/monotone-devel



-- 
Justin Patrin


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-10 Thread Chad Walstrom
Ralph wrote:
   CREATE INDEX revision_certs__id_name_value ON
revision_certs (id, name, value);
   CREATE INDEX public_keys__id ON
public_keys (id);

 This dropped down the total execution time of the mentioned mtn update
 command by over 80%!

Ben wrote:
 Indexes speed up read operations but slow down writes.

I can't imagine a lot of writes happening to public_keys. ;-)
revision_certs would get four or more inserts per commit, and obviously
sync operations would add a bunch.  Commits have generally been pretty
fast for me.  Would there be a way to tell sqlite to ignore indices for
given operations, such as pulls?

Chad


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Monotone speedup by adding additional database indices?

2007-10-10 Thread Bruce Stephens
Chad Walstrom [EMAIL PROTECTED] writes:

[...]

 Would there be a way to tell sqlite to ignore indices for given
 operations, such as pulls?

That strikes me as a low-level question that should be ignored (at
least unless it causes some measurable problem).

One would hope that SQLite will only (or mostly, anyway) use indexes
when they'll be beneficial.  Index updates strike me as more likely to
be a problem, but I doubt it makes sense to suggest sometimes not
updating indexes.


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel