On Tue, 2003-11-04 at 17:12, D. Richard Hipp wrote:
> I'm not sure what you or Mrs.Brisby mean by "packing".

My apologies then; my statement was made before espresso so I don't
doubt I was making much sense.

Here goes:

Because MySQL stores the literal values of integers as a four octet
value the number of pages that an index would cover could be drastically
smaller-- even if otherwise using the exact same format (on a factor of
4:13) -- I didn't even bother going into possibly faster comparisons as
that's even more obvious.

The result? More values per page so less pages, so less work for the
block cache of the operating system. Index skips can be made faster as
well.

I felt all of this was quite obvious, and shouldn't surprise anyone to
see a 10x speed improvement for something like this. As noted in prior
messages, you'll remember I'm not exactly in favor of giving up
typelessness (even if I'd like automatic data coercion using the table
"field type").

I am more interested in talking about the second part- re-engineering
indexing.

MySQL has stated in-documentation that it uses a B-tree for it's index.
I think this is a mistake- especially for larger indexes. Using several
B-trees attached to a hash-table is much faster (if the hash is fast or
your data is uniform).

I have an email client that uses a rather large index for searching
email- Moving from a straight btree to multiple btrees attached to
hashtables (actually: I used db4's Btree mode in perl and used multiple
files- one for each bucket) I was able to achieve quite a significant
performance increase (on the order of 3-5x). I found 7 buckets to be
optimum although I did not bother to find a function that would describe
what was optimal...

All in all, it took about an hour of perl hacking to build it out, test
it, and start using it in-place of what I was presently using.

Unfortunately, doing this for sqlite requires a rather major change to
the indexing system- and results in incompatible file formats (or at the
very least, unusable indexes on older sqlite builds)-- which is why when
you mentioned the possibility for 3.0.0 I decided to bring it up :)

[oh and BTW at the bottom of section 4.2 I think you meant
lexicographical, not "lexigraphical"]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to