Stephen Toney wrote:
On Tue, 2007-03-27 at 11:53 -0600, Dennis Cote wrote:
Stephen Toney wrote:
Meta-question: this is the second time I've asked this question. The
first was about a month ago and got not a single reply. Is there
something wrong with my postings? Or is this just not an interesting
topic?
Stephen,
There is nothing wrong with your question. In fact it was very clearly
stated.
I just don't have any answers for you. I suspect that others are in the
same boat.
It is normal for an index creation operation to take some time since it
is inserting index records into a btree in random order. It involves
many updates to pages throughout the index.
It does seem strange that you are seeing such different times for the
two cases (single column vs compound index). Are you sure about the
times you posted? Were they indexing the exact same table? Were both the
indexes created after the fill operation during your timing tests (ie.
fill + index(single) and then fill + index(compound)) and not one after
the other (i.e. fill + index(single) + index(compound))?
Thanks, Dennis,
I feel reasonably confident about my numbers, but since the system is in
development, other factors may have changed. I would re-test before
asking anyone else to try to replicate this.
The timings were done by recreating the db content each time with no
indexes, then building either the multi-column index or the two
single-column indexes.
Can you supply sample data if someone wants to try some test of their
own? It wouldn't have to be the full data set you are using. We could
use a subset to get relative timings in the seconds range rather than
minutes (This assumes that you are not running into some cache size
problems that slow down the larger data set disproportionately).
Yes, I'd be glad to supply sample data, but do not have an FTP site. I'm
not sure how big the sample should be. Maybe I should test some samples
myself before asking anyone else to. As you say, the problem may not
exist at a smaller size.
Another reason for my puzzlement -- although I love SQLite, my
expectations are based on using Foxpro for many years. Foxpro's indexing
speed for a problem like this is about 10 - 20 times faster. And I've
never come across a Foxpro database where the indexing took longer than
the loading -- and Foxpro is blazingly fast at loading. So I assumed
that every DBMS would be faster at indexing than loading. (Both use
B-trees for indexes, so I believe it's a meaningful comparison. But
maybe in this case the single-file architecture of SQLite works against
it; Foxpro uses a binary format for its B-trees.)
In other words, since SQLite is so fast at some things, I expect it to
be fast at all things. Is this unreasonable? Is it optimized for fast
retrieval and not indexing?
Thanks!
You are comparing an ACID RDBMS with rollback and commit with a much
simpler situation. For example we developed a data storage software
product which was of the same generation as Foxpro. I wrote a fast
indexing program which would create a 10 million entry B-Tree index in
less than a minute on a very slow machine, but it had minimal features,
unlike Sqlite.
I assume that if Foxpro had all the features you want now you would not
be changing from it.
Have you thought of doing the index creation as a background process
unseen by the user?
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------