"john s wolter" <[EMAIL PROTECTED]> wrote:
> I have an old Clipper DBF type application....  Clipper DBF file 
> indexes are a simple key-value and a pointer into a DBF file's 
> records.  That is why these lookups are so fast....
> 
> I am of the opinion that full SQL engine's set based extractions are too
> slow ....
> 
> I am wondering if SQLite has features that could be used to make these
> lookups just as quick?  Can you point me, so to speak, in the correct
> direction?
> 

Why do you think a full SQL engine will be too slow?  Under 
the hood, every SQL database engine I know of (including 
SQLite) uses simple key/value btrees just like Clipper DBF.  
They just hide the ugly details from the programmer. Working 
with a key/value database (like Berkeley DB for example) 
compared to working in SQL is like the difference between
programming in assembly language and in a high-level language.
The level of abstraction is higher, but the same CPU is used 
to run them both.
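To see that btree machinery at work, you can ask SQLite for its query plan. The sketch below (my illustration, not part of the original post, using Python's sqlite3 module) shows that an equality lookup on the indexed column is resolved by an index search, exactly the kind of keyed btree probe a Clipper DBF index performs:

```python
import sqlite3

# Build the same kind of key/value table used later in this post.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a, b, UNIQUE(a,b))")

# EXPLAIN QUERY PLAN reports how SQLite will execute the statement.
plan = db.execute(
    "EXPLAIN QUERY PLAN SELECT b FROM t1 WHERE a='somekey'"
).fetchall()
for row in plan:
    # The detail string typically reads something like
    # "SEARCH t1 USING COVERING INDEX sqlite_autoindex_t1_1 (a=?)"
    print(row[-1])
```

The exact wording of the plan varies between SQLite versions, but the point is the same: the SQL layer compiles down to a keyed index search.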

As a performance test, I created a "key/value" table in SQLite
and filled it with 100000 rows of random data.  20MB of data
roughly.  The table is:

   CREATE TABLE t1(a, b, UNIQUE(a,b));

I filled it with 100000 iterations of this:

   INSERT INTO t1 VALUES(hex(randomblob(10)), hex(randomblob(30)));

So each key was a random 20-character string and each value was a
random 60-character string.  A real application would contain more
useful data, obviously.  But this seems like a good test set for
performance.
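The setup above can be reproduced as follows. This is a sketch using Python's sqlite3 module (an assumption; the original numbers were presumably taken with the SQLite command-line shell), with a smaller row count and an in-memory database so it runs quickly:

```python
import sqlite3

N_ROWS = 10000  # the original test used 100000 rows (~20MB)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a, b, UNIQUE(a,b))")

# Insert N_ROWS rows of random data, exactly as in the post:
# hex(randomblob(10)) -> a 20-character key,
# hex(randomblob(30)) -> a 60-character value.
db.executemany(
    "INSERT INTO t1 VALUES(hex(randomblob(10)), hex(randomblob(30)))",
    [()] * N_ROWS,
)
db.commit()

a, b = db.execute("SELECT a, b FROM t1 LIMIT 1").fetchone()
print(len(a), len(b))  # 20 60
```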

To measure performance, I did this:

   SELECT b FROM t1 WHERE a=(SELECT hex(randomblob(10)));

This simulates a lookup of a single value from the table.  Actually,
the lookup probably missed each time.  But the time needed to do a
failed lookup and a successful lookup is about the same, so I don't
much care.

On 1000 iterations the average time to do a lookup was 38 microseconds.
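A version of that measurement can be sketched like this, again assuming Python's sqlite3 module and a smaller in-memory table, so the absolute figure will not match the 38 microseconds quoted above:

```python
import sqlite3
import time

# Rebuild the test table (smaller than the original 100000 rows).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a, b, UNIQUE(a,b))")
db.executemany(
    "INSERT INTO t1 VALUES(hex(randomblob(10)), hex(randomblob(30)))",
    [()] * 10000,
)

# Time 1000 single-key lookups against random keys.
start = time.perf_counter()
for _ in range(1000):
    db.execute(
        "SELECT b FROM t1 WHERE a=(SELECT hex(randomblob(10)))"
    ).fetchall()
elapsed = time.perf_counter() - start
print(f"average lookup: {elapsed / 1000 * 1e6:.1f} microseconds")
```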

To take the test further, consider the following query:

   SELECT b FROM t1 WHERE a>=(SELECT hex(randomblob(10))) LIMIT 10;

In this case we are not looking up a single element, but 10 elements
that are near a randomly chosen value.  Ten rows of results are
nearly always returned, even if the key never matches exactly.

In 1000 iterations the average time per query was 96 microseconds.
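The range-lookup timing can be sketched the same way (same assumptions as before: Python's sqlite3 module, a smaller in-memory table, so the 96-microsecond figure will not be reproduced exactly):

```python
import sqlite3
import time

# Rebuild the test table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t1(a, b, UNIQUE(a,b))")
db.executemany(
    "INSERT INTO t1 VALUES(hex(randomblob(10)), hex(randomblob(30)))",
    [()] * 10000,
)

# Time 1000 range queries, each fetching up to 10 rows at or after
# a random key.  The UNIQUE(a,b) index serves as a covering index.
start = time.perf_counter()
for _ in range(1000):
    rows = db.execute(
        "SELECT b FROM t1 WHERE a>=(SELECT hex(randomblob(10))) LIMIT 10"
    ).fetchall()
elapsed = time.perf_counter() - start
print(f"average range query: {elapsed / 1000 * 1e6:.1f} microseconds")
print(len(rows))  # nearly always 10; fewer only near the end of the index
```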

So we are talking in excess of 10000 queries per second on a 20MB
database.  Performance will fall off as the logarithm of the size
of the database file, of course.  And it will be much slower with
a cold cache due to disk latency.  But how many queries per second
do you really need in order to keep a service desk operator happy?

--
D. Richard Hipp <[EMAIL PROTECTED]>

