Julien said in the original post:
> I've been doing some test with a ~100 MB database

on a real fast physical disk it would take < 1 sec to read
the entire database. On a slow drive like on my laptop,
it takes 3 sec, and on my external USB drive it takes 5 sec.

> Now the first time I run my application, it takes quite a long time
> (30s) to load the blobs.

This means we are at a transfer speed of 3Mb/s. This indicates that
some seeks are involved that slow down the process.

The seek time of a flash drive is a at least 4 orders of magnitude
faster than the best mechanical disk. If you have a really fast
mechanical disk with a seek time of 1 ms, this means you can do
1000 seeks per second.

Now, let's assume we have a flash drive with 30Mb/s transfer rate and
<1 ns seek time, the operation would take 3 seconds.

> I tried to VACUUM the DB, it only reduced the timings to 16s

This indicates that there is still quite some time lost with seeks.

I think there is a general misconception on what the performance
problems of physical disk drives are. Transfer speed is often not
the problem, but seeks are. Databases like sqlite are optimized
under the assumption that it is much better to do a few table
lookups to avoid disk IO. With a sqlite database, there is no
way to read the entire database in physical order. You always
reed the table in ROW_ID order (if you are lucky you might have
filled the database in key order or you have vacuumed it). However,
if you have some more complex queries, the database engine happily
seeks around on the disk.

Some simple math: let's assume we have a 100mb database, and a
drive with 30mb transfer rate and 10ms seek. Then (on a cold
database) reading the entire database is much faster than
1000 seeks. However, once the data is in (cache) memory the
rules change. It makes sense to do some index lookups to
access the data. So, ideally, the operating system disk
cache should be disabled for the database files and the
database engine should have full control over the file
and the disk cache. The database engine would need
some knowledge about the attached disk (it's seek time and
transfer rate) to make good decisions. Then the database engine
could really optimize the database access depending on
where the data is: in cache memory or on the physical disk.

And, yes, write speed is a problem with solid state drives....

Michael

John Stanton wrote:
Michael Scharf wrote:
[EMAIL PROTECTED] wrote:
"Trevor Talbot" <[EMAIL PROTECTED]> wrote:
Beyond that, I'm not aware of anything that would help.


All good advice.  But you left off the obvious:  Get a
faster disk drive.  ;-)

...which does not really help unless you buy a very expensive
flash disk drive. How much faster is a *really* fast spinning disk?

Trevo, have you tried to put your database on a (fast!) USB stick.
It should be much faster in 'seeking' but is slower in the
data transfer. This would give some indication if the access
is limited by seek or the disk reading speed.


Michael

A USB flash drive is not particularly fast due to the limited write speed of flash memory and buss speed. A fast disk spins at 15,000 rpm, double the speed of the higher end 7,500 rpm disks and almost 3 times the speed of the regular 5,400 rpm devices.

If you want to simulate a disk with no latency set up a RAM drive.

There is a physical constraint here. If you want to verify that your data is safely written to non-volatile storage you have to live with the latency. If that is unimportant to you you can relax the ACID requirements and get faster writes, but when you do that there is no crying over lost data after a crash.

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------



--
http://MichaelScharf.blogspot.com/


-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to