[sqlite] Sharing some speed-test results BLOB vs flat-file... comments?

Pero Mirko Mon, 05 Sep 2011 08:02:24 -0700

Here are some speed tests for those interested. My main concern was speed of
reading rows from database.


The comparison was measured by QueryPerformanceCounter which is most precise
AFAIK on Windows. A couple of hundred runs for each test to eliminate
statistical error.

In tests there were 3 SQLite tables and one flat-file with binary data. The
measurement was how to place this data most effectively so it can be
retrieved as quickly as possible.

Flat-file had a simple | LENGTH | DATA | binary data in it. Seek to offset
position, read LENGTH then read binary DATA. As easy as that.

Database1 had 16 fields (some INT64, some TEXT with small amount of text
data, some REAL). Last field was 64bit offset in flat-file.
Database2 had also 16 fields except the very last field instead of offset
was a BLOB with actual binary data (about 5-50 kb of data).
Database3 was only ID + BLOB fields (2 fields). In other words - a
replacement for flat-file.

No indexes in any database file.

Here are the tests:

Test 1:
- SELECT * FROM db1 ORDER BY RANDOM() LIMIT 1
- use offset field to seek to file position
- read binary data len from file and read binary data

Test 2:
- SELECT * FROM db2 ORDER BY RANDOM() LIMIT 1
- read binary data from BLOB field in db2 already in result set

Test 3:
- SELECT [col1],[col2]... [except blob column] FROM db2 ORDER BY RANDOM()
LIMIT 1
- just measure speed of reading

Test 4:
- SELECT * FROM db1 ORDER BY RANDOM() LIMIT 1
- SELECT [blob_column] FROM db3 WHERE id = [col_id]
- use blob from database3 field

Test 5:
- SELECT [col1],[col2]... [except blob column] FROM db2 ORDER BY RANDOM()
LIMIT 1
- SELECT [blob_column] FROM db2 WHERE id=[col_id];
- use blob from second read

Results:
Test1:
237.216 - 100%

Test2:
7562.158 - 3187%

Test3:
320.832 - 135%

Test4:
246.821 - 104%

Test5:
388.566 - 163%

Comments -
test1 was the fastest. Reading a row from DB and then seeking to flat-file
and reading binary data is the fastest. no surprises there.

test4 came very close. it is quite fast to put BLOB in another database file

and then just read it based on ID. so it may be worth to actually put BLOB
in separate database file to be able to read it much faster than in the same

database file.

test5 was still much much faster than test2. although you'd expect that
reading BLOB along with the rest of fields is the same speed or even
faster - it was VERY VERY SLOW. to verify - I made a couple of queries in
SQLite admin tools - they also had about 2000ms per query which usually took

60-100ms to finish. strange?

test3 - it is obviously much faster to avoid putting BLOBs in same database
file even if not reading them - even when skipped reading speed is about 35%

slower than initial database file

So the conclusion may be:

- where speed is of concern - put BLOB data in flat-file and put just an
offset in database file where to seek to read BLOB data

- if you want to write safely, the best option is to put BLOB in another
database and use ID from first database to read from another

- BLOB data should be avoided along with the rest of smaller row data as it
seems to slow things down considerably, the smaller the row and the simpler
data structure - the better.

Your comments? Why is test2 so much slower than the rest of the bunch?
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sharing some speed-test results BLOB vs flat-file... comments?

Reply via email to