We're seeing a lot more disk activity than expected on Linux when using
sqlite3. We've run this same series of tests on Windows and the disk I/O
is much lower, which is the opposite of what I expected. Below is my
scenario; perhaps someone can point out what I can do to fix this
problem.

We are using SQLite version 3.5.7.

Shared cache is enabled at program startup:
sqlite3_enable_shared_cache(1);

We create a new database every hour. This database has 2 indexes and a ton
of columns. The maximum size of each row is about 3,400 bytes, and a lot
of that is UTF-8 strings. The database was actually pre-created on a
Windows machine and is installed with the product. The stored database
template file is then copied over the live database path instead of
actually "creating" the database every hour; the copy is essentially the
sketch below.


The database is opened in read/write mode and we issue the following
pragmas after opening:
PRAGMA read_uncommitted=ON
PRAGMA synchronous=OFF

I also call:
sqlite3_busy_timeout(db, TM_DB_BUSY_TIMEOUT);  // 30,000 ms busy timeout
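In outline, the open sequence looks like this (a condensed sketch of our
real code; the path is a stand-in and the return-code checks are trimmed):

#include <sqlite3.h>

#define TM_DB_BUSY_TIMEOUT 30000  /* 30,000 ms, as noted above */

sqlite3 *db = NULL;

/* Open the current hour's database read/write (rc checks omitted). */
sqlite3_open_v2("/data/current_hour.db", &db, SQLITE_OPEN_READWRITE, NULL);

/* Pragmas issued immediately after opening. */
sqlite3_exec(db, "PRAGMA read_uncommitted=ON;", NULL, NULL, NULL);
sqlite3_exec(db, "PRAGMA synchronous=OFF;", NULL, NULL, NULL);

/* 30-second busy timeout. */
sqlite3_busy_timeout(db, TM_DB_BUSY_TIMEOUT);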


We are then attempting to insert about 278 rows per second over the course
of an hour. We commit a batch every 8,192 rows or every 5 minutes,
whichever comes first; at this insert rate that works out to a commit
roughly every 30-40 seconds. The insert loop is shaped like the sketch
below.
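Here is the shape of the insert loop (a simplified sketch; the samples
table, have_data(), and bind_next_row() are hypothetical stand-ins for our
real schema and queue code):

#include <sqlite3.h>
#include <time.h>

extern int have_data(void);                 /* hypothetical queue check */
extern void bind_next_row(sqlite3_stmt *);  /* hypothetical bind helper */

void insert_loop(sqlite3 *db)
{
    sqlite3_stmt *stmt;
    int rows_in_batch = 0;
    time_t batch_start = time(NULL);

    sqlite3_prepare_v2(db, "INSERT INTO samples VALUES (?1, ?2)",
                       -1, &stmt, NULL);
    sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);

    while (have_data()) {
        bind_next_row(stmt);
        sqlite3_step(stmt);
        sqlite3_reset(stmt);

        /* Commit every 8,192 rows or every 5 minutes, whichever first. */
        if (++rows_in_batch >= 8192 || time(NULL) - batch_start >= 300) {
            sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);
            sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);
            rows_in_batch = 0;
            batch_start = time(NULL);
        }
    }
    sqlite3_exec(db, "COMMIT;", NULL, NULL, NULL);
    sqlite3_finalize(stmt);
}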

The database grows from its initial creation size of about 11,000 bytes to
about 1.5 gigabytes over the course of the hour. As the database size
increases we see significant increases in CPU iowait times that eventually
slow the insertion thread enough that our reception queue fills up and we
have to throw out data to prevent a queue overflow. The storage is two
disks in a RAID 0 configuration.

On Windows we had a similar problem, but we fixed it by issuing PRAGMA
synchronous=OFF to speed up the disk I/O. A corrupt database is a problem,
but performance is very important and we can tolerate a small chance of
database corruption.

I've tried everything I can think of to reduce the iowait times and am at
a loss. I've lowered the batch size and even tried PRAGMA
journal_mode=MEMORY, but that didn't seem to have any effect that I could
determine.
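One thing I haven't fully confirmed is whether the journal_mode pragma
actually took effect on this build. As I understand it, the pragma returns
a row naming the mode actually in use, so something like this should
verify it (a sketch, untested on 3.5.7):

#include <stdio.h>
#include <sqlite3.h>

void check_journal_mode(sqlite3 *db)
{
    sqlite3_stmt *jm;

    /* PRAGMA journal_mode returns one row with the mode now in effect,
     * so the result shows whether MEMORY actually took. */
    sqlite3_prepare_v2(db, "PRAGMA journal_mode=MEMORY;", -1, &jm, NULL);
    if (sqlite3_step(jm) == SQLITE_ROW)
        printf("journal_mode is now: %s\n",
               (const char *)sqlite3_column_text(jm, 0));
    sqlite3_finalize(jm);
}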


If I watch the database directory as the test runs, both the database file
and the journal file are being written to constantly. Anyone have any
ideas? This problem is significantly limiting the scale of our product and
I need to find a solution. We are pretty much hitting a wall around 277
records per second, and we need to be closer to 500-1,000 records per
second if possible.


Bret Patterson
