I have been able to do some more investigation here. What I was seeing
originally made me suspicious that a writer was skipping over parts of the
WAL file during the time that a checkpoint is unable to proceed due to a
reader. After further investigation, however, I have determined that this is
NOT the case in my scenario.

What I am seeing may be of interest, however, so I will explain what I have
determined is going on in my scenario.

This is a stress test scenario. I am inserting rows as quick as my C code
allows. I am writing an average of around 80 rows a second. This writing is
going on constantly.  A checkpoint occurs / wants to occur every few seconds
on average with this amount of activity and the size of the rows.

Then every 30 seconds from another process, I do a read query of the last
10,000 rows and I have it such that this query takes roughly 15 seconds (I
put some delay in the processing to mimic a longer operation such as from
ruby code from a web server). This read process opens a db connection, does
the query, closes the connection, and exits the process.

The WAL file grows just about every time the read process kicks in. It does
not grow at all in between invocations of the read process.

What I have determined (via using sqlite3_wal_hook, catching some statistics
in the callback as well as calling sqlite3_wal_checkpoint from the callback
myself) is that each time I am writing during a read operation (so a
checkpoint can't complete), I am able to write more records than the last
time on average during this period. So, more records written during the time
a checkpoint can't happen is going to relate to the WAL file needing to
grow.

I am periodically doing the same type of read query each time I do one and
it is taking about the same amount of time as measured in my calling
process. I really don't see an increase in the time it takes to do the reads
(for example in my last run reads took 18sec, 16sec, 16sec, 16sec, 16sec,
15sec, 16sec, 17sec, 16sec, 16sec, 14sec). However, I am measuring more
writes being able to complete from my writer during these times
progressively. The current read query is reading the last 10,000 rows based
on the internal rowid primary key.  Since I am getting more writes every
time during this period, that is why the WAL file grows every time during
this period.

I am not sure why more writes are progressively able to occur each time
these reads are happening, but this is the cause of my continued WAL file
growth in this scenario.

In a production system, I would not expect to be constantly pounding so many
events into the system non stop all the time.   Still, I would like to be
able to handle the stress case so that I know things are solid.

I can also adjust the threshold to checkpoint in larger chunks than the
default. This actually won't solve this (and will make the standard size WAL
file larger), but I do think it can  reduce the impacts of the phenomenon in
my scenario.

Is this type of scenario just not right for WAL mode?

Best Regards,

Bob
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to