On 20 September 2016 at 14:29, Simon Slavin <slav...@bigfraud.org> wrote:
> On 20 Sep 2016, at 8:59am, Jose Arroyo <jose.m.arroyo...@gmail.com> wrote:
> > My current issue is that I'm encountering WAL checkpoint starvation and
> > trying to figure out a way to create checkpoint gaps only when needed.
> Can I ask why you're trying to have as few checkpoints as possible ? The
> fewer you have, the more time they take, the more annoying it is to your
> users, the more processes reach their timeout times.
> I can understand that there are times when you want as few checkpoints as
> possible, but if you explain why then we may be able to suggest a strategy
> which doesn't involve hacking SQLite and becoming dependent on a particular
> version behaving a specific way.
> For instance, you could use "PRAGMA wal_autocheckpoint=N" and use a
> smaller value for N than 1000. That sort-of does what you asked since it
> will automatically fire when the WAL journal reaches a certain size (of
> unwritten data).
> Another common strategy is that every process which does writing has a
> checkpoint every 1000 transactions. Or 100. or 10. Whatever suits.
> Again, it is a system which takes care of itself and will not stop working
> if a new version of SQLite handles WAL files differently.
> sqlite-users mailing list
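(For anyone following along, Simon's wal_autocheckpoint suggestion is a one-liner. A minimal sketch in Python's sqlite3 module; the file name and the value of 200 are picked just for illustration:)

```python
import sqlite3

conn = sqlite3.connect("example.db")  # path assumed for illustration
# WAL mode is persistent once set on the database file.
conn.execute("PRAGMA journal_mode=WAL")
# Fire the automatic checkpoint once the WAL holds ~200 pages instead of
# the default 1000:
conn.execute("PRAGMA wal_autocheckpoint=200")
conn.close()
```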
Sorry, I used the wrong word: I meant "reader gaps", not "checkpoint gaps".
I don't want to minimize the number of checkpoints. What I want is a way to
force a successful checkpoint if the WAL starts growing unbounded, by
leaving a "reader gap", without the writer process itself being the one
responsible for creating that gap.
My problem is that, given how the writer and reader processes interact,
there is a certain DB size threshold beyond which long periods of
checkpoint starvation occur. The writer process (there is only one in my
scenario) actually implements its own WAL checkpointing mechanism: it tries
to checkpoint in a separate thread when wal_pages > 1000, and checkpoints
synchronously, blocking all queued updates, if ever wal_pages > 10000.
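(A sketch of that two-tier policy, in case it helps to make it concrete; the limits, names, and locking are assumptions standing in for the real implementation:)

```python
import sqlite3
import threading

SOFT_LIMIT = 1000    # frames: try a best-effort background checkpoint
HARD_LIMIT = 10000   # frames: block queued updates and force a checkpoint

def checkpoint_if_needed(db_path, wal_frames, write_lock):
    if wal_frames > HARD_LIMIT:
        # Synchronous path: hold the write lock so queued updates wait.
        with write_lock:
            conn = sqlite3.connect(db_path)
            busy, log, done = conn.execute(
                "PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
            conn.close()
            return busy == 0   # True only if the checkpoint fully completed
    elif wal_frames > SOFT_LIMIT:
        # Background path: PASSIVE never blocks readers or writers.
        def passive():
            conn = sqlite3.connect(db_path)
            conn.execute("PRAGMA wal_checkpoint(PASSIVE)")
            conn.close()
        threading.Thread(target=passive, daemon=True).start()
        return None
```

Note that even the TRUNCATE branch can come back with busy != 0 when a reader holds a read transaction open, which is exactly the starvation I'm describing below.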
However, this writer process has no control over the reader processes, so
checkpoints may not necessarily complete successfully ("If another
connection has a read transaction open, then the checkpoint cannot reset
the WAL file because doing so might delete content out from under the
reader"). A large WAL means slow queries, which means fewer gaps in which a
checkpoint could finish successfully, so this issue is a bit of a vicious
circle.
I can't modify the writer process too much, and without a busy_handler that
notifies all readers to pause for a while, any wal_checkpoints that block
readers (FULL, RESTART, TRUNCATE) just end up timing out during these
"checkpoint starvation" periods (this is linked to the SPARQL layer on top
of SQLite that I'm working with, I think).
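(To illustrate the mechanism I wish I had, here is a hypothetical in-process version: a shared flag that a forced checkpoint raises to ask readers for a gap. My readers are separate processes, so a real version would need cross-process signalling, and none of these names exist in the actual system:)

```python
import sqlite3
import threading
import time

readers_should_pause = threading.Event()

def reader_query(conn, sql, params=()):
    # Readers check the flag between short read transactions, leaving gaps
    # in which a blocking checkpoint can finish.
    while readers_should_pause.is_set():
        time.sleep(0.05)
    return conn.execute(sql, params).fetchall()

def forced_checkpoint(conn):
    readers_should_pause.set()       # ask readers to back off
    try:
        # RESTART waits for readers to finish, then resets the WAL.
        return conn.execute("PRAGMA wal_checkpoint(RESTART)").fetchone()
    finally:
        readers_should_pause.clear()
```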
So that's why I'm looking for a way of detecting moments of unusual WAL
growth and notifying all readers, so that they leave just enough time for a
checkpoint to complete, while avoiding doing this unless absolutely
necessary. If the writer process does TRUNCATE checkpoints after a certain
WAL size, monitoring the WAL file size directly, as Paul mentioned, might
be a good approach.
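(A sketch of that "monitor the -wal file size directly" idea; the size limit and the notify_readers callback are assumptions standing in for whatever IPC the real system would use:)

```python
import os
import threading

WAL_SIZE_LIMIT = 16 * 1024 * 1024   # 16 MiB; tune to taste

def wal_too_big(db_path):
    # The WAL lives next to the database file with a "-wal" suffix.
    wal = db_path + "-wal"
    return os.path.exists(wal) and os.path.getsize(wal) > WAL_SIZE_LIMIT

def monitor_wal(db_path, notify_readers, stop, interval=1.0):
    # Poll from outside the writer process; ask readers for a gap only
    # when the WAL actually looks unhealthy.
    while not stop.is_set():
        if wal_too_big(db_path):
            notify_readers()
        stop.wait(interval)
```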