> So, if you have a large-scale system, you have two choices: block
> during the checkpoint, or suffer from a huge WAL file (and a crash if
> the file grows too big).

For a large-scale system you have a third choice: use some other RDBMS
which is implemented in one process and has much better control over
its data and much better communication between readers and writers.


Pavel

On Thu, Nov 25, 2010 at 4:07 AM, Yoni <[email protected]> wrote:
> Hi,
>
>  > In which case the writer would have to block on the reader,
>  > which is something that the current system doesn't suffer from.
> I agree when we are talking about small systems that do not
> read/write a lot. But on a larger-scale system, which has several
> threads reading and writing to the DB non-stop, there will always be
> at least one reader blocking the checkpoint from completing, thus
> making the WAL file grow forever (unless, of course, you block during
> the checkpoint - but then you make your system unresponsive).
>
> So, if you have a large-scale system, you have two choices: block
> during the checkpoint, or suffer from a huge WAL file (and a crash if
> the file grows too big).
>
>  > You can never be sure when is the right time to wrap around.
>  > If you wrap around too early, you run the risk of hitting the wall
>  > put up by the earliest reader, who is still using the end of the
>  > WAL file.
> If we choose a large enough size to keep free at the beginning of the
> file (it can be proportional to the WAL file size, or just a
> constant), the chances that this will happen are very low. And even
> when it does happen, the writer won't have to block on the reader; it
> just has to force a checkpoint (while the reader is reading a page)
> and wait until the reader finishes reading that specific page (not
> the whole read transaction), since the next page it needs to read is
> already in the main DB file.
>
>  > The only solution around this would be to have more than 1 WAL file.
>  > To avoid hitting this same problem with multiple WAL files, you'd
>  > need to support an arbitrary N number of WAL files.
> I think my solution would work too. PostgreSQL has already addressed
> the question of how many WAL files should exist; see here:
> http://www.sql.org/sql-database/postgresql/manual/wal-configuration.html.
>
>  > On one system of mine, where blocking is an issue, I
>  > buffer up the write messages, and flush them on a background thread.
>  > Of course this may not be practical for you...
> I think this kind of solution could work for me too (I don't want to
> block the main thread EVER, and I have a steady stream of writes to
> the DB). Can you please elaborate on how you implemented your
> solution?
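>
> (For concreteness, such a scheme might look roughly like this - an
> untested sketch, all names made up; the main thread queues SQL text
> and a background thread drains the queue into one transaction:)
>
>   /* untested sketch of a buffered-write scheme */
>   #include <pthread.h>
>   #include <sqlite3.h>
>   #include <stdlib.h>
>   #include <string.h>
>
>   #define QMAX 1024
>
>   static char *queue[QMAX];
>   static int qlen;
>   static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
>   static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
>
>   /* called by the main thread; never touches the database */
>   void enqueue_write(const char *sql)
>   {
>       pthread_mutex_lock(&mu);
>       if (qlen < QMAX) queue[qlen++] = strdup(sql); /* else drop/block */
>       pthread_cond_signal(&cv);
>       pthread_mutex_unlock(&mu);
>   }
>
>   /* background thread: drains the queue using its own connection */
>   void *flusher(void *arg)
>   {
>       sqlite3 *db = arg;
>       for (;;) {
>           char *batch[QMAX];
>           int i, n;
>           pthread_mutex_lock(&mu);
>           while (qlen == 0) pthread_cond_wait(&cv, &mu);
>           n = qlen;
>           memcpy(batch, queue, n * sizeof(char *));
>           qlen = 0;
>           pthread_mutex_unlock(&mu);
>           sqlite3_exec(db, "BEGIN", 0, 0, 0);
>           for (i = 0; i < n; i++) {
>               sqlite3_exec(db, batch[i], 0, 0, 0);
>               free(batch[i]);
>           }
>           sqlite3_exec(db, "COMMIT", 0, 0, 0);
>       }
>       return 0;
>   }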
>
> Yoni.
>
>
> On 25/11/2010 8:42 AM, Ben Harper wrote:
>> A cyclical WAL file has this problem:
>>
>> You can never be sure when is the right time to wrap around.
>> If you wrap around too early, you run the risk of hitting the wall
>> put up by the earliest reader, who is still using the end of the
>> WAL file. In which case the writer would have to block on the reader,
>> which is something that the current system doesn't suffer from.
>> My guess is that a very high proportion of use cases would never
>> suffer this issue, provided they set an appropriate wrap-around
>> size, but this does place the burden of picking the right
>> heuristics on the implementer.
>>
>> It is a pathological case, but imagine an implementer never tests
>> for this wraparound issue, and builds a deadlock into his app
>> that arises when a writer blocks on a reader. This is the kind of
>> unexpected behaviour that could really bite you.
>>
>> The only solution around this would be to have more than 1 WAL file.
>> To avoid hitting this same problem with multiple WAL files, you'd
>> need to support an arbitrary N number of WAL files. And that,
>> undoubtedly, is a more complex implementation than what currently
>> exists. I imagine it's quite a task for the SQLite developers to
>> get 100% branch test coverage.
>>
>> On one system of mine, where blocking is an issue, I
>> buffer up the write messages, and flush them on a background thread.
>> Of course this may not be practical for you...
>>
>> Ben
>>
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Yoni
>> Sent: 24 November 2010 04:46 PM
>> To: [email protected]
>> Subject: [sqlite] WAL file size
>>
>> Hi,
>>
>> I am currently developing a system which has the following requirements:
>> 1. It should be very fast and responsive.
>> 2. It should run forever (or at least a very long time) without restart.
>> 3. It has a steady stream of writes to the DB.
>>
>> Currently I am using SQLite with one connection, and I run a WAL
>> checkpoint every once in a while (e.g. after 5 minutes, or after
>> 10000 inserts to the DB) - I don't use auto-checkpoint.
>> Since I run the checkpoint in the main thread, the system is
>> unresponsive for a long time (the checkpoint can take from 400ms to
>> 20000ms!).
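>>
>> (Simplified, the current scheme is roughly this - the exact trigger
>> thresholds are only illustrative:)
>>
>>   /* needs <sqlite3.h> and <time.h>; at startup the automatic
>>      checkpoint is disabled with sqlite3_wal_autocheckpoint(db, 0)
>>      (the same as PRAGMA wal_autocheckpoint=0)                     */
>>   static int inserts_since_ckpt;
>>   static time_t last_ckpt;
>>
>>   /* called from the main loop after each insert */
>>   static void maybe_checkpoint(sqlite3 *db)
>>   {
>>       if (++inserts_since_ckpt >= 10000 ||
>>           time(NULL) - last_ckpt >= 300) {
>>           sqlite3_wal_checkpoint(db, "main"); /* blocks the main thread */
>>           inserts_since_ckpt = 0;
>>           last_ckpt = time(NULL);
>>       }
>>   }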
>>
>> What I would like to do is run the checkpoint in another thread, so
>> that the main thread can keep working (and using SQLite), while the
>> long checkpoint work is done in the background.
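>>
>> (Roughly what I mean - an untested sketch; error handling omitted,
>> and the interval is only illustrative:)
>>
>>   #include <pthread.h>
>>   #include <sqlite3.h>
>>   #include <unistd.h>
>>
>>   /* background thread: its own connection to the same database file */
>>   static void *checkpointer(void *arg)
>>   {
>>       sqlite3 *db;
>>       sqlite3_open((const char *)arg, &db);
>>       for (;;) {
>>           sleep(5);                           /* illustrative interval */
>>           sqlite3_wal_checkpoint(db, "main"); /* main thread keeps going */
>>       }
>>       return 0;
>>   }
>>
>>   /* in main(): pthread_t t;
>>      pthread_create(&t, 0, checkpointer, "test.db"); */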
>>
>> The problem is that when I try to do this, the WAL file grows without
>> limit. I wrote a little program that does exactly this (inserts rows
>> as fast as C allows, and runs checkpoints in the background).
>> The result (after 30 seconds and 400K records): DB size 19MB and WAL
>> size 451MB! (Tried on Debian Linux with SQLite 3.7.2, and on Windows
>> XP with the same SQLite version.)
>>
>> I think the problem lies in the fact that during the checkpoint I
>> keep writing to the WAL file, so the checkpoint can never reach the
>> end of the file. And since the WAL file is not cyclic, it grows
>> forever, leaving most of the file (the beginning) unused and using
>> only the end of the file for real data.
>>
>> I found this thread,
>> http://www.mail-archive.com/[email protected]/msg56074.html,
>> discussing a similar problem, and also saw Dan's commit
>> (http://www.sqlite.org/src/info/72787c010c) approaching the same
>> problem, but not solving it, since it requires blocking all DB activity.
>>
>> I think this could be fixed if SQLite used a cyclic WAL file.
>> I found that PostgreSQL also ran into this, and uses multiple WAL
>> files to avoid this kind of problem
>> (http://www.sql.org/sql-database/postgresql/manual/wal-implementation.html).
>> But we do not need multiple files to do it; we just need to start
>> writing to the beginning of the WAL file once there is enough unused
>> space there (e.g. at least 16MB - configurable).
>> So the WAL file might look like this:
>>
>>                       logical end           logical start
>>                            |                    |
>>                            V                    V
>> ------------------------------------------------------------------
>> header | committed data  | unused space       | committed data   |==>
>> ------------------------------------------------------------------  |
>>          ^                                                            |
>>           ============================================================|
>>
>> where the data at the physical head of the file logically comes after
>> the data at the end of it.
>>
>> When the writing is too fast, the writer will eventually reach the
>> logical start of the file. At this point we can force a blocking
>> checkpoint, since this situation is very rare.
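>>
>> (In pseudo-code, the writer's decision might look roughly like this;
>> none of these names exist in SQLite, they are only illustrative:)
>>
>>   /* WRAP_HEADROOM is e.g. 16MB, configurable */
>>   if (!wrapped && checkpointed_prefix_bytes() >= WRAP_HEADROOM) {
>>       /* enough of the head of the WAL has been checkpointed:
>>          start reusing it instead of growing the file            */
>>       write_offset = WAL_HEADER_SIZE;
>>       wrapped = 1;
>>   }
>>   if (wrapped && write_offset + FRAME_SIZE > logical_start_offset) {
>>       /* the writer has caught up with the oldest frame not yet
>>          checkpointed: rare case, fall back to a blocking
>>          checkpoint                                               */
>>       run_blocking_checkpoint();
>>   }
>>   append_frame(write_offset);
>>   write_offset += FRAME_SIZE;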
>>
>> This would allow limiting the WAL file size while avoiding a blocking
>> checkpoint, and would increase responsiveness significantly!
>>
>> Yoni.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
