----- Forwarded by Ben Carlyle/AU/IRSA/Rail on 31/03/2004 02:38 PM -----

Ben Carlyle
31/03/2004 02:34 PM


        To:     "D. Richard Hipp" <[EMAIL PROTECTED]>
        cc: 
        Subject:        Re: [sqlite] Concurrency Proposal

Peoples,

"D. Richard Hipp" <[EMAIL PROTECTED]>
31/03/2004 01:21 PM

 
        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        Re: [sqlite] Concurrency Proposal


> I think the gist of Ben's proposal is as follows
> (please correct me if I am wrong):
>     Writes do not modify the main database file until
>     they are ready to commit - meaning that readers can
>     continue to read from the file during lengthy
>     transactions.  The journal contains modified pages,
>     not the original pages, and is thus a roll-forward
>     journal rather than a roll-back journal.

That's pretty accurate. If the modified pages could all be kept in 
memory, the pager could contain either the new or the old pages; it 
wouldn't really matter. But if pages need to be flushed from memory, 
they would have to be put into the journal.
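
To make that concrete, here's a rough sketch of the spill rule in C. 
The names (Page, Pager, journal_append, pager_evict) and the journal 
record layout are invented purely for illustration - this is not 
sqlite's actual pager code:

#include <stdio.h>

#define PAGE_SIZE 1024

typedef struct Page {
  unsigned pgno;                   /* page number within the database */
  int isDirty;                     /* modified since it was read? */
  unsigned char aData[PAGE_SIZE];  /* current (possibly new) image */
} Page;

typedef struct Pager {
  FILE *pJournal;                  /* roll-forward journal of new page images */
} Pager;

/* Append the new image of a page to the journal.  Assumed record
** layout: the page number followed by the page data. */
static int journal_append(Pager *pPager, const Page *pPg){
  if( fwrite(&pPg->pgno, sizeof pPg->pgno, 1, pPager->pJournal)!=1 ) return -1;
  if( fwrite(pPg->aData, PAGE_SIZE, 1, pPager->pJournal)!=1 ) return -1;
  return 0;
}

/* Called when cache pressure forces a page out of memory.  The main
** database file is never written before commit, so a dirty page's only
** on-disk home is the journal; readers of the main file are unaffected. */
int pager_evict(Pager *pPager, Page *pPg){
  if( pPg->isDirty && journal_append(pPager, pPg) ){
    return -1;                     /* could not spill; keep the page cached */
  }
  pPg->isDirty = 0;                /* caller may now drop it from the cache */
  return 0;
}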

> The main difficulty in this approach is locating
> a particular page in the journal file when it needs
> to be reloaded.  For example, suppose page X is read
> from the main database, modified, then written into
> the journal.  Due to cache pressure, that page is then
> flushed from the in-memory cache.  Later on, we need
> to read page X again.  How do we locate the position
> in the journal where the current data for page X was
> written so that we can reread it?

I'm not surprised this has come up as a problem. I wasn't sure whether 
sqlite flushed changed pages back to the original database file. Since 
the whole algorithm hinges on the main file not being changed until 
commit, any overflow would have to be written to the journal.

Hmm... an on-disk hash or maybe ordered pages?

If this were to proceed, ordered pages might be the simplest approach, 
but they wouldn't perform well. The lookup would simply be a binary 
search over the journal file's fixed-size blocks to find a page. The 
same search would also be needed to find the insert point for a new 
page, and then everything after it would have to be shifted to 
accommodate. Hashes would be able to place pages explicitly, but you'd 
have empty pages (of undefined content) throughout the journal that 
you'd have to explicitly write "this is not a page" headers into.
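
For illustration only, the ordered-pages lookup might look something 
like this in C, assuming each journal record is a 4-byte page number 
followed by the page data and the records are kept sorted by page 
number (the format and the names are my own invention, not anything 
sqlite does today):

#include <stdio.h>

#define PAGE_SIZE 1024
#define REC_SIZE  (4 + PAGE_SIZE)   /* 4-byte page number + page data */

/* Read the page number stored in record i of the journal. */
static long recordPgno(FILE *pJournal, long i){
  unsigned char a[4];
  if( fseek(pJournal, i*(long)REC_SIZE, SEEK_SET) ) return -1;
  if( fread(a, 4, 1, pJournal)!=1 ) return -1;
  return ((long)a[0]<<24) | ((long)a[1]<<16) | ((long)a[2]<<8) | (long)a[3];
}

/* Binary search nRec sorted records for page pgno.  On a hit the page
** image is copied into aData and 0 is returned; -1 means the page is
** not in the journal and should be read from the main database file. */
int journalFind(FILE *pJournal, long nRec, long pgno, unsigned char *aData){
  long lo = 0, hi = nRec-1;
  while( lo<=hi ){
    long mid = lo + (hi-lo)/2;
    long got = recordPgno(pJournal, mid);
    if( got<0 ) return -1;
    if( got==pgno ){
      /* the file position is now just past the page number */
      return fread(aData, PAGE_SIZE, 1, pJournal)==1 ? 0 : -1;
    }
    if( got<pgno ) lo = mid+1; else hi = mid-1;
  }
  return -1;
}

The search itself is cheap - only a handful of seeks - but keeping the 
records sorted is the problem, since inserting a new page means 
shifting everything after the insert point.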

Hmm.

Obviously the poor performance of ordered pages would only become a 
factor if your in-memory cache were exceeded, so it could be 
optimised... but there comes a point where you have a few gig of 
database and a corresponding couple of gig of index that all need to 
be updated every time you do an insert. Large databases lead to large 
changes, at least every once in a while.

Doug's shadow pager idea looks like it would effectively be an 
optimised form of the ordered pages concept[1]. If the journal had an 
on-disk linked list of ordered page numbers and their in-journal 
indices, searching and data modification would occur mostly in-memory 
and could be done with constant memory usage:

Pager file :-
<Header block>
<Index header: 102=500, 203=0, 605=1, ..., index continued at page 501>
<Page 203><Page 605>...<Page 102>
<Index header: ...>

A BTree with the corresponding navel-gazing might be better, but 
consider the following search algorithm:
Read the first index block.
While the greatest page in this block is less than the target page, 
skip to the next index block.
Do a binary search in memory.
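
Roughly, in C, with the block layout, sizes and names all invented for 
the example, and an index block assumed to occupy exactly one journal 
page:

#include <stdio.h>

#define PAGE_SIZE 1024
#define IDX_ENTRIES_PER_BLOCK 127    /* 127*8 + 8 bytes = one 1024-byte page */
#define IDX_NO_NEXT 0xffffffffu      /* "no further index blocks" marker */

/* One (database page -> journal slot) mapping, e.g. 102=500. */
typedef struct IdxEntry {
  unsigned pgno;                     /* database page number */
  unsigned slot;                     /* where its new image sits in the journal */
} IdxEntry;

/* One index block: a sorted run of entries plus a link to the journal
** slot holding the next index block ("index continued at page 501"). */
typedef struct IdxBlock {
  unsigned nEntry;                   /* entries in use, sorted by pgno */
  unsigned nextIdxSlot;              /* next index block, or IDX_NO_NEXT */
  IdxEntry aEntry[IDX_ENTRIES_PER_BLOCK];
} IdxBlock;

/* Load the index block stored at journal slot iSlot. */
static int readIdxBlock(FILE *pJournal, unsigned iSlot, IdxBlock *pOut){
  if( fseek(pJournal, (long)iSlot*PAGE_SIZE, SEEK_SET) ) return -1;
  return fread(pOut, sizeof(*pOut), 1, pJournal)==1 ? 0 : -1;
}

/* Return the journal slot holding page pgno, or -1 if the page has not
** been journalled (so it should be read from the main database file). */
long findJournalSlot(FILE *pJournal, unsigned firstIdxSlot, unsigned pgno){
  IdxBlock blk;
  unsigned iSlot = firstIdxSlot;
  for(;;){
    if( readIdxBlock(pJournal, iSlot, &blk) || blk.nEntry==0 ) return -1;

    /* While the greatest page in this block is less than the target
    ** page, skip to the next index block. */
    if( blk.aEntry[blk.nEntry-1].pgno < pgno ){
      if( blk.nextIdxSlot==IDX_NO_NEXT ) return -1;
      iSlot = blk.nextIdxSlot;
      continue;
    }

    /* Do the binary search in memory. */
    {
      unsigned lo = 0, hi = blk.nEntry;
      while( lo<hi ){
        unsigned mid = lo + (hi-lo)/2;
        if( blk.aEntry[mid].pgno==pgno ) return (long)blk.aEntry[mid].slot;
        if( blk.aEntry[mid].pgno<pgno ) lo = mid+1; else hi = mid;
      }
    }
    return -1;      /* falls between this block's entries: not journalled */
  }
}

Searching this way only ever holds one index block in memory at a time, 
which is where the constant memory usage comes from.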

Anyway :) It's just some ideas that fell out of my head. They may or 
may not be practical.

Benjamin.
[1] I haven't read the proposal in detail, so I apologise if I'm making 
misrepresentations.



