On Thu, 15 Apr 2004, Doug Currie wrote:

>
>Thursday, April 15, 2004, 9:16:01 AM, Christian Smith wrote:
>
>> On Wed, 14 Apr 2004, Doug Currie wrote:
>
>>>One way to get table level locking without a great deal of pain is to
>>>integrate the shadow paging ideas with BTree management. Rather than
>>>using page tables for the shadow pages, use the BTrees themselves.
>>>This means that any change to a BTree requires changes along the
>>>entire path back to the root so that only free pages are used to store
>>>new data, including the BTree itself. Writing the root page(s) of the
>>>BTree(s) commits the changes to that table (these tables).
>
>> Actually, this gets my vote. Keeps the pager layer the same,
>
>The pager gets *much* simpler because it doesn't need to make a log
>file. The log file is not necessary because writes only go to free
>pages.
>
>Well, there would be one write-ahead log. It's needed to prevent
>partial updates to the page number pointers to the roots page(s) of
>the BTree(s) at commit. This log is created at commit time, and is
>much simpler and much smaller than the present log file.

I'd have thought it'd be better to preserve the pager layer as is. If it
ain't broke...

> [...]
>
>This design works well. It has the advantage (compared with shadow
>pager) that reads are not burdened with page table indirection. It has
>the potential disadvantage (compared with SQLite 2.8) that small
>writes can modify several pages (based on the depth of the BTree).

So for reads, there is basically no extra burden (other than caching the
initial tree roots), and writing will be slightly slower, but with a
decreasing penalty as updates get bigger. That penalty is probably
insignificant next to dumping the page cache when transactions finish,
and writes would of course run in parallel with reads, so overall
performance should improve in many scenarios.

It would of course be limited, like shadow paging, to a single address
space (writes would block reads in other address spaces).

>
>I used this design in a proprietary database in the late 1980s. The
>only reason I didn't consider modifying SQLite this way up until now
>is that I was anticipating BTree changes for 3.0, so I confined my
>efforts to the pager layer.

Given this design, if it is adopted, it would also be trivial (and free
in terms of I/O) to maintain a running total of records in a given btree,
as was requested some weeks back, since any new or deleted record would
update the btree up to the root anyway.

Is this design feasible given the time constraints on 3.0? I've not
studied the btree layer in much detail, so I don't know how much existing
code would need to change.

Christian
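
P.S. To make the idea concrete, here is a minimal sketch in Python of the
copy-the-path-to-the-root scheme with a running record count per subtree.
It is only an illustration under loud assumptions: a one-key binary tree
stands in for a real multi-key BTree page, in-memory objects stand in for
free pages on disk, and Node, size, insert, lookup and Table are
hypothetical names, not anything in SQLite.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """One immutable 'page' (assumption: one key per page, unlike a real BTree)."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    count: int = 1  # records in this subtree, recomputed along the copied path


def size(node):
    """Record count of a subtree; an empty subtree holds zero records."""
    return node.count if node is not None else 0


def insert(node, key, value):
    """Return a new root. Every node on the search path is copied into a
    fresh object (a 'free page'); untouched subtrees are shared as-is."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        left, right = insert(node.left, key, value), node.right
    elif key > node.key:
        left, right = node.left, insert(node.right, key, value)
    else:
        # Same key: overwrite the value, the record count does not change.
        return Node(key, value, node.left, node.right, node.count)
    return Node(node.key, node.value, left, right, 1 + size(left) + size(right))


def lookup(node, key):
    """Plain search from a given root; no page-table indirection involved."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


class Table:
    """Holds the committed root. Replacing it is the commit point and stands
    in for writing the root page number at commit."""

    def __init__(self):
        self.root = None

    def commit(self, new_root):
        self.root = new_root


table = Table()
for k in (5, 2, 8):
    table.commit(insert(table.root, k, f"row{k}"))

snapshot = table.root                    # a reader pins the committed root
pending = insert(table.root, 9, "row9")  # a writer builds new pages on the side

print(size(snapshot), lookup(snapshot, 9))  # reader still sees the old tree
print(size(pending), lookup(pending, 9))    # writer's uncommitted view
table.commit(pending)                       # one root swap publishes the change
```

The reader's pinned root stays valid while the writer works, because the
writer never touches an existing node, and the record total comes for free
because the path to the root is rewritten anyway.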