I'm not sure this will change much. It will reduce the number of times that the PIP needs to be referenced, which will save a little CPU. Careful write needs to be preserved in all cases, so the PIP will always need to be written before the data pages and associated pointer pages. But since all allocated data pages are lower in precedence than the PIP, the PIP won't be forced out until a data page needs to be written, so it is unlikely that the PIP will be written multiple times using either scheme.
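To make the careful-write point concrete, here is a minimal sketch (invented names, not Firebird's actual cache code) of precedence-ordered flushing: a dirty page is written only after the pages it depends on, such as the PIP that records its allocation, which is also why the PIP is only forced out when a dependent data page has to be written.

// Minimal sketch of careful-write precedence; Page, mustWriteFirst and
// flushPage are invented names, not Firebird's cache code.

#include <cstdio>
#include <vector>

struct Page
{
    int pageNumber;
    bool dirty = true;
    std::vector<Page*> mustWriteFirst;   // higher-precedence pages, e.g. the PIP
};

void flushPage(Page* page)
{
    // Write the pages this one depends on first (the PIP before the data
    // page whose allocation it records), then the page itself.
    for (Page* dep : page->mustWriteFirst)
        if (dep->dirty)
            flushPage(dep);

    if (page->dirty)
    {
        std::printf("writing page %d\n", page->pageNumber);
        page->dirty = false;
    }
}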
There may well be some benefit to allocating data pages contiguously when the database page size is smaller than the OS page size, which, presumably, is usually the case.

I did do some experiments quite some time ago with prefetching (I frankly don't remember for which system). The results were sufficiently disappointing that I gave it up. The fundamental problem is that fetching things that you don't need interferes with fetching things that you do. Predicting exactly what you will need next is really quite difficult, and guessing wrong is significantly worse than not guessing at all. Probably.

What might be a better thing to explore is an extent-based allocation mechanism. It probably could be made to reduce both allocations and references to the allocation pages, though it would probably be a relatively large architectural change -- and one difficult to migrate to without a full dump and reload.

An in-depth historical review of the various Unix file systems through the ages might be interesting (and then it might not). The issues are similar.

On 12/26/2013 8:36 AM, Vlad Khorsun wrote:
> Hi, all
>
> I will speak about how pages are allocated and released in Firebird. The current algorithm is well known, simple, and easy to both understand and implement:
> - we have a single bitmap in the database where every bit corresponds to one page,
> - this bitmap is stored on a sequence of Page Inventory Pages (PIPs) distributed evenly through the database,
> - pages are allocated and released one-by-one when necessary.
>
> So far, so good. I think we could make some improvements to this algorithm.
>
> The first thing is batch allocation/release, i.e. the ability to allocate or release a group of pages at once. It could lower contention on PIP pages and the number of PIP writes when pages are allocated (because of careful write we must write the PIP page before the newly allocated page(s)). It could also make releasing big blobs and GTTs faster (with DROP TABLE as a special and more common case).
>
> So, the first part is the ability to allocate and release a group of pages at once; the corresponding PIP page is changed only once.
>
> The second part (based on the first one) is the implementation of a special allocation policy for data pages. Some (or many) database engines already use it. The idea of the algorithm below was inspired by MSSQL, but of course there are a lot of Firebird ODS specifics.
>
> I propose to allocate data pages not one-by-one (as currently) but in groups of sequentially ordered pages. Such a group of pages is often called an "extent". I propose to change the page allocation algorithm for tables as follows:
> - if a table is empty or small (has no full extent allocated), data pages are allocated one-by-one (as currently)
> - if a table already has at least one full extent allocated, the next request for a new page will allocate a whole extent of pages
> - the size of an extent is 8 pages
> - every such extent is aligned on an 8-page boundary
>
> Such an algorithm will reduce page-level fragmentation (all pages in an extent are adjacent), allow OS-level prefetch to work more efficiently (it will read not just a bunch of pages of random objects but pages related to the same table), and allow us in the future to read and write in large chunks, making IO more efficient.
>
> There were requests to implement big pages (64KB, 128KB, etc.) to make reading faster, but such a solution has some drawbacks:
> - a big page is good for readers but bad for writers: the more data we have on a page, the more concurrent writers will wait for each other to change this page
> - compressed index nodes are walked sequentially when a key is searched in an index. Yes, jump nodes in ODS 11 reduce this issue but do not eliminate it completely. Again, big index pages are very bad for concurrent writers
> - in the Classic architecture, different processes often exchange pages with each other, and exchanging a big page is obviously more costly than exchanging a small one
>
> I think that extents help to solve the problem of physical IO without making concurrency worse at the same time. The implementation has been ready for a few months and I consider it stable enough, so it will not delay the release of FB3. I can provide a patch or compiled binaries (for Windows) for testing to anyone interested.
>
> Comments?
>
> Vlad
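For anyone who wants to see the shape of the proposal in code, here is a rough sketch of the two parts Vlad describes above: batch bitmap updates and the extent policy for data pages. All names (PageBitmap, Table, allocateDataPage) and the PAGES_PER_PIP figure are invented for illustration; this is not Vlad's patch, and it ignores ODS details, locking, and careful-write ordering.

// Rough sketch only, not the actual patch: invented names stand in for
// Firebird's real structures. Extent size and alignment follow the
// proposal: 8 pages, aligned on an 8-page boundary, claimed with a single
// change to the bitmap page.

#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t EXTENT_SIZE = 8;        // pages per extent (from the proposal)
constexpr std::size_t PAGES_PER_PIP = 8192;   // pages covered by one PIP (illustrative)

struct PageBitmap                              // stand-in for one Page Inventory Page
{
    std::bitset<PAGES_PER_PIP> used;

    // One-by-one allocation, roughly what happens today.
    long allocateOne()
    {
        for (std::size_t p = 0; p < PAGES_PER_PIP; ++p)
            if (!used.test(p))
            {
                used.set(p);
                return static_cast<long>(p);
            }
        return -1;                             // this PIP is full
    }

    // Batch allocation: claim 'count' adjacent pages starting at an aligned
    // offset, so the PIP is changed (and later written) once for the group.
    long allocateAligned(std::size_t count, std::size_t alignment)
    {
        for (std::size_t start = 0; start + count <= PAGES_PER_PIP; start += alignment)
        {
            bool free = true;
            for (std::size_t i = 0; i < count && free; ++i)
                free = !used.test(start + i);

            if (free)
            {
                for (std::size_t i = 0; i < count; ++i)
                    used.set(start + i);
                return static_cast<long>(start);
            }
        }
        return -1;                             // no aligned run left in this PIP
    }

    // Batch release: clear a whole group of bits at once, so dropping a big
    // blob, a GTT or a table touches the PIP only once.
    void releaseGroup(std::size_t first, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i)
            used.reset(first + i);
    }
};

struct Table                                   // hypothetical per-table state
{
    std::size_t pagesAllocated = 0;
    std::vector<long> reservedPages;           // extent pages claimed but not yet handed out
};

// Allocation policy from the proposal: small tables get pages one by one;
// once a table has a full extent's worth of pages, the next request claims
// a whole aligned extent and later requests are served from it.
long allocateDataPage(PageBitmap& pip, Table& table)
{
    if (!table.reservedPages.empty())
    {
        long page = table.reservedPages.back();
        table.reservedPages.pop_back();
        return page;
    }

    if (table.pagesAllocated < EXTENT_SIZE)    // empty or small table
    {
        long page = pip.allocateOne();
        if (page >= 0)
            ++table.pagesAllocated;
        return page;
    }

    long first = pip.allocateAligned(EXTENT_SIZE, EXTENT_SIZE);
    if (first < 0)                             // no aligned extent free: fall back
    {
        long page = pip.allocateOne();
        if (page >= 0)
            ++table.pagesAllocated;
        return page;
    }

    table.pagesAllocated += EXTENT_SIZE;
    for (std::size_t i = 1; i < EXTENT_SIZE; ++i)
        table.reservedPages.push_back(first + static_cast<long>(i));
    return first;                              // caller gets the first page of the extent
}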