Hi, all

I will speak about how pages are allocated and released in Firebird. The current algorithm is well known, simple, and easy to both understand and implement:

- we have a single bitmap in the database where every bit corresponds to one page,
- this bitmap is stored in a sequence of Page Inventory Pages (PIP) distributed evenly through the database,
- pages are allocated and released one-by-one when necessary.
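For illustration, here is a minimal sketch (not the engine code, the names are mine) of how a page number maps to the PIP that describes it and to the bit inside that PIP's bitmap; pagesPerPip - how many bits fit on one PIP page - depends on the page size and is just a parameter here:

#include <cstdint>
#include <cstdio>

struct PipLocation
{
    uint32_t sequence;   // which PIP in the sequence covers this page
    uint32_t bit;        // bit offset inside that PIP's bitmap
};

PipLocation locatePip(uint32_t pageNumber, uint32_t pagesPerPip)
{
    // Each PIP describes a consecutive window of page numbers, so the owning
    // PIP and the bit inside it follow from simple division.
    return PipLocation{ pageNumber / pagesPerPip, pageNumber % pagesPerPip };
}

int main()
{
    const uint32_t pagesPerPip = 32000;      // example value only
    const PipLocation loc = locatePip(100000, pagesPerPip);
    std::printf("page 100000 -> PIP #%u, bit %u\n", loc.sequence, loc.bit);
    return 0;
}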
So far, so good. I believe we could make some improvements to this algorithm.

The first thing is batch allocation/release, i.e. the ability to allocate or release a group of pages at once. It could lower contention on PIP pages and reduce the number of PIP writes when pages are allocated (because of careful write we must write the PIP page before the newly allocated page(s)). Also it could make the release of big blobs and of GTT's faster (a particular and more frequent case of DROP TABLE). So, the first part is the ability to allocate and release a group of pages at once; the corresponding PIP page is changed only once.

The second part (based on the first one) is the implementation of a special allocation policy for data pages. Some (or many) database engines already use it. The idea of the algorithm below is inspired by MSSQL, but of course there are a lot of Firebird ODS specifics. I propose to allocate data pages not one-by-one (as currently) but in groups of sequentially ordered pages. Such a group of pages is often called an "extent". I propose to change the page allocation algorithm for tables as follows (see also the sketch in the P.S. below):

- if the table is empty or small (has no full extent allocated), data pages are allocated one-by-one (as currently)
- if the table already has at least one full extent allocated, the next request for a new page will allocate a whole extent of pages
- the size of an extent is 8 pages
- every such extent is aligned on an 8-page boundary

Such an algorithm will reduce page-level fragmentation (all pages in an extent are adjacent), allow OS-level prefetch to work more efficiently (it will read not just a bunch of pages of random objects but pages related to the same table), and allow us in the future to read and write in large chunks, making IO more efficient.

There were requests to implement big pages (64KB, 128KB, etc.) to make reading faster, but such a solution has some drawbacks:

- a big page is good for readers but bad for writers: the more data we have on a page, the more concurrent writers will wait for each other to change this page
- compressed index nodes are walked sequentially when some key is searched in an index. Yes, jump nodes in ODS 11 lower this issue but do not eliminate it completely. Again, big index pages are very bad for concurrent writers
- in the Classic architecture, different processes often exchange pages with each other, and the exchange of a big page is obviously more costly than the exchange of a small one

I think that extents help to solve the problem of physical IO without making concurrency worse at the same time.

The implementation has been ready for a few months and I consider it stable enough, so it will not delay the release of FB3. I can provide a patch or compiled binaries (for Windows) for testing to anyone interested.

Comments?

Vlad
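P.S. For anyone who wants to see the idea in code: below is a small self-contained sketch of the proposed policy against a toy in-memory bitmap standing in for a PIP. All names (ToyPip, allocPage, allocExtent, EXTENT_SIZE) are invented for illustration; it is not the actual patch.

#include <cstdint>
#include <cstdio>
#include <vector>

static const uint32_t EXTENT_SIZE = 8;           // pages per extent

class ToyPip
{
public:
    explicit ToyPip(uint32_t pages) : used(pages, false) {}

    // Current policy: allocate a single free page.
    uint32_t allocPage()
    {
        for (uint32_t p = 0; p < used.size(); ++p)
            if (!used[p]) { used[p] = true; return p; }
        return ~0u;                              // no free page in this toy
    }

    // Proposed policy: allocate a whole extent - EXTENT_SIZE adjacent pages
    // starting on an EXTENT_SIZE boundary - changing the bitmap in one go.
    uint32_t allocExtent()
    {
        for (uint32_t first = 0; first + EXTENT_SIZE <= used.size(); first += EXTENT_SIZE)
        {
            bool free = true;
            for (uint32_t i = 0; i < EXTENT_SIZE; ++i)
                if (used[first + i]) { free = false; break; }

            if (free)
            {
                for (uint32_t i = 0; i < EXTENT_SIZE; ++i)
                    used[first + i] = true;
                return first;
            }
        }
        return ~0u;
    }

private:
    std::vector<bool> used;
};

int main()
{
    ToyPip pip(64);

    // A small table starts as today: page by page.
    for (int i = 0; i < 3; ++i)
        std::printf("single page -> %u\n", pip.allocPage());

    // Once the table owns a full extent, every further request takes a whole
    // aligned extent; here it lands at page 8, past the pages used above.
    std::printf("extent starts at page %u\n", pip.allocExtent());
    return 0;
}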