Dimitry,

> > When it is restoring data, it
> > fills a data page and goes on to the next one.  A large cache will
> > fill with pages that will not be referenced again until the indexes
> > are built.  To build indexes, Firebird reads records and sorts by
> > keys.  That might suggest that keeping millions of pages in cache
> > would improve performance by eliminating disk reads.
> 
>    Putting restored data to data pages and sorting streams at the same time
> also could eliminate reads.

While I agree that would be ideal, it would require the engine to have 
special/specific functions to support gbak restore operations.

Currently, gbak uses the standard/public data access and write functions, which 
have no intelligence to process the data writes and build the sorting streams in 
one operation.

An intermediate step/compromise would be to implement a multi-index rebuild, 
like InterBase v6.5(?) introduced.  This would have all data writes occur as 
they do now, but all the indexes for a table would be built via a single read 
pass of the table (versus the separate read pass for each index, as is the 
case today).
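
Purely to illustrate the idea (this is not Firebird code -- read_table, key_of, 
dbkey and build_from_sorted are made-up names), here is a rough Python sketch of 
the difference between today's one-read-pass-per-index approach and a 
single-pass multi-index rebuild:

    def rebuild_indexes_separately(read_table, indexes):
        # Today: each index rebuild re-reads the whole table.
        for index in indexes:
            keys = [(index.key_of(rec), rec.dbkey) for rec in read_table()]
            keys.sort()                       # one sort stream per pass
            index.build_from_sorted(keys)

    def rebuild_indexes_in_one_pass(read_table, indexes):
        # Proposed: one table scan feeds the sort streams of every index.
        streams = {index: [] for index in indexes}
        for rec in read_table():              # single read pass over data pages
            for index in indexes:
                streams[index].append((index.key_of(rec), rec.dbkey))
        for index, keys in streams.items():
            keys.sort()                       # real engine would use external sorts
            index.build_from_sorted(keys)

The win is that a table's data pages are read once no matter how many indexes 
it has; the sorting work itself is unchanged.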

The first step would be to ensure that gbak rebuilds indexes in an ordered 
approach -- i.e. rebuild all indexes for Table A before proceeding to the 
indexes for Table B.  I have seen cases where indexes are rebuilt in an 
"intermixed" fashion.  The ordered approach would provide at least some hope 
that the OS cache still holds the data pages read for the first index rebuild 
when the subsequent indexes are rebuilt.
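
Again just as an illustration (the names are invented, not gbak internals), the 
ordering amounts to nothing more than grouping the rebuild list by table:

    from itertools import groupby

    def ordered_rebuild(index_defs, rebuild_one):
        # index_defs: iterable of (table_name, index_name) pairs.
        # Sorting then grouping keeps each table's indexes together, so
        # consecutive rebuilds can reuse whatever data pages the OS cached.
        for table, defs in groupby(sorted(index_defs), key=lambda d: d[0]):
            for _, index_name in defs:
                rebuild_one(table, index_name)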


Sean
