Hello Sean,

>> I'm currently running a restore of a Firebird 2.5.2 TPC-H scale 1 database
>> (both, backup and restore are on a consumer SDD) and in that case the index
>> creation process is mainly CPU bound, basically fully utilizing 1 core of my
>> hexa-core desktop.
>
> Are you testing using IB XE3?

No. FB 2.5.2 snapshot.

> Being CPU bound is what I would expect.
>
>
>> Is this something where Firebird could benefit as well? If yes, would this be
>> possible across all architectures or does this need a shared page cache etc.
>
> Yes, this is something that FB could benefit from.
>
> Yes, it should be possible across all architectures.
>
> No, it should not require a shared page cache.  I don't think the cache 
> factors into it.
>
>
>> Another improvement for the index creation process during a restore might
>> be to allow a temporary bigger page cache, so perhaps a -cache, -buffers or
>> whatever option might be helpful. Pretty much what e.g isql already allows at
>> connection level etc.
>
> I don't see how a larger cache or more buffers would help the index creation 
> process.

I don't know the internals, but one can get some insight on the restore 
process when running a trace against the database while being restored. ;-)

With a higher page cache, one can see in Task Manager a steady increase 
of RAM usage while restoring records, which *might* get re-used during 
index creation, because it doesn't need to pull data pages from the disk 
again when creating the index. Especially for largish indexes, trace 
shows a much lower value on "number of reads" and for certain indexes 
(well, you don't see regular create index statements, but I/O statistics 
for COMMIT_TRANSACTION trace events) ~ 100% improvement in execution time.

The test included restoring the same scale 1 TPC-H database with setting 
75 and 100000 page buffers before running the backup, thus the database 
backup had a different page buffers value.

In sum, according to trace: While the 75 page buffers restore took 
601030ms, the 100000 page buffers restore took 375253ms.

gbak usage for restoring the database has made use of the Services 
Manager (-service switch)

Just an observation.

I can provide a raw trace output for both scenarios if someone is 
interested.


Regards,
Thomas

> The ideal process would have a single pass of a table collect data from 
> columns in all indexes, into a temporary file (or multiple), then pass that 
> data to appropriate sorting/index create routines.  In which case, the number 
> of pages not going to make a significant difference because each page would 
> be read only once.  The cache is beneficial when multiple page accesses are 
> involved.



------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to