Restore speed is important for big databases. It's a pain to wait 7 hours for a
restore to complete (~3 hours for data import and ~4 hours for index creation).
Usually those big databases are already on the fastest disk arrays companies can
afford (and in addition the servers have lots of RAM for caching). From what I've
seen in production, CPU usage during index creation is ~99% (there is very
little disk access). So I believe that making index creation multi-threaded and
allowing it to run on many cores simultaneously would help significantly. In
the real-life example above, on a 16-core server we could potentially decrease
index creation time from 4 hours to 15 to 30 minutes, so the whole restore
could take about half as long (~3.5 instead of 7 hours).
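
To make the arithmetic behind that estimate explicit, here is a rough sketch in
Python. It assumes that index creation is purely CPU-bound and scales with the
number of cores at some efficiency factor, while the data import stays serial.
The function name and the efficiency parameter are made up for illustration;
nothing here is part of gbak or the engine.

```python
def restore_hours(data_hours, index_hours, cores, efficiency=1.0):
    """Estimate total restore time when only index creation runs in parallel.

    efficiency is a hypothetical scaling factor: 1.0 means perfect scaling
    across all cores, 0.5 means half of the ideal speedup.
    """
    speedup = max(1.0, cores * efficiency)
    # Data import is assumed to stay single-threaded.
    return data_hours + index_hours / speedup


# Numbers from the example above: 3 h data import, 4 h index creation.
serial = restore_hours(3, 4, cores=1)                      # today: 7 hours
ideal = restore_hours(3, 4, cores=16)                      # index build ~15 min
half = restore_hours(3, 4, cores=16, efficiency=0.5)       # index build ~30 min

print(serial, ideal, half)
```

With these assumptions the total lands between 3.25 and 3.5 hours, and the
serial data import becomes the dominant cost.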


-----Original Message-----
From: Ann Harrison [mailto:aharri...@nuodb.com] 
Sent: Friday, February 24, 2012 6:35 PM
To: For discussion among Firebird Developers
Subject: Re: [Firebird-devel] gbak improvement

Nick,

>
> When gbak finishes the actual data part of the restore it then creates all
> the indexes, could it do that in parallel based on the number of processors
> available.
> It seems daft that I have to wait for each index to be built, one at a time,
> when the server has several processors doing nothing

As others have said, the cost of building an index is primarily the
cost of reading the data, so building indexes in parallel on separate
threads is unlikely to help.  Building indexes incrementally as the
data is stored (as someone else suggested) will be slow and build less
dense indexes than the current method which requires sorting the data.
 What might help, but would require engine support is the ability to
build two indexes on the same data with a single read of a table.

Cheers,

Ann

------------------------------------------------------------------------------
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing 
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
