Restore speed is important for big databases. It's a pain to wait 7 hours for a restore to complete (~3 hours for data import and ~4 hours for index creation). Usually those big databases are already on the fastest disk arrays the company can afford, and the servers also have lots of RAM for caching. From what I've seen in production, CPU usage during index creation is ~99% and there is very little disk access. So I believe that making index creation multi-threaded, so that it can run on many cores simultaneously, would help significantly. In the real-life example above, on a 16-core server we could potentially cut index creation time from 4 hours to 15-30 minutes, so the whole restore could become about twice as fast (roughly 3.5 hours instead of 7).
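A rough sketch of the arithmetic behind that estimate, in Python. The function name and the `efficiency` factor are illustrative assumptions, not measurements: 1.0 stands for perfect linear scaling across cores, 0.5 for half of that. Only the 3-hour and 4-hour figures and the 16 cores come from the example above.

```python
def estimated_restore_hours(data_hours, index_hours, cores, efficiency):
    """Estimate total restore time if only index creation is parallelized.

    efficiency is a hypothetical scaling factor in (0, 1];
    1.0 means perfect linear speedup across all cores.
    """
    if cores < 1 or not 0 < efficiency <= 1:
        raise ValueError("cores must be >= 1 and efficiency in (0, 1]")
    # Never model parallel index creation as slower than the serial build.
    speedup = max(1.0, cores * efficiency)
    return data_hours + index_hours / speedup


# Figures from the production example: 3 h data import, 4 h index creation.
best = estimated_restore_hours(3.0, 4.0, cores=16, efficiency=1.0)
cautious = estimated_restore_hours(3.0, 4.0, cores=16, efficiency=0.5)
print(f"{best:.2f} .. {cautious:.2f} hours instead of 7")
```

With perfect scaling the index phase takes 15 minutes, and at half efficiency 30 minutes, which is where the 15-30 minute range comes from. The data import phase stays at 3 hours either way, so it sets the floor for the whole restore.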
-----Original Message-----
From: Ann Harrison [mailto:aharri...@nuodb.com]
Sent: Friday, February 24, 2012 6:35 PM
To: For discussion among Firebird Developers
Subject: Re: [Firebird-devel] gbak improvement

Nick,

>
> When gbak finishes the actual data part of the restore it then creates all
> the indexes, could it do that in parallel based on the number of processors
> available.
> It seems daft that I have to wait for each index to be built, one at a time,
> when the server has several processors doing nothing

As others have said, the cost of building an index is primarily the cost of reading the data, so building indexes in parallel on separate threads is unlikely to help. Building indexes incrementally as the data is stored (as someone else suggested) will be slow and build less dense indexes than the current method which requires sorting the data. What might help, but would require engine support is the ability to build two indexes on the same data with a single read of a table.

Cheers,

Ann

------------------------------------------------------------------------------
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel