Background. We have about a billion records that we need to merge from 100 files. We have devised a customised algorithm that reduces the runtime from weeks (using a RDBMS) to about 12 hours. The main bottleneck currently in this is a need to sort an array (of arrays) composed of 100 items.
[ [ 'a_str', val ], [ 'b_str', val ], ... [ '100th_str', val ], ] The sort occurs on the 'X_str' string ie $ary->[N]->[0] Currently we are using perls native sort which of course works fine, however in 99% of the cases only the $ary->[0]->[0] element will be out of place. The remainder of the 'list' will have remained sorted. Native perl sort can not take advantage of this, but we would like to, as on average this will mean only 50 cmps (1-99) are required to correctly position the element. Sort requires at least 99 every time so it should be much faster to do this. Here is an implementation in perl to show what we want to achieve in case that was not clear enough ##moveFirst n=200000 Took 56 seconds 3571.43/sec #sort n=200000 Took 85 seconds 2352.94/sec sub moveFirstElementIntoPosition { my $ref = shift; # if we are already in place return return if $ref->[0]->[0] le $ref->[1]->[0]; # bubble it into position for (1..$#{$ref}) { if ( $ref->[$_-1]->[0] le $ref->[$_]->[0] ) { return } else { ( $ref->[$_-1], $ref->[$_] ) = ( $ref->[$_], $ref->[$_-1] ); } } } my $s = time(); my $n = 200_000; my @test = map { [ $_, $a++ ] } ( 'ca', 'aa' .. 'bz', 'cb' .. 'dz' ); for ( 1 .. $n ) { my @ary = @test; moveFirstElementIntoPosition( [EMAIL PROTECTED] ); } $s = time() - $s; printf "moveFirst n=$n Took %d seconds %.2f/sec\n", $s, $n/$s; $s = time(); for ( 1 .. $n ) { my @ary = @test; @ary = sort { $a->[0] cmp $b->[0] } @ary; } $s = time() - $s; printf "sort n=$n Took %d seconds %.2f/sec\n", $s, $n/$s; As you can see even implemented in Perl the move algorithm beats sort by some 30%. I would expect to be able to beat sort by 50%+ if we implemented something similar in XS. With a billion sorts even a small increase in throughput speed is significant, thus the desire. If the data existed in a C struct I would have no problem in implementing this, but of course it does not and although I have used some XS to implement a number of algorithms where speed was of the essence I have not tackled anything similar in the past. In other words my XS is not currently up to it. I have had a look at sort.c in the source and the cmp function(s). I would welcome any suggestions as to the most efficient way to implement this moveFirstElementIntoPosition() function. I can see ways to do using my limited XS by returning a new array ref but ideally would just like to modify the array passed to the function in place to avoid the overheads of data copying and just have a void return. Can anyone point me in the right direction, perhaps to some XS code that modifies an array in place or anything else that might be useful. Kind Regards Dr James Freeman