G'Day,

It appears that 'sort -m' slows down far worse than linearly as the number of input files increases. This also appears to be why 'sort' is slow when dealing with files that exceed the internal buffer size, that is, it is slow during the "merging the 1Mb files back together" part of the run.
    files   time
      1     0min 08sec
      2     0min 59sec
      3     2min 36sec
      4     4min 58sec
      5     8min 37sec
      :
     12     1hr 14min

[All files are the same size, each containing 1 million records.]

Not having read the 'sort' code (and as such speculating), I think I might know what the problem is. Each time around the driving loop of the merge, it evaluates each input file's current record for its fitness to be the next output record. The fittest record is output, and that file is advanced. The problem is that the next time around, all current records are evaluated again, even though only one of them is new. The solution would be to remember the fitness of each record in sorted order, and to insert the new record into position, ensuring a minimum number of comparison tests are done (insertion sort?). Hopefully, I will be able to provide an example of this in the next few days; a rough sketch appears at the end of this message.

[If responding to this email send copies to [EMAIL PROTECTED]]

-- 
Paul Matthews
ETI Migrations x446169

_______________________________________________
Bug-textutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-textutils
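
Sketch of the idea (untested, illustrative only -- this is not the actual 'sort' source, and the names used here, such as struct source, refill and insert_sorted, are made up for the example). It merges any number of already-sorted text files given on the command line, keeping the current line of each input in an array that is itself kept sorted; after a line is written, only the replacement line from that one file is re-inserted, via a binary search plus a memmove, instead of comparing every file's current line again on every pass.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct source {
    FILE *fp;        /* open input file                     */
    char *line;      /* its current (smallest unread) line  */
    size_t cap;      /* buffer capacity for getline()       */
};

/* Read the next line of s; return 0 at end of file. */
static int refill(struct source *s)
{
    return getline(&s->line, &s->cap, s->fp) != -1;
}

/* Insert s into order[0..n) so that order[] stays sorted by line.
   A binary search finds the slot, so only about log2(n) string
   comparisons are needed per inserted record. */
static void insert_sorted(struct source **order, size_t n, struct source *s)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = (lo + hi) / 2;
        if (strcmp(s->line, order[mid]->line) < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    memmove(&order[lo + 1], &order[lo], (n - lo) * sizeof *order);
    order[lo] = s;
}

int main(int argc, char **argv)
{
    size_t n = 0;
    struct source *srcs = calloc(argc - 1, sizeof *srcs);
    struct source **order = calloc(argc - 1, sizeof *order);

    /* Open each file, read its first line, build the sorted array. */
    for (int i = 1; i < argc; i++) {
        struct source *s = &srcs[i - 1];
        s->fp = fopen(argv[i], "r");
        if (!s->fp) { perror(argv[i]); return 1; }
        if (refill(s))
            insert_sorted(order, n++, s);
        else
            fclose(s->fp);
    }

    /* Main loop: output the smallest current line, then re-insert
       only the one file that was advanced. */
    while (n > 0) {
        struct source *s = order[0];
        fputs(s->line, stdout);
        memmove(&order[0], &order[1], (n - 1) * sizeof *order);
        n--;
        if (refill(s))
            insert_sorted(order, n++, s);
        else
            fclose(s->fp);
    }
    return 0;
}

A priority queue (heap) would avoid even the pointer shuffling done by the memmove calls, but the insert-into-sorted-position version above is the simplest illustration of doing O(log k) comparisons per output record instead of O(k).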