Kyle Sallee wrote:
Thanks for the fast response. Right or wrong, POSIX is POSIX, yet a LF as part of a line does seem worth counting. A line must terminate with a line feed, yet a string does not require one.
---- how is that important? Sort sorts lines, not strings.
But for actual sorting tasks, would consecutive LFs be common?
---- Anytime you have multiple blank lines in a row, you have consecutive line feeds.
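To make the blank-line point concrete, here is a minimal sketch in Python (not sort's own C, and not how GNU sort is implemented). The helper name `sort_lines` is made up for illustration. It treats every LF as the terminator of one line, so each blank line is a real, sortable, empty line:

```python
def sort_lines(text):
    """Sort LF-terminated lines; every LF ends one line, empty or not."""
    lines = text.split("\n")
    # A trailing LF terminates the last line; it does not start a new one.
    if lines and lines[-1] == "":
        lines.pop()
    # Empty lines compare lowest, so consecutive LFs end up at the top.
    return "".join(line + "\n" for line in sorted(lines))
```

With the input "b", two blank lines, then "a", the two empty lines sort ahead of "a" and "b", and the output still contains four LFs.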
If the sort function's compare function were inlined rather than called through a pointer, a modest 5% performance gain might be possible. Implementing that would require some creativity.
---- I'm sure if you submitted a working patch + documentation + rights assigned to GNU, and first-born child given to the FSF, the coreutils maintainers would consider it. (ok, maybe the first-born isn't required these days, I think some POSIX update changed that)
If the input data were not copied and string conversion were omitted, another 5% performance gain might be possible.
---- patches patches patches...
I do not know which sort method is used. However, a merge sort has some surprisingly frequent, uhm, code paths, like a 3-way comparison, which can be implemented with 2 or 3 comparisons and 0 to 4 memory moves.
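One reading of the "2 or 3 comparisons and 0 to 4 memory moves" figure is the ordering of three items with a small decision tree. The sketch below is in Python and assumes that reading. The function name `sort3` is hypothetical, and the move counts are what a C version using one temporary would need (a swap is 3 moves, a three-element rotation is 4). It says nothing about what sort actually does.

```python
def sort3(a, b, c):
    """Order three items; return (ordered tuple, comparisons, moves).

    Moves are counted as a C version with one temporary would make them:
    0 if already in order, 3 for a swap, 4 for a three-element rotation.
    """
    if a <= b:
        if b <= c:
            return (a, b, c), 2, 0   # already in order
        if a <= c:
            return (a, c, b), 3, 3   # swap b and c
        return (c, a, b), 3, 4       # rotate: c moves to the front
    if a <= c:
        return (b, a, c), 2, 3       # swap a and b
    if b <= c:
        return (b, c, a), 3, 4       # rotate: a moves to the back
    return (c, b, a), 3, 3           # swap a and c
```

Two of the six input orderings resolve in 2 comparisons and the other four need 3, and the move count is 0, 3, or 4 depending on the path taken.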
---- um... it's open source...
Note, something that might affect your algorithm design: it has to handle sort input that is larger than memory and in different character encodings.

*cheers*
-linda
