On 03/03/2013 05:32 PM, James Dowdell wrote:
> I'm considering writing a patch for sort.c to add a new feature, related to a
> stackoverflow inquiry I wrote
> (http://stackoverflow.com/questions/14882897/what-standard-commands-can-i-use-to-print-just-the-first-few-lines-of-sorted-out).
>
> This would be my first patch, and this is my first time messaging a gnu list;
> apologies if I'm "doing it wrong."
>
> I use GNU sort a lot, and routinely find myself in the situation of
> executing, e.g.:
>
> $ sort ... | head -n 1000
>
> This can be very unnecessarily slow when the input is huge, because sort does
> a lot of work that head throws away.
>
> I propose a new parameter, "-H, --head=NLINES", which has sort only print at
> most NLINES of output. More than just a filter at the end like | head, it
> would avoid unnecessary sorting on more than NLINES of output.
>
> I want to know the procedure for submitting a patch, and the likelihood that
> such a patch would even be considered, before I spend time to parse the whole
> sort.c file and propose a complete and rigorous solution (which would be
> analogous to submitting the patch). From a quick glance at the source, my
> current strategy would be to alter the merge nodes when this parameter is set
> so that the number of lines listed per node is clamped to NLINES. While less
> efficient than an ideal solution, it would be more efficient than what's
> currently in place, and has the benefits of minimal code edits and negligible
> negative performance impact on mainstream use when the parameter is not
> passed.
>
> All feedback welcome, thank you.

There is general agreement that this is worthwhile.
Please read these first:
http://lists.gnu.org/archive/html/bug-coreutils/2004-04/msg00157.html
http://lists.gnu.org/archive/html/bug-coreutils/2009-07/msg00019.html

As for contributing the patch, it would be much appreciated.
For contribution details, please see the HACKING file:
http://git.sv.gnu.org/cgit/coreutils.git/plain/HACKING
In summary you would submit a patch against the latest git tree,
to [email protected].
Also for a patch of this significance, you would need to
follow the copyright assignment procedure.

thanks!
Pádraig.
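
For illustration only, the saving behind the proposed option can be sketched outside of sort.c. The idea is to retain at most NLINES candidate lines while reading the input, instead of fully sorting everything and discarding the tail. The sketch below uses Python's heapq.nsmallest for that. It is not the merge-node clamping described in the quoted message and says nothing about how sort.c is organized. The function name head_of_sorted is hypothetical, and the comparison is plain string ordering, not the locale-aware collation that GNU sort applies.

```python
import heapq
from typing import Iterable, List


def head_of_sorted(lines: Iterable[str], nlines: int) -> List[str]:
    """Return the first `nlines` lines that a full sort would produce.

    Hypothetical helper for illustration; it compares lines as plain
    Python strings and ignores locale collation and sort keys.
    """
    if nlines <= 0:
        return []
    # nsmallest gives the same result as sorted(lines)[:nlines] but is
    # designed for the case where nlines is small relative to the input,
    # so it avoids ordering the lines that would be thrown away.
    return heapq.nsmallest(nlines, lines)


if __name__ == "__main__":
    sample = ["pear", "apple", "fig", "banana", "cherry"]
    for line in head_of_sorted(sample, 2):
        print(line)
```

The result is meant to match what "sort ... | head -n NLINES" would print for the same plain string ordering; only the amount of work differs.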
