The last paragraph of the sort help text explains -k, except that we implement the silly gnu "-k2.3,4.5n" extension (some build script somewhere probably used it) but don't document it. (This help text is _so_ old it double spaces after periods, which went away due to html never doing that and everybody retconning their own memories to insist that this is how it's always been even though circa 1992 teachers marked you off for _not_ double spacing after a period.)
Anyway, the problem with documenting it is that the gnu behavior (which we implemented) is stupid. If you don't -t then the leading separator (arbitrary runs of whitespace) is included in the character count, but if you specify -t the first character (characters count from 1 just like fields do) is the next character _after_ the separator.

And don't get me started on:

  $ echo -e 'a b\na\tb' | sort -k2,2
  a b
  a b
  $ echo -e 'a\tb\na b' | sort -k2,2
  a b
  a b
  $ echo -e 'a\tc\na b' | sort -k2,2
  a b
  a c
  $ echo -e 'a\tb\na c' | sort -k2,2
  a b
  a c

Which toybox sort isn't matching:

  $ echo -e 'a b\na\tb' | ./sort -k2,2
  a b
  a b
  $ echo -e 'a\tb\na b' | ./sort -k2,2
  a b
  a b
  $ echo -e 'a\tc\na b' | ./sort -k2,2
  a c
  a b
  $ echo -e 'a\tb\na c' | ./sort -k2,2
  a b
  a c

(Lemme guess: they _do_ strip the leading whitespace from key sorts even when they _say_ they don't, and then they do a fallback whole-string sort as tie breaker. So I need to change when I'm advancing past the leading space...)

This is why my todo list doesn't get shorter. I noticed this because I was checking existing xstrdup() callers...

I am _HIGHLY_TEMPTED_ to make toybox -k2.3 start at the third character of the key _always_ skipping the separator, since THAT'S WHAT THE OTHER ONE IS ACTUALLY COMPARING under non-micromanaged circumstances. But that's not how those clowns implemented .x.

I can add the above tests to tests/sort.test but I kinda dowanna?

@@ -121,7 +122,8 @@ static char *get_key_data(char *str, struct sort_key *key, i
   if (TT.t && str[start]==*TT.t) start++;
 
   // Strip leading and trailing whitespace if necessary
-  if (flags&FLAG_b) while (isspace(str[start])) start++;
+  if ((flags&FLAG_b) || (!TT.t && !key->range[3]))
+    while (isspace(str[start])) start++;
   if (flags&FLAG_bb) while (end>start && isspace(str[end-1])) end--;
 
   // Handle offsets on start and end

That's just embarrassing. It's the _compatible_ behavior, but is it the _right_ behavior? Sigh. Nobody else has noticed this for years and years...
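For reference, here is a minimal C sketch of the character counting rule as described above: without -t the run of whitespace before a field counts toward the .x offset, with -t counting starts just past the separator. The key_start() name and its signature are made up for illustration. This is neither the toybox get_key_data() code nor the coreutils implementation, and it ignores -b, key end positions, and the tie-breaker sort.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper (not toybox or coreutils code): return a pointer to
 * where the key "-kFIELD.OFFSET" would start in line. sep is the -t
 * character, or 0 for no -t (fields split on runs of whitespace).
 * Fields and offsets both count from 1. */
const char *key_start(const char *line, int field, int offset, char sep)
{
  const char *s = line;
  int i;

  assert(field >= 1 && offset >= 1);

  // Walk past the first field-1 fields
  for (i = 1; i < field && *s; i++) {
    if (sep) {
      while (*s && *s != sep) s++;
      // Step over the separator: it is not part of the next field
      if (*s) s++;
    } else {
      while (isspace((unsigned char)*s)) s++;
      while (*s && !isspace((unsigned char)*s)) s++;
      // s now points at the whitespace before the next field,
      // and that whitespace is included in the character count
    }
  }

  // Apply the .x offset, stopping at end of line
  for (i = 1; i < offset && *s; i++) s++;

  return s;
}
```

Under those assumptions, -k2.3 on "a  bcd" (two spaces) with no -t lands on 'b' because both spaces are counted, while -k2.3 with -t ' ' on "a bcd" lands on 'd'.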
Rob
