On 12/06/2013 12:15 AM, Bernhard Voelker wrote:
> On 12/06/2013 12:37 AM, Pádraig Brady wrote:
>> diff --git a/src/shuf.c b/src/shuf.c
>> index f7fc936..4d0ae90 100644
>> --- a/src/shuf.c
>> +++ b/src/shuf.c
>> @@ -76,8 +76,8 @@ Write a random permutation of the input lines to standard output.\n\
>>    -n, --head-count=COUNT    output at most COUNT lines\n\
>>    -o, --output=FILE         write result to FILE instead of standard output\n\
>>        --random-source=FILE  get random bytes from FILE\n\
>> -  -r, --repetitions         output COUNT items, allowing repetition.\n\
>> -                              -n 1 is implied if not specified.\n\
>> +  -r, --repetitions         allow repetition within a specified --head-count\n\
>> +                              which is assumed to be 1, if not specified.\n\
> 
> n=1 is not really a repetition. ;-)
> 
> BTW: What was the reason to default n=1 with -r anyway?
> I mean as the user did not specify any limit he may assume
> the same number as in the input, like without -r.

Well, --repetitions really means --allow-repetitions, and in this mode
we're just picking random items from the input, so it doesn't make
much sense to output the same number of items as were input.
It makes more sense to default to picking a single random item.
Granted, that is a bit of an awkward default given the --repetitions name,
though you could read `gen_data | shuf -r` as "pick a
random item from the data", which is less awkward.
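
To make the semantics concrete, here is a rough sketch of picking with
repetition and of the implied count of 1. It is not the code in src/shuf.c;
pick_with_repetition() and effective_count() are hypothetical names, and
rand() merely stands in for the real random source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from src/shuf.c: choose COUNT indices
   in [0, N_ITEMS) independently of each other, so the same index may be
   chosen more than once.  rand() is a stand-in for shuf's random source
   and is slightly biased by the modulo.  */
static void
pick_with_repetition (size_t n_items, size_t count, size_t *out)
{
  assert (n_items > 0);
  for (size_t i = 0; i < count; i++)
    out[i] = (size_t) rand () % n_items;
}

/* Hypothetical helper: the head count used under -r.
   If -n was not given, a single item is picked.  */
static size_t
effective_count (int n_given, size_t n)
{
  return n_given ? n : 1;
}
```

With that reading, `shuf -r` alone would call pick_with_repetition() with a
count of 1, and `shuf -r -n 5` with a count of 5, regardless of how many
lines were input.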

I suppose we could make -r require that -n is specified,
but I'm not sure.

Other edge cases I've now noticed...

  * -n1 could always use the faster --repetitions code path,
    since a single item cannot repeat.
  * -n0 -r should exit without reading any input.

These edge case fixes are not worth adding to this release though.
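
For the record, the dispatch I have in mind for those two cases would look
something like the following. This is only a sketch; the enum and
choose_plan() are hypothetical names and do not exist in src/shuf.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical plan names, for illustration only.  */
enum shuf_plan
{
  PLAN_EXIT_EARLY,   /* nothing to output: do not read the input */
  PLAN_PICK_RANDOM,  /* the faster "pick random items" path */
  PLAN_PERMUTE       /* the normal full permutation */
};

/* Hypothetical dispatch covering the two edge cases noted above.
   REPEAT is true when -r was given, N_GIVEN when -n was given.  */
static enum shuf_plan
choose_plan (bool repeat, bool n_given, size_t head_count)
{
  /* -n0 -r: no items are wanted, so exit without reading input.  */
  if (repeat && n_given && head_count == 0)
    return PLAN_EXIT_EARLY;

  /* -r in general, and -n1 with or without -r: a single pick cannot
     repeat, so the pick path gives the same result as permuting.  */
  if (repeat || (n_given && head_count == 1))
    return PLAN_PICK_RANDOM;

  return PLAN_PERMUTE;
}
```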

thanks,
Pádraig.


