On 03/25/2013 04:30 PM, Assaf Gordon wrote:
> Hello Pádraig,
>
> Pádraig Brady wrote, On 03/24/2013 11:45 PM:
>>>>>> On 03/06/2013 11:50 PM, Assaf Gordon wrote:
>>>>>>> Attached is a suggestion to implement reservoir-sampling in shuf:
>>>>>>> When the expected output of lines is known, it will not load the entire
>>>>>>> file into memory - allowing shuffling very large inputs.
>>
>> I've attached 9 patches to adjust things a bit.
>>
>
> Looks great, thank you very much.
>
> One minor improvement: the comment in the test file is wrong (in early stages
> of the patch I thought I could use a fixed random-source and pre-calculate
> the expected output).
> Attached is a fix.

OK pushed that.
I added a note on how to improve the efficiency of reading
small inputs from a pipe, as that's a fairly invasive change,
and more appropriate for a follow up patch.

thanks!
Pádraig.
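
As a rough illustration of the reservoir-sampling idea quoted in this thread (not the code from the patch), here is a minimal Python sketch of the classic "Algorithm R". The function name `reservoir_sample` and its parameters are hypothetical, chosen only for this example; shuf itself is written in C and its actual implementation may differ. The point is that only `k` lines are held in memory, however long the input is.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(
    lines: Iterable[str], k: int, rng: Optional[random.Random] = None
) -> List[str]:
    """Pick up to k lines uniformly at random in one pass (Algorithm R).

    Hypothetical helper for illustration only; not taken from shuf.
    """
    rng = rng or random.Random()
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Line i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive at both ends
            if j < k:
                reservoir[j] = line
    # Slots that were never replaced keep their input order,
    # so shuffle the k survivors before returning them.
    rng.shuffle(reservoir)
    return reservoir
```

With this sketch, something like `reservoir_sample(open("big.txt"), 10)` would read the file line by line and keep at most 10 lines in memory at any time, which is the property the quoted proposal is after when the number of output lines is known in advance.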
