On 05/03/10 00:39, Joey Degges wrote:
> 2010/3/4 Pádraig Brady <p...@draigbrady.com>
>
>> Have you considered the seek characteristics of SSDs and how they might affect things (with consideration that mechanical disks will be replaced by SSDs quite quickly). There still would be some benefit splitting per SSD, but it would be worth measuring to see.
>
> I will post some results testing with various flash keys but I do not have any proper SSD drives to play with. The extreme case here would be sorting from multiple ramdisks in which case there is likely to be no improvements whatsoever --

Right. In general it's worth posting the results for counter cases like this to help with decisions.

> supposing the underlying "do_sort" function can process a single file in parallel. In this worst case it might be useful to expose a "--no-multidisk" flag allowing the user to disable this feature (or a "--multidisk" flag to enable it).

I'm not fond of options for this because if the user needed to make that decision, then they could nearly as easily and more generally do:

  sort -m <(find /flash/ | xargs -P2 -n1 sort) \
          <(find /mech/ | xargs -n1 sort)
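
For anyone wanting to try this, here is a minimal sketch of the same idea, not a tested recipe. It assumes find -print0 and xargs -0 -P are available (GNU and BSD both have them), and the helper names sort_dir and merge_devices are made up for illustration. It swaps process substitution for temporary files and background jobs so it runs in plain sh, and it merges each device's per-file results before the final merge, since sort -m expects every input to be fully sorted.

```shell
#!/bin/sh
# Sketch only: sort_dir and merge_devices are hypothetical helper names,
# and each directory is assumed to hold at least one regular file.

# Sort every regular file under directory $1 with up to $2 concurrent sort
# processes, then merge the sorted pieces to stdout.
sort_dir() {
    work=$(mktemp -d) || return 1
    # Inside the inner sh, $0 is the work directory and $1 is one input file.
    find "$1" -type f -print0 |
        xargs -0 -n1 -P"$2" sh -c 'sort -- "$1" >"$(mktemp "$0/piece.XXXXXX")"' "$work"
    sort -m -- "$work"/piece.*
    rm -rf "$work"
}

# Per-device parallelism mirrors the one-liner: 2 jobs for flash, 1 for mech.
# $1 is the flash directory, $2 is the mechanical-disk directory.
merge_devices() {
    out=$(mktemp -d) || return 1
    sort_dir "$1" 2 >"$out/flash" & flash_pid=$!
    sort_dir "$2" 1 >"$out/mech" & mech_pid=$!
    wait "$flash_pid" "$mech_pid"
    sort -m -- "$out/flash" "$out/mech"
    rm -rf "$out"
}
```

Usage would be something like "merge_devices /flash /mech > sorted.txt". The job counts are hard-coded only to match the command above; the point is that the user picks them per device, which is the decision an option would otherwise have to encode.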

>>> + unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;
>>
>> You probably want NPROC_CURRENT_OVERRIDABLE?
>
> Would we want to use the OpenMP environmental variable to affect the number of pthreads that are used? A more generic PARALLEL variable might be better suited.

That would be a better name, but it is non-standard. OMP_NUM_THREADS can be used to configure all OpenMP programs, and the coreutils nproc command also honors it. So for consistency at least, NPROC_CURRENT_OVERRIDABLE might be best.

cheers,
Pádraig.
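
To illustrate the override behaviour from the shell, here is a small sketch using the coreutils nproc command. The helper name half_procs is hypothetical; it only mirrors the "/ 2" in the patch line, and the clamp to a minimum of 1 is my own assumption, not something taken from the patch.

```shell
#!/bin/sh
# half_procs is a hypothetical helper: half of what nproc reports.
# nproc honors OMP_NUM_THREADS, so the result can be overridden from
# the environment without any extra option.
half_procs() {
    n=$(nproc) || return 1
    half=$((n / 2))
    # Assumption for this sketch: never report fewer than one worker.
    [ "$half" -ge 1 ] || half=1
    echo "$half"
}
```

Running it with OMP_NUM_THREADS=8 exported should give 4 regardless of the hardware, while "nproc --all" reports the installed processors and ignores the variable.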