On Sat, Jul 18, 2015 at 8:30 PM, Mark Maimone <maim...@gmail.com> wrote:

> So I tried -m instead of --xargs, and that got really confusing. Why
> do so many threads get so few arguments? (-X seems to produce output
> similar to -m since I don't do any argument substitution.)
>
> What I'd prefer to see is full core utilization, with each core
> processing #args/#cores arguments (limited by the max command line
> length limit of course). Am I missing some simple syntax?
GNU Parallel reads arguments one at a time. With -m/-X it executes the
command when it has accumulated a full line (i.e. the max command line
size). If, however, it hits EOF, it takes this last, partial line and
splits its arguments between all cores as #args_on_last_line/#cores.
This is why you see so many threads getting so few arguments. Currently
there is no support for reading all arguments up front and splitting
them into #args/#cores chunks limited by the max command line length.

> On a related note, is there an easy way to group arguments together to
> ensure they run on one thread?

If the groups are a fixed number of arguments: -N.

> I suppose I could use commas instead
> of whitespace and then prune the commas out, but I might hit the shell
> command line length limit so I'm just wondering if there's already a
> way to specify that.

It should be safe to use {= s/,/ /g =} as the replacement string. Just
be aware that GNU Parallel quotes the spaces, so you may need to prepend
your command with 'eval':

  echo /bin/bash,/bin/ls | parallel eval wc {= s/,/ /g =}

--colsep might also work for you:

  echo /bin/bash,/bin/ls | parallel --colsep , wc

> Finally, I notice in the -m output below that some of the first dozen
> arguments are repeated multiple times. Bug or feature?

Bug: https://savannah.gnu.org/bugs/?45575

/Ole