On Sat, Jul 18, 2015 at 8:30 PM, Mark Maimone <maim...@gmail.com> wrote:

> So I tried -m instead of --xargs, and that got really confusing. Why
> do so many threads get so few arguments? (-X seems to produce output
> similar to -m since I don't do any argument substitution.)
>
> What I'd prefer to see is full core utilization, with each core
> processing #args/#cores arguments (limited by the max command line
> length limit of course). Am I missing some simple syntax?
GNU Parallel reads arguments one at a time. With -m/-X it executes the
command when it has accumulated a full line (i.e. the max command line
size). If, however, it hits EOF, it takes this last, partial line and
splits its arguments between all cores as #args_on_last_line/#cores.
This is why you see so many threads getting so few arguments. Currently
there is no support for reading all arguments up front and splitting
them into #args/#cores chunks limited by the max command line length.

> On a related note, is there an easy way to group arguments together to
> ensure they run on one thread?

If the groups are a fixed number of arguments: -N.

> I suppose I could use commas instead
> of whitespace and then prune the commas out, but I might hit the shell
> command line length limit so I'm just wondering if there's already a
> way to specify that.

It should be safe to use {= s/,/ /g =} as the replacement string. Just
be aware that GNU Parallel quotes the spaces, so you may need to prepend
your command with 'eval':

  echo /bin/bash,/bin/ls | parallel eval wc {= s/,/ /g =}

--colsep might also work for you:

  echo /bin/bash,/bin/ls | parallel --colsep , wc

> Finally, I notice in the -m output below that some of the first dozen
> arguments are repeated multiple times. Bug or feature?

Bug: https://savannah.gnu.org/bugs/?45575

/Ole