On Mon, Aug 3, 2015 at 12:03 AM, Ole Tange <ta...@gnu.org> wrote:

> On Sun, Aug 2, 2015 at 2:36 AM, Schweiss, Chip <c...@innovates.com> wrote:
>
> > The problem with that is that parallel will start execution on the
> > parent folder before the child process is finished.
>
> Ahh. Yes.
>
> One solution is to find the max depth.
> Run all for that depth using GNU Parallel.
> Do the same for depth-1, and so on down to depth 1.
>

That sounds like the winner! This has a few inefficiencies (each depth
has to finish completely before the next one starts, as noted below),
but for the most part it fits the problem quite well.
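
For anyone who finds this in the archives, here is a rough sanity check
of the approach on a throwaway tree. The test paths and the echo
stand-in for do_stuff are made up for illustration; this assumes GNU
find (where a missing path argument defaults to .) and GNU Parallel:

# Hypothetical test tree, just for illustration
mkdir -p /tmp/depthtest/a/b/c /tmp/depthtest/x/y
cd /tmp/depthtest
# s:/:/:g returns the number of slashes in each path, i.e. its depth
MAX=$(find | perl -ne '$a=s:/:/:g;$max=$a>$max?$a:$max;END{ print $max+1 }')
# Deepest level first; -I D keeps {} free for the inner parallel
seq $MAX -1 1 | parallel -j1 -I D 'find . -mindepth D -maxdepth D | parallel echo processing {}'

Every child prints before its parent (./a/b/c before ./a/b before ./a),
which is exactly the ordering the real do_stuff needs.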

Thanks for the suggestion!

-Chip


>
> This way you should have very little time wasted as you will
> parallelize over different subdirs at the same level. You will only
> have wasted time at the end of each depth. This should work:
>
> # Find the maxdepth
> MAX=$(find | perl -ne '$a=s:/:/:g;$max=$a>$max?$a:$max;END{ print $max+1 }')
> # For each depth (D) in MAX..1:
> #   Find files/dirs at depth D and do_stuff on them in parallel
seq $MAX -1 1 | parallel -j1 -I D 'find . -mindepth D -maxdepth D | parallel do_stuff {}'
>
>
> /Ole
>
