Hi,

First of all, thanks for parallel. It's a great tool!

I have a use case with a large, dynamically generated workload, and piping
that input to parallel via stdin takes a long time (a couple of hours).
Because it takes so long, I really want to use --bar to see progress and
ETA, but then parallel warns:

> parallel: Warning: Reading NNNN arguments took longer than 10 seconds
> parallel: Warning: Consider removing --bar

This happens because with --bar or --eta, parallel first reads all input
lines so that it knows the total number of jobs, which it needs to
calculate the percentage and ETA.

In my case I do know the total upfront though, so I could simply pass that
number into parallel. I did a quick patch and it works nicely:

- added a "--total N" option that expects an integer
- if set, sub total_jobs() uses that value in preference to counting the
  input (or any of the other cases it handles)
- the rest follows automatically
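
To illustrate the idea outside of parallel's own code, here is a minimal
Python sketch (not parallel's Perl implementation). The names total_jobs
and eta_seconds and the ETA formula are assumptions made for this example
only. It shows why a caller-supplied total avoids draining the input
before progress can be reported:

```python
from typing import Iterable, Iterator, Optional, Tuple


def total_jobs(args: Iterable[str],
               total: Optional[int] = None) -> Tuple[int, Iterator[str]]:
    """Return (job count, iterator over the job arguments)."""
    if total:
        # Caller-supplied total: trust it and leave the input unread.
        # As with the truthiness check in the patch, 0 falls through.
        return total, iter(args)
    # No total given: the whole input must be read before a count exists.
    buffered = list(args)
    return len(buffered), iter(buffered)


def eta_seconds(done: int, total: int, elapsed: float) -> float:
    """Estimate remaining seconds from the average time per finished job."""
    if done <= 0:
        return float("inf")  # nothing finished yet, no estimate possible
    return elapsed / done * max(total - done, 0)
```

With a known total, the first call returns immediately and the jobs can
start while the generator is still producing input; without it, nothing
starts until the generator has finished.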

Below is a quick patch against version "GNU parallel 20220822":

1706c1706,1707
< ("debug|D=s" => \$opt::D,
---
> ("total=i" => \$opt::total,
> "debug|D=s" => \$opt::D,
8818c8819,8821
< if($opt::sqlworker) {
---
> if($opt::total) {
>    $self->{'total_jobs'} = $opt::total;
> } elsif($opt::sqlworker) {

Not sure what the right contribution process is, so I thought I would start
with a mail on this list.

Cheers,
Alexander Klimetschek
