Re: Parallel script with variable name

2019-02-28 Thread Ernst, Kevin
Hi Lindsey,

If I’m understanding you correctly, then {} (a pair of curly braces) is what 
you want. It’s the “variable” that holds the names of the things coming in on 
parallel’s standard input: it takes a different value for each input line, 
one per invocation of lots_of_stuff.

The “parallel way” would look like this:

find *filepattern* | parallel lots_of_stuff


which implicitly means

find *filepattern* | parallel lots_of_stuff {}


which can be simplified to:

parallel lots_of_stuff ::: *filepattern*


…since the shell will expand the wildcards—also called “globs”—for you. This 
also saves a process (because there is no pipe), which is not a super-huge deal 
in this simple example, but it’s, like, points for style. A complete reference 
for how this “filename expansion” happens is 
here.
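
To make the {} idea concrete, here is a minimal sketch of how the file name reaches the script. It assumes a hypothetical script.sh standing in for your long script, and two made-up input files, a.dat and b.dat. Whatever parallel substitutes for {} arrives in the script as its first positional argument, "$1". The -k flag only keeps the output in input order, and the serial loop is a fallback in case GNU Parallel is not installed.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the long script: the file name arrives as "$1".
workdir=$(mktemp -d)
cat > "$workdir/script.sh" <<'EOF'
#!/usr/bin/env bash
filename=$1    # the value GNU Parallel substituted for {}
echo "lots_of_stuff on ${filename##*/}"
EOF
touch "$workdir/a.dat" "$workdir/b.dat"    # made-up input files

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One invocation of script.sh per file; -k keeps output in input order.
    output=$(parallel -k bash "$workdir/script.sh" {} ::: "$workdir"/*.dat)
else
    # Serial fallback making the same per-file call.
    output=$(for f in "$workdir"/*.dat; do bash "$workdir/script.sh" "$f"; done)
fi
echo "$output"
rm -rf "$workdir"
```

Inside your real script you would use "$1" (or a variable assigned from it) wherever the serial version used $filenames.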

In most cases, piping the output of ls is fine, but in general it’s more 
conservative to let the shell itself do wildcard expansion, or to use find 
rather than ls, as discussed here (tl;dr: “word splitting”). Parsing the 
output of ls can be problematic… for reasons you are sure to run into 
someday, if you haven’t already.
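
A small sketch of the word-splitting problem, using two made-up file names, one of which contains a space. The unquoted $(ls ...) form is the classic failure; reading ls output line by line (as ls | parallel does) survives spaces, but still breaks on file names that contain newlines, which is why the glob is the safer habit.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/two words.txt"    # made-up file names

# Unquoted command substitution: the shell splits the ls output on whitespace,
# so "two words.txt" is seen as two separate items.
ls_count=0
for f in $(ls "$workdir"); do ls_count=$((ls_count + 1)); done

# Glob expansion: each file name stays a single word, spaces and all.
glob_count=0
for f in "$workdir"/*; do glob_count=$((glob_count + 1)); done

echo "ls: $ls_count items, glob: $glob_count items"    # ls: 3 items, glob: 2 items
rm -rf "$workdir"
```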

If you haven’t checked them out already, the videos the author posted to 
YouTube are really excellent, and not too long. You can get lots of insight 
about GNU Parallel (from basic use cases that you can employ immediately, all 
the way up to some mind-bending possibilities) with just about 30 minutes of 
your time.

If you ever lose the URL to the YouTube playlist, it’s also mentioned in the 
man page.

Hope this helps.

—Kevin



Re: Parallel script with variable name

2019-02-27 Thread Shlomi Fish
Hi Ms. Fenderson,

On Thu, 28 Feb 2019 12:20:13 +1030
Lindsey Fenderson wrote:

> Hi,
> 
> I'm very new to using GNU parallel, so this is probably a simple question
> but I haven't been able to figure out from the resources online how to do
> the following:
> 
> I have a very long script that I am running over multiple input files and I
> would like to parallelize this process. So for instance,
> 
> ls *filepattern* | parallel script.sh
> 
> seems to work as far as iterating over the files. However, I need to use
> the filename as a variable. So in my serial script I was doing this:
> 
> ls *filepattern* > filenames
> while read filenames; do
>   lots_of_stuff_$filenames
> done < filenames
> 
> So how can I get gnu parallel to incorporate the current filename it is
> using as a variable in my script?
> 

See:

* https://perl.plover.com/varvarname.html

* https://en.wikipedia.org/wiki/Associative_array

* https://en.wikipedia.org/wiki/Command-line_interface#Arguments

Regards,

Shlomi

> Thanks



-- 
-
Shlomi Fish   http://www.shlomifish.org/
“So, who the hell is Qoheleth?” - http://shlom.in/qoheleth

Right to bear arms? In Soviet Russia, we have right to whole bear.
— http://is.gd/EU4puV

Please reply to list if it's a mailing list post - http://shlom.in/reply .



Parallel script with variable name

2019-02-27 Thread Lindsey Fenderson
Hi,

I'm very new to using GNU parallel, so this is probably a simple question
but I haven't been able to figure out from the resources online how to do
the following:

I have a very long script that I am running over multiple input files and I
would like to parallelize this process. So for instance,

ls *filepattern* | parallel script.sh

seems to work as far as iterating over the files. However, I need to use
the filename as a variable. So in my serial script I was doing this:

ls *filepattern* > filenames
while read filenames; do
  lots_of_stuff_$filenames
done < filenames

So how can I get gnu parallel to incorporate the current filename it is
using as a variable in my script?

Thanks