Re: /usr/bin/env

2016-12-23 Thread Russell Coker via luv-main
On Friday, 23 December 2016 8:13:02 PM AEDT Andrew Mather via luv-main wrote:
> It's not uncommon in scientific computing to need multiple versions of
> compilers and various bits of software compiled against a range of
> different libraries and the like.  You have to retain old versions of
> software, often long past its use-by date in case someone queries a
> scientific paper based on using that particular version.

If you are particularly concerned about such things you wouldn't want to have 
a system where lots of different versions of the software were installed side 
by side.  You would want a VM/chroot image with the exact software in 
question.  The amount of storage space isn't an issue by today's standards.  A 
plain text representation of a human DNA scan is 3G which is probably larger 
than the complete OS and all software needed to analyse it.

But if you really want to reproduce things you need a copy of the same 
hardware (different releases of CPU families can give different floating point 
answers etc) and the same OS kernel.

I've heard a lot of scientific computing people talk about a desire to 
reproduce calculations, but I haven't heard them talking about these issues so 
I presume that they haven't got far in this regard.

http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970

Not that it matters, minor issues like these pale into insignificance when 
compared to the above.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: /usr/bin/env

2016-12-23 Thread Andrew Mather via luv-main
>
> Message: 5
> Date: Fri, 23 Dec 2016 15:12:05 +1100
> From: Craig Sanders <c...@taz.net.au>
> To: Luv Main <luv-main@luv.asn.au>
> Subject: Re: /usr/bin/env
> Message-ID: <20161223041205.qbkvfszdiy4tp...@taz.net.au>
> Content-Type: text/plain; charset=us-ascii
>
> On Fri, Dec 23, 2016 at 02:44:28PM +1100, Andrew Mather wrote:
> > Module files are generally set up by the admins, so they don't require
> > anything more from the user than including the appropriate loading
> > statements in their scripts.  It's not unlike a wrapper script really.
>
> it sounds similar to (but quite a bit more advanced than) what i've done in
> the past with wrapper scripts and collections of environment setting files
> sourced (#included) as needed.
>

Yep. Pretty much.

It's not uncommon in scientific computing to need multiple versions of
compilers and various bits of software compiled against a range of
different libraries and the like.  You have to retain old versions of
software, often long past its use-by date in case someone queries a
scientific paper based on using that particular version.

By using a chain of module load commands the user can easily set up an
environment very different from the current OS state (apart from the kernel
itself), repeatably, across an entire cluster if needs be.

They can even swap environments around between various steps in a script if
that is needed.

Obviously it's overkill for some requirements and won't suit everyone's
Modus Operandi, but well worth knowing about if that's the sort of thing
you need to do.
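
A chain like the one described above might look roughly like this in a job
script. This is only a sketch: it assumes an Lmod or Environment Modules
installation that provides the `module` shell function, and the module names
and versions are made up for illustration.

```shell
# Sketch only: module names/versions below are hypothetical examples.
if command -v module >/dev/null 2>&1; then
    module purge                      # start from a clean, known environment
    module load gcc/4.9.3             # compiler used for the original analysis
    module load openmpi/1.8.8         # library built against that compiler
    module list                       # record exactly what is loaded
    # ... first step of the job runs here ...
    module swap gcc/4.9.3 gcc/5.4.0   # different toolchain for a later step
    modules_available=yes
else
    # No modules system on this machine, so nothing is changed.
    echo "module command not found; nothing loaded" >&2
    modules_available=no
fi
```

Because the load statements sit in the script itself, the environment is
written down next to the work that depends on it rather than living in
someone's login files.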


-- 
-
 https://picasaweb.google.com/107747436224613508618
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
"Voting is a lot like going to Bunnings really:
You walk in confused, you stand in line, you have a sausage on the way out and
at the end, you wind up with a bunch of useless tools"
Joe Rios
-


Re: /usr/bin/env

2016-12-23 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 06:02:54PM +1100, russ...@coker.com.au wrote:
> Just because it is documented to work that way doesn't mean it's a good idea
> to do it.

the issue isn't about bash and '-e', it's about env breaking the ability to
pass options on the #! line. '-e' is just a trivial illustrative example.

with bash it's only a minor annoyance because most (or maybe all, i can't
remember) options can be enabled with a 'set' command inside the script
anyway.  For other languages, it can break the script entirely or, worse,
change the script's behaviour in subtle and "interesting" ways.


> cd $DIR
> rm -rf *

-e isn't a replacement for defensive programming around potentially dangerous
things, it's just a way to avoid uglifying your code by adding exit-status
checks after every trivial command. an uncaught non-zero exit code will abort
the script.

a saner, or more defensive, way to write that would be:

  cd "$DIR" && rm -rf *

or 

  cd "$DIR" \
    && rm -rf * \
    || exit 1

and it's worthwhile doing that (including quoting the $DIR variable) whether
you use 'bash -e', 'set -e' in the script, or neither.
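
To make the difference concrete, here is a small sketch of the defensive form
wrapped in a function and exercised in a throwaway directory. The function name
clean_dir and the scratch paths are hypothetical, not something from the thread.

```shell
# Hypothetical helper: empty a directory only if we can actually enter it.
clean_dir() {
    dir=$1
    # Refuse an empty argument so "" can never mean "wherever we happen to be".
    [ -n "$dir" ] || { echo "clean_dir: no directory given" >&2; return 1; }
    # Subshell: the cd does not leak to the caller, and rm only runs if cd worked.
    ( cd "$dir" && rm -rf ./* )
}

# Case 1: the directory exists, so its contents are removed.
scratch=$(mktemp -d)
touch "$scratch/a" "$scratch/b"
clean_dir "$scratch"
after_clean=$(ls -A "$scratch")

# Case 2: the target is missing. We are sitting in a directory holding a file
# we care about; the failed cd means rm is never run, so the file survives.
touch "$scratch/keep"
missing_status=0
( cd "$scratch" && clean_dir "$scratch/no-such-dir" ) 2>/dev/null || missing_status=$?
```

With the unguarded two-line version, case 2 would have deleted "keep".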

craig

--
craig sanders 


Re: /usr/bin/env

2016-12-22 Thread Russell Coker via luv-main
On Friday, 23 December 2016 3:07:09 PM AEDT Craig Sanders via luv-main wrote:
> On Fri, Dec 23, 2016 at 02:35:06PM +1100, Russell Coker wrote:
> > Putting the -e in the first line of the shell script is considered bad
> > practice anyway.
> 
> that's debatable. some think it's bad practice. some think it's using bash
> as it's documented to work.

Just because it is documented to work that way doesn't mean it's a good idea 
to do it.

cd $DIR
rm -rf *

For example, if a script has the above 2 lines then a failure to change 
directory will be catastrophic (and it's something I've seen in production more 
than once).  If you are writing your own scripts then you can avoid such 
things, but in a typical sysadmin team it's best to have "set -e" near the top 
of scripts.

> As mentioned, the problem is even worse if used with other languages. e.g
> the following perl script works and produces useful (and expected) output:

Agreed.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/



Re: /usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 02:44:28PM +1100, Andrew Mather wrote:
> Module files are generally set up by the admins, so they don't require
> anything more from the user than including the appropriate loading
> statements in their scripts.  It's not unlike a wrapper script really.

it sounds similar to (but quite a bit more advanced than) what i've done in
the past with wrapper scripts and collections of environment setting files
sourced (#included) as needed.

craig

--
craig sanders 


Re: /usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 02:35:06PM +1100, Russell Coker wrote:
> Putting the -e in the first line of the shell script is considered bad
> practice anyway.

that's debatable. some think it's bad practice. some think it's using bash as
it's documented to work.

'bash -e' was just a simple example (and it's easy enough to just have 'set
-e' in the script itself), not the beginning or end of the problem.
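
A minimal sketch of the 'set -e' alternative, assuming a throwaway script
written to a temporary file. The script is run through an explicitly named
shell (bash if present, otherwise sh) so that the sketch does not depend on the
temporary directory allowing execution.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/usr/bin/env bash
# The option lives in the script body, so it does not depend on the #! line.
set -e
echo "before"
false            # non-zero exit status: with set -e the script stops here
echo "after"     # never reached
EOF

interp=$(command -v bash || command -v sh)
rc=0
out=$("$interp" "$demo") || rc=$?
rm -f "$demo"

echo "output: $out"
echo "status: $rc"
```

This only covers options that can be switched on from inside the script, which
is the case for bash's -e but not for the interpreter options discussed next.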


As mentioned, the problem is even worse if used with other languages. e.g. the
following perl script works and produces useful (and expected) output:

#!/usr/bin/perl -p

s/foo/bar/;


This doesn't:

#!/usr/bin/perl

s/foo/bar/;


for those not familiar with perl, `-p` tells perl to wrap the entire script
(aside from a few exclusions like BEGIN and END blocks) in a while/read/print
loop on STDIN+filename args, so the former script actually runs as if the code
is more like this (slightly simplified):

#!/usr/bin/perl

while (<>) {
  s/foo/bar/;
  print "$_";
}

see `man perlrun` for more details. and note that it gets even more
complicated when other options are also used - e.g. see the example given for
'#!/usr/bin/perl -pi.orig'



> If correct operation of the script requires aborting on error then you don't
> want someone debugging it with "bash -x scriptname" to accidentally stop
> that.

as with most things, there are pros and cons to that. sometimes you want -x to
override -e, sometimes you don't. and you can always "override the override"
by running 'bash -x -e scriptname'

the point is that #!/usr/bin/env breaks the documented behaviour of
interpreters, preventing scripts from being run with any options that they may
require.

craig

--
craig sanders <c...@taz.net.au>


Re: /usr/bin/env

2016-12-22 Thread Andrew Mather via luv-main
>
> Message: 5
> Date: Fri, 23 Dec 2016 14:04:39 +1100
> From: Craig Sanders <c...@taz.net.au>
> To: Luv Main <luv-main@luv.asn.au>
> Subject: Re: luv-main Digest, Vol 64, Issue 15
> Message-ID: <20161223030439.alqgnoiollf6v...@taz.net.au>
> Content-Type: text/plain; charset=us-ascii
>
> On Fri, Dec 23, 2016 at 01:06:47PM +1100, Andrew Mather wrote:
> > We use the "modules" environment (TACC's lmod implementation
> > specifically) for this type of thing.
> >
> > https://www.tacc.utexas.edu/research-development/tacc-projects/lmod
> >
> > It allows multiple versions of packages to exist without library
> > collisions and so on.  Loading the appropriate modules allows the user
> > to set up the execution environment and even swap between versions if
> > necessary.
>
> this looks interesting. i'll have to read more about it but at first
> sight it seems like a specific language and system for setting up the
> environment for a particular program/script.
>
> one of the main reasons i prefer wrapper scripts (or symlinks) is that
> they don't rely on undocumented and unknown settings (PATH, LDPATH, etc)
> in some random individual's environment. scripts document those settings
> explicitly. symlinks just use the standard system environment.
>
> this has the huge benefit of NOT relying on fallible human memory,
> resulting in reproducible, auditable, and easily debugged software usage.
> also avoids seemingly random breakage from changes to the environment -
> "why doesn't this work? it ran perfectly 3 weeks ago when i last ran it."
>
> i'm kind of surprised that a language with the slogan "explicit is better
> than implicit" is one of the main perpetrators of the #!/usr/bin/env
> abomination.
>
As Rodney mentioned, modules, or some variation thereof is quite common in
HPC environments.

Module files are generally set up by the admins, so they don't require
anything more from the user than including the appropriate loading
statements in their scripts.  It's not unlike a wrapper script really.

Through some of the other commands available, it also allows for querying
of what modules and versions are available and what particular packages
actually do.  Our users are slowly getting used to this and beginning to
check before asking for packages to be installed.

One other advantage is that if configured, modules allows for logging,
which can help in software management.






Re: /usr/bin/env

2016-12-22 Thread Russell Coker via luv-main
Putting the -e in the first line of the shell script is considered bad practice 
anyway. If correct operation of the script requires aborting on error then you 
don't want someone debugging it with "bash -x scriptname" to accidentally stop 
that.

On 23 December 2016 2:14:55 am LHDT, Craig Sanders via luv-main 
<luv-main@luv.asn.au> wrote:
>On Fri, Dec 23, 2016 at 01:37:11AM +1100, Craig Sanders wrote:
>> one of the worst problems with doing it is that it breaks the ability
>> to pass command-line options to the interpreter in the #! line - e.g.
>> '#!/bin/bash -e' works, but with '#!/usr/bin/env bash -e' the '-e' is
>> ignored by bash.
>
>that's not quite true. it's not that bash ignores the '-e', it's that
>env tries to run a non-existent program called 'bash -e'
>
>e.g.
>
>$ cat foo.bash
>#!/usr/bin/env bash -e
>
>echo foo
>
>$ ./foo.bash
>/usr/bin/env: ‘bash -e’: No such file or directory
>
>
>IMO, that's an unmistakable signal that env was not intended to be used
>in this way.
>
>
>
>IIRC some versions of env (not the one in GNU coreutils, which is
>installed on almost every linux system - with some embedded or
>busybox/tinybox systems being the exceptions) will still run the correct
>interpreter but fail to pass on any options.
>
>craig
>
>--
>craig sanders <c...@taz.net.au>



Re: /usr/bin/env

2016-12-22 Thread Russell Coker via luv-main
Why not use systemd-nspawn to create virtual environments for every 
distribution that people want to use? It doesn't do anything significant that 
chroot didn't do but it's an easy way of doing it all including bind mounts for 
/home etc.

If you run multiple versions of python etc on a single OS image then you either 
have python running with versions of libc etc that it didn't get distribution 
developer testing on or you have a lot of hackery to get multiple versions of 
libc etc (which has potential for inconsistent results).

On my laptop I have i386 and amd64 versions of the last few Debian releases 
running under systemd-nspawn for supporting older releases. Apart from the 
wheezy libc not running with a 4.x kernel everything is fine.


Re: /usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 12:26:54PM +1100, Sean Crosby wrote:
> > that's one of the things that symlinks are for.
> >
> > e.g. I have python2.6, 2.7, 3.1, 3.2, 3.4, and 3.5 all installed in
> > /usr/bin, with symlinks python & python2 pointing to 2.7, and python3
> > pointing to 3.5
> 
> All well and good if you're root

and if you're not root, you can do the same things in ~/bin or edit
the #! line of your script to point to your preferred interpreter.

if it's not your script and you can't edit it, then either accept that
it's going to be run with a system interpreter or write a wrapper script
in ~/bin to call it with your preferred interpreter.
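
A rough sketch of such a wrapper, built in a temporary directory so it can be
tried safely. Everything named here is a stand-in: the temporary directory
plays the part of ~/bin, "third-party-script" is the script you cannot edit,
"report" is the wrapper, and sh is used in place of your preferred interpreter.

```shell
bindir=$(mktemp -d)                     # stands in for ~/bin
interp=$(command -v sh)                 # stands in for the preferred interpreter
target="$bindir/third-party-script"     # stands in for the uneditable script

printf '%s\n' 'echo "run by wrapper"' > "$target"

# The wrapper names the interpreter explicitly, so the target's own #! line
# is never consulted. "$@" forwards any arguments unchanged.
cat > "$bindir/report" <<EOF
#!/bin/sh
exec "$interp" "$target" "\$@"
EOF
chmod +x "$bindir/report"

# Run via sh here so the sketch works even where the temp dir is noexec;
# in real use the wrapper would sit in ~/bin and be run by name.
wrapped_out=$(sh "$bindir/report")
echo "$wrapped_out"
```

The choice of interpreter is then recorded in one small file instead of
depending on whatever PATH the caller happens to have.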


> Yes but with the software our students use, they repackage python into
> a self contained directory, under the version of the software

they should edit the #! line then. it's not hard, and it avoids making
an unmaintainable mess.

> Hence why /usr/bin/env python is great.

it's not great. it's a mistake arising from inadequate understanding or
knowledge of existing tools and practices.

craig

--
craig sanders <c...@taz.net.au>


Re: /usr/bin/env

2016-12-22 Thread Sean Crosby via luv-main
On 23 December 2016 at 10:58, Craig Sanders via luv-main <
luv-main@luv.asn.au> wrote:

> On Fri, Dec 23, 2016 at 08:11:15AM +1100, Sean Crosby wrote:
> > I've taken to using /usr/bin/env a bit more because of the max length
> > limit in shebang lines. We store newer versions of Ruby, Python etc
> > on a separate filesystem, where there are many versions of these
> > directories, and they are hidden down quite far in the dirtree. So we
> > regularly hit the max shebang length limit of 128 characters.
>
> that's one of the things that symlinks are for.
>
> e.g. I have python2.6, 2.7, 3.1, 3.2, 3.4, and 3.5 all installed in
> /usr/bin, with symlinks python & python2 pointing to 2.7, and python3
> pointing to 3.5
>

All well and good if you're root


>
> python scripts have either a specific versioned binary name in the #!
> line or just #!/usr/bin/python or #!/usr/bin/python2 for the latest
> python 2.x or #!/usr/bin/python3 for the latest python 3.x. at some
> point in the future, python3 will become the default python and
> /usr/bin/python will point to it.
>

Yes but with the software our students use, they repackage python into a
self contained directory, under the version of the software

e.g.

/foo/bar/v1.1/external/python/bin/python
/foo/bar/v1.2/external/python/bin/python

Even though the python versions might be the same, when you set up your
environment to be for v1.1 of the package or v1.2 (which changes
LD_LIBRARY_PATH, PATH etc), the python version/modules change location.
Hence why /usr/bin/env python is great.

Sean


Re: /usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 08:11:15AM +1100, Sean Crosby wrote:
> I've taken to using /usr/bin/env a bit more because of the max length
> limit in shebang lines. We store newer versions of Ruby, Python etc
> on a separate filesystem, where there are many versions of these
> directories, and they are hidden down quite far in the dirtree. So we
> regularly hit the max shebang length limit of 128 characters.

that's one of the things that symlinks are for.

e.g. I have python2.6, 2.7, 3.1, 3.2, 3.4, and 3.5 all installed in
/usr/bin, with symlinks python & python2 pointing to 2.7, and python3
pointing to 3.5

that's all managed automatically by the system python packages.

if i ever need a custom compiled python 3.x or whatever, I can either
make a package the same way or install it under /usr/local and create
and/or update the symlink as needed.

or i can compile and install it anywhere and make a specific symlink
(e.g. /usr/local/bin/python.custom) pointing to it - avoiding the 128
character #! limit.
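
As an illustration of the symlink approach, the sketch below builds a
deliberately deep directory tree and gives the interpreter inside it a short
name. The directory names and the "interp.custom" link are invented for the
example, and a link to sh stands in for a real custom-built interpreter.

```shell
root=$(mktemp -d)

# Build a path well past 128 characters (hypothetical layout).
deep="$root"
i=0
while [ "$i" -lt 12 ]; do
    deep="$deep/some-long-directory-name"
    i=$((i + 1))
done
mkdir -p "$deep/bin"
ln -s "$(command -v sh)" "$deep/bin/interp"   # stand-in for the real interpreter

# A short, stable name pointing at the deep location.
short="$root/interp.custom"
ln -s "$deep/bin/interp" "$short"

echo "deep path length:  ${#deep}"
echo "short path length: ${#short}"
```

A script can then use the short name on its #! line, and moving the
interpreter later only means updating one symlink.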


python scripts have either a specific versioned binary name in the #!
line or just #!/usr/bin/python or #!/usr/bin/python2 for the latest
python 2.x or #!/usr/bin/python3 for the latest python 3.x. at some
point in the future, python3 will become the default python and
/usr/bin/python will point to it.

and the scripts work exactly the same, using the exact same interpreter
(with the exact same set of library modules) no matter who runs them or
what environment they're run in (e.g. from a shell, or from cron, or a
web server).  Consistency and predictability are important.  As is
manual control/override where needed.


similarly, I have ruby1.9.1, ruby2.0, ruby2.1, ruby2.2, and ruby2.3 in
/usr/bin, with /usr/bin/ruby a symlink pointing to ruby2.3

craig

ps: to me, using #!/usr/bin/env is just a variant of something that i've
hated ever since my first unix sysadmin job (actually, before that when
I was just a user or programmer) - important things should not be buried
in a programmer's home directory and dependent on their idiosyncratic
(and undocumented) environment settings. that's fine for your own tools
and hacks and dev/testing versions, but when any such program moves
beyond being a personal tool, it needs to be integrated into the system
so that it works consistently for everyone who uses it.


--
craig sanders <c...@taz.net.au>


Re: /usr/bin/env

2016-12-22 Thread Sean Crosby via luv-main
On 23 December 2016 at 01:37, Craig Sanders via luv-main <
luv-main@luv.asn.au> wrote:

>
> the one argument in favour of doing this (that the script will be run by
> the first matching interpreter found in the PATH) is both a blessing and
> a curse. at best it's a minor convenience. at worst, it's a potential
> security risk - it's not an accident or an oversight that every unix
> system since the #! line was invented DOESN'T search $PATH for the
> interpreter.


I've taken to using /usr/bin/env a bit more because of the max length limit
in shebang lines. We store newer versions of Ruby, Python etc on a separate
filesystem, where there are many versions of these directories, and they
are hidden down quite far in the dirtree. So we regularly hit the max
shebang length limit of 128 characters.

Sean


Re: /usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 01:37:11AM +1100, Craig Sanders wrote:
> one of the worst problems with doing it is that it breaks the ability
> to pass command-line options to the interpreter in the #! line - e.g.
> '#!/bin/bash -e' works, but with '#!/usr/bin/env bash -e' the '-e' is
> ignored by bash.

that's not quite true. it's not that bash ignores the '-e', it's that
env tries to run a non-existent program called 'bash -e'

e.g.

$ cat foo.bash
#!/usr/bin/env bash -e

echo foo

$ ./foo.bash
/usr/bin/env: ‘bash -e’: No such file or directory


IMO, that's an unmistakable signal that env was not intended to be used
in this way.
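
The same failure can be reproduced without writing a script. On Linux the
kernel hands everything after the interpreter path to env as a single
argument, which is equivalent to quoting it on the command line. A small
sketch (sh is used for the working case only so that it runs on systems
without bash):

```shell
# What '#!/usr/bin/env bash -e' amounts to: one argument, "bash -e".
status=0
msg=$(env 'bash -e' -c 'echo foo' 2>&1) || status=$?
echo "exit status: $status"      # POSIX specifies 127 for "command not found"
echo "message:     $msg"

# Typed at a prompt the words are split by the shell first, so this works,
# which is why the problem only shows up through the #! line.
split_out=$(env sh -e -c 'echo foo')
echo "split form:  $split_out"
```

Newer versions of GNU coreutils (8.30 and later) and FreeBSD's env have a -S
option that splits such a string, but that postdates this thread and cannot be
relied on everywhere.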



IIRC some versions of env (not the one in GNU coreutils, which is
installed on almost every linux system - with some embedded or
busybox/tinybox systems being the exceptions) will still run the correct
interpreter but fail to pass on any options.

craig

--
craig sanders <c...@taz.net.au>


/usr/bin/env

2016-12-22 Thread Craig Sanders via luv-main
On Fri, Dec 23, 2016 at 12:57:48AM +1100, Andrew McGlashan wrote:
> #!/usr/bin/env bash

please don't promote that obnoxious brain-damage. it's bad enough seeing the
#!/usr/bin/env disease on sites like stackexchange (where at least they have
the excuse of catering to non-linux systems - and even there it's broken,
because env isn't guaranteed to be in /usr/bin on all systems anyway) but bash
will be /bin/bash on every linux system that exists, and always will be.

there are lots of good reasons why abusing /usr/bin/env like this is a bad
idea at:

http://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my

(see also the Linked and Related Q on the RHS of the page)

the one argument in favour of doing this (that the script will be run by the
first matching interpreter found in the PATH) is both a blessing and a curse.
at best it's a minor convenience. at worst, it's a potential security risk -
it's not an accident or an oversight that every unix system since the #! line
was invented DOESN'T search $PATH for the interpreter.
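
A small sketch of why the PATH search cuts both ways. The name "python-demo"
and the temporary directory are made up for the example; the point is only
that whichever directory comes first in PATH decides what gets run.

```shell
fake_dir=$(mktemp -d)

# A stand-in "interpreter" planted in a user-writable directory.
cat > "$fake_dir/python-demo" <<'EOF'
#!/bin/sh
echo "not the interpreter you expected"
EOF
chmod +x "$fake_dir/python-demo"

# With that directory first in PATH, a PATH lookup (which is what env does)
# resolves the name to the planted copy. The PATH change is confined to the
# command substitution's subshell.
resolved=$(PATH="$fake_dir:$PATH"; command -v python-demo)
echo "resolved to: $resolved"
```

An absolute path on the #! line gives the same answer regardless of who runs
the script or what their PATH contains.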


one of the worst problems with doing it is that it breaks the ability to pass
command-line options to the interpreter in the #! line - e.g. '#!/bin/bash -e'
works, but with '#!/usr/bin/env bash -e' the '-e' is ignored by bash.

this is bad enough for bash, but worse for other scripting languages where
passing command-line options to the #! interpreter is routine (like sed,
awk, perl) or required (like make, which requires '-f' on the #! line of an
executable make script).

also, env messes with ARGV[0] which can make the script difficult or
impossible to find with ps


'#!/usr/bin/env interpreter' - brought to you by the people who think
that 'curl http://randomwebsite/path/to/script | sudo bash' is a good
way to install software.

craig

--
craig sanders <c...@taz.net.au>