Danek Duvall wrote:
> On Thu, Jun 07, 2007 at 12:06:34AM +0200, Roland Mainz wrote:
> > Does Sun have any coding (style) guidelines/notes for Bourne/Korn/POSIX
> > shell scripts ?
> 
> No, only C (and, by some extension, C++) and Java.  Shell, perl, and python
> are missing.
> 
> > A quick review of some standard scripts like "bfu", "acr", some of the
> > SMF/Dtrace shell scripts and other stuff reveals some IMO not very "nice"
> > (where "nice" is more or less an understatement) constructs...
> 
> Oh, yes.  Definitely.

Ok... lets start (this is mainly about Korn shell/POSIX scripts but some
things apply to older shells like Bourne, too) ...

---- SNIP ---- SNIP ----
* Use "print" and not "echo" for output.
"print" is guranteed to be a ksh builtin while "echo" may not be a
builtin (causing extra |fork()|+|exec()|), additionally "echo" may not
be portable (e.g. SysV vs. BSD "echo" differences)


* Avoid things like $ echo "foo" >xxx ; echo "bar" >>xxx ; echo "baz"
>>xxx #
Each of the redirections above trigger an
|open()|,|write()|,|close()|-sequence. It is much more efficient group
the rediction into a block, e.g. $ { echo "foo" ; echo "bar" ; echo
"baz" } >xxx #


* Use typed variables if possible. For example the following is very
inefficient since it transforms the integer values to strings and back
several times:
-- snip --
a=0
b=1
c=2
# more code
if [ $a -lt 5 -o $b -gt c ] ; then do_something ; fi
-- snip --
This could be rewritten using ksh constructs:
-- snip --
integer a=0
integer b=1
integer c=2
if (( a < 5 || b > c )) ; then do_something ; fi
-- snip --


* Use ksh style functions instead of Bourne-style functions if possible
(and local variables instead of spamming the global namespace)


* Use builtin math ksh instead of "expr" or "bc"/"dc"


* Use printf "%a" when passing floating-point values between scripts or
as outout of a function to avoid rounting errors when converting between
bases.
Example:
-- snip --
function xxx
{
    float val

    (( val=sin(5.) ))
    printf "%a\n" val
}
float out
(( out=$(xxx) ))
xxx
print -- $out
-- snip --
This will print:
-- snip --
-0.9589242747
-0x1.eaf81f5e09933226af13e5563bc6p-01
-- snip --


* _ALWAYS_ put variables into quotes when handling filenames, even if
the values are hardcoded or the values appear to be fixed. Otherwise at
least two things may go wrong:
- a malicious user may be able to exploit a script's inner working to
infect his/her own code
- a script may (fatally) misbehave for unexpected input (e.g. file names
with blanks and/or special symbols which are interpreted by the shell)


* Avoid the creation of temporary files and store the values in
variables intead.
Example:
-- snip --
ls -1 >xxx
for i in $(cat xxx) ; do
    do_something ;
done
-- snip --
can be replaced with
-- snip --
x="$(ls -1)"
for i in ${x} ; do
    do_something ;
done
-- snip --
Note that ksh93 supports binary variables which can hold any value.


* Use builtin commands if the shell provides them. For example ksh93s+
(ksh93, version 's+') delivered with Solaris supports the following
builtins:
-- snip --
basename
cat
chgrp
chmod
chown
cmp
comm
cp
cut
date
dirname
expr
fds
fmt
fold
getconf
head
id
join
ln
logname
mkdir
mkfifo
mv
paste
pathchk
rev
rm
rmdir
stty
tail
tee
tty
uname
uniq
wc
sync
-- snip --
Those builtins can be enabled via $ builtin name_of_builtin # in shell
scripts (note that ksh93 builtins implement exact POSIX behaviour - some
commands in Solaris /usr/bin/ directory implement pre-POSIX behaviour.
Add /usr/xpg6/bin/:/usr/xpg4/bin before /usr/bin/ ${PATH} to test
whether your script works with the XPG6/POSIX versions)


* Use blocks and not subshells if possible, e.g. use $ { print "foo" ;
print "bar" ; } instead of $ (print "foo" ; print "bar") # - blocks are
faster since they do not require to save the subshell context (ksh93) or
trigger a shell child process (Bourne shell, bash, ksh88 etc.)


* Using -p" for starting non-interactive shell scipts is AFAIK a NO-OP,
e.g.
-- snip --
#!/bin/ksh -p
-- snip --
should be replaced with
-- snip --
#!/bin/ksh
-- snip --


* Store lists in arrays or associative arrays - this is usually easier
to manage. For example:
-- snip --
x="
/etc/foo
/etc/bar
/etc/baz
"
echo $x
-- snip --
can be replaced with
-- snip --
typeset -a mylist
mylist[0]="/etc/foo"
mylist[1]="/etc/bar"
mylist[2]="/etc/baz"
print ${x[*]}
-- snip --
or (ksh93-style append entries to a normal (non-associative) array)
-- snip --
typeset -a mylist
mylist+=( "/etc/foo" )
mylist+=( "/etc/bar" )
mylist+=( "/etc/baz" )
print ${x[*]}
-- snip --


* Use "[[ expr ]]" instead of "[ expr ]" if possible since it avoids
going through the whole pattern expansion/etc. machinery


* Use $( ... ) instead of `...` - `...` is an obsolete construct in ksh
scripts


* Make sure that yoour shell has a "true" builtin (like ksh93) when
executing endless loops like $ while true ; do do_something ; done # -
otherwise each loop cycle runs a |fork()|+|exec()|-cycle to run
/bin/true


* Always put the result of $( ... ) in quotes (e.g. "$( ... )" ) unless
there is a very good reason for not doing it


* Scripts should always set their PATH to make sure they do not use
alternative commands by accident


* Scripts should make sure that commands in optional packages are really
there, e.g. add a "precheck" block in scipts to avoid later failure when
doing the main job


* Check how boolean values are used in your application. For example:
-- snip --
mybool=0
# do something
if [ $mybool -eq 1 ] ; then do_something_1 ; fi
-- snip --
could be rewritten like this:
-- snip --
mybool=false # (valid values are "true" or "false", pointing
# to the builtin equivalents of /bin/true or /bin/false)
# do something
if ${mybool} ; then do_something_1 ; fi
-- snip --
or
-- snip --
integer mybool=0 # values are 0 or 1
# do something
if (( mybool==1 )) ; then do_something_1 ; fi
-- snip --


* Names of local variables which are not exported to the environment
should be lowercase while variable names which are exported to the
environment should be uppercase.
The only exception are global constants (=global readonly variables,
e.g. $ float -r M_PI=3.14159265358979323846 # (taken from <math.h>))
which may be allowed to use uppercase names, too.


* Think about whether your application has to handle file names or
variables in multibyte locales and make sure all commands used in your
script can handle such charatcers (e.g. lots of commands in Solaris's
/usr/bin/ are _not_ able to handle such values - either use ksh93
builtin constructs (which are guranteed to be multibyte-aware) or
commands from /usr/xpg4/bin and/or /usr/xpg6/bin)


* Use external filters like grep/sed/awk/etc. only if a significant
amount of data is processed by the filter. For example:
-- snip --
if [ "$(echo "$x" | egrep '.*foo.*)" != "" ] ; then
    do_something ;
done
-- snip --
can be re-written using ksh93 builtin constructs, saving several
|fork()|+|exec()|'s:
-- snip --
if [[ "${x}" == ~(E).*foo.* ]] ; then
    do_something ;
done
-- snip --


* Always use '{'+'}' when using variable names longer than one character
(to avoid problems with array or compund variables:
-- snip --
print $foo
-- snip --
should be rewritten to
-- snip --
print ${foo}
-- snip --


* Use "print -- ${varname}" when there is the slightest chance that the
variable "varname" may contain symbols like "-". Or better use "printf"
instead, for example
-- snip --
integer fx
# do something
print $fx
-- snip --
may fail if "f" contains a negative value. A better way may be to use
-- snip --
inetger fx
# do something
print "%f" fx
-- snip --


* Use $ export FOOBAR=val # instead of $ FOOBAR=val ; export FOOBAR # -
this is much faster


* Use compound variables or associative arrays to group similar
variables together.
For example:
-- snip --
box_width=56
box_height=10
box_depth=19
echo "${box_width} ${box_height} ${box_depth}"
-- snip --
could be rewritten to ("associative array"-style)
-- snip --
typeset -A -E box=( "width"=56 "height"=10 depth=19 )
print -- "${box[width"]} ${box["height"]} ${box["depth"]}"
-- snip --
or ("compound variable"-style
-- snip --
box=(
    float width=56
    float height=10
    float depth=19
    )
print -- "${box.width} ${box.height} ${box.depth}"
-- snip --


* Use a subshell (e.g. $ ( mycmd ) #) around places which use "set --
$(mycmd)" and/oror "shift" unless the variable affected is either a
local one or if it's guranteed that this variable will no longer be used
(be carefull for loadable functions, e.g. ksh/ksh93's "autoload" !!!!)


* Cleanup after yourself. For example ksh/ksh93 have an EXIT trap which
is very usefull for this.
Note that the EXIT trap is executed for a subshell and each subshell
level can run it#s own EXIT trap, for example
$ (trap "print bam" EXIT ; (trap "print snap" EXIT ; print "foo")) #
will print
-- snip --
foo
snap
bam
-- snip --


* Put constant values into readonly variables, for example:
-- snip --
float -r M_PI=3.14159265358979323846
-- snip --
or
-- snip --
float M_PI=3.14159265358979323846
readonly M_PI
-- snip --


* If you create more than one temporary file create an unique subdir for
these files and make sure the dir is writeable. Make sure you cleanup
after yourself (unless you are debugging).


* Be carefull with using TABS in script code, they are not portable
between editors or platforms


* if youhave multiple points where your application exists with an error
message create a central function for this, e.g.
-- snip --
if [ -z "$tmpdir" ] ; then
        print -u2 "mktemp failed to produce output; aborting."
        exit 1
fi
if [ ! -d $tmpdir ] ; then
        print -u2 "mktemp failed to create a directory; aborting."
        exit 1
fi
-- snip --
should be replaced with
-- snip --
function fatal_error
{
    print -u2 "${progname}: $@"
    exit 1
}
# do something (and save ARGV[0] to variable "progname")
if [ -z "$tmpdir" ] ; then
        fatal_error "mktemp failed to produce output; aborting."
fi
if [ ! -d "$tmpdir" ] ; then
        fatal_error "mktemp failed to create a directory; aborting."
fi
-- snip --


* Think about using $ set +o unset # by default (or at least during the
script's development phase) to catch errors where variables are used
when they are not set (yet), e.g.
-- snip --
$ (set +o unset ; print ${foonotset})
/bin/ksh93: foonotset: parameter not set
-- snip --
---- SNIP ---- SNIP ----


Uhm... better I create a DocBook/XML document from that... this list
will become very long (above is only from "bfu", "acr" etc. but there
are many many more scripts in OS/Net which could be improved a lot...)

... CC:'ing Michelle Olson: Michelle... is there any preferred
layout/style or something which I could as "template" for the language
being used ?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)

Reply via email to