On Friday 13 February 2009 14:55, Lee, Gary D. wrote:
>The following script runs perfectly when run from the # prompt logged in as
> root on our sles9 sp4 instance. However, when run in the startup process as
> S16starttapes for instance, the second drive does not add, and the message
> that it does add is also printed. Also, the output numbers from the fgreps
> do not show up in the log file. How do I solve this one?
I'm not exactly sure what's going wrong here, but being a shell script geek I
can give you some pointers on how to find out and make this script more
robust. Please forgive me if I get a bit pedantic about this, but I have
strong opinions about shell-script programming style formed over way too many
years of writing these things. And I like to teach. :-)
I'll start with some general comments:
1) You don't need to export all your variables after setting them. Exporting
places the shell variable into the environment of all sub-processes invoked
from the shell, so you only need to do that if you're going to run some
program that will look for that variable in its environment. All the
variables you're setting in here are only used within this script, so you
don't have to export any of them.
2) It's good form to do "command substitution" using the $(...) notation
instead of the `...` backtick notation. Backticks are a hold-over from the
original Bourne shell, are difficult to see when reading the code, require
internal quotes to be escaped and don't nest. The $(...) notation solves all
those problems.
3) It's always a good idea to check for errors on all commands, even the
simple ones you wouldn't expect to fail. Your script fails at boot time
because something is different from when you run it from your root shell.
Perhaps the environment is different, or some filesystems are not mounted?
We can find out with better error reporting written into the script.
4) Write shell functions to do common things, such as write messages to your
logfile, or report errors, or add a device. Reusing code is good, just as in
any programming language.
5) I'm impressed to see comments in your script! Not many people do that, but
most scripts are unmaintainable without some explanation of why they do
things.
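To make points 1 and 2 concrete, here is a small sketch. The variable names are made up for the illustration and are not from your script. It shows that a child process sees only exported variables, and that $(...) nests without the backslashes the backtick form needs.

```shell
# Hypothetical variable names, used only for this illustration.
private_var="shell only"
exported_var="in environment"
export exported_var

# A child process sees only the exported variable.
sh -c 'echo "private=[${private_var}] exported=[${exported_var}]"'
# prints: private=[] exported=[in environment]

# $(...) nests as-is; the backtick form needs its inner backticks escaped.
nested=$(basename "$(dirname /proc/scsi/IBMtape)")
old_style=`basename \`dirname /proc/scsi/IBMtape\``
echo "$nested $old_style"
# prints: scsi scsi
```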
So I would write the assignment you wrote as:
>e1b="`fgrep -c 3590 /proc/scsi/IBMtape` ";export e1b
>echo $e1b 3590 drives detected >>/root/tapeadd.log
like so:
    e1b="$(fgrep -c 3590 /proc/scsi/IBMtape)"
    # fgrep exits with 1 when nothing matches, so only a status above 1 is an error.
    if [ $? -le 1 ]
    then Log $e1b 3590 drives detected
    else Error Failed to count 3590 tape devices
    fi
The Log and Error functions are pretty trivial:
    # Append all arguments to a log file.
    LOGFILE=/root/tapeadd.log
    Log()
    {
        echo "$@" >> "$LOGFILE"
    }
    # Log the arguments as an error and exit with a non-zero value.
    Error()
    {
        Log ERROR: "$@"
        exit 1
    }
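One detail to keep in mind when testing the status of fgrep -c: the grep family exits with status 1 when nothing matched, even though -c still prints 0, and uses a status above 1 for real errors such as a missing file. The sketch below shows one way to tell the two apart. CountMatches is a hypothetical helper name, the sample file contents are invented, and I use "grep -F", which is equivalent to fgrep.

```shell
# Stand-in for /proc/scsi/IBMtape; the contents are made up for this demo.
sample=$(mktemp)
printf '%s\n' 'tape0 3592' 'tape1 3592' > "$sample"

# Print the number of lines in file $2 that contain the string $1.
# "No match" counts as success; only a grep status above 1 is an error.
CountMatches()
{
    local num rc
    rc=0
    num=$(grep -F -c "$1" "$2" 2>/dev/null) || rc=$?
    [ "$rc" -le 1 ] || return "$rc"
    echo "$num"
}

CountMatches 3592 "$sample"    # prints 2
CountMatches 3590 "$sample"    # prints 0 and still succeeds
CountMatches 3590 /nonexistent/file || echo "real error"
rm -f "$sample"
```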
I'd also write a function that adds a device, so that you can do more error
checking there:
    # Add an FCP device. The arguments are the virtual device number,
    # the WWPN and the LUN of the device to be added, without any
    # leading "0x". Returns only if successful, exits with an error otherwise.
    AddDevice()
    {
        local dev unit_add
        dev="/sys/bus/ccw/drivers/zfcp/0.0.$1"
        if [ -e "$dev" ]
        then unit_add="$dev/0x$2/unit_add"
             if [ -e "$unit_add" ]
             then if echo "0x$3" > "$unit_add"
                  then Log Drive on device $1 added successfully
                  else Error Could not add device $1
                  fi
             else Error Device $1 does not have WWPN 0x$2
             fi
        else Error Device $1 is not present
        fi
    }
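If you want to see what those three checks log without touching real hardware, something like the following sketch may help. It is only a dry run under assumptions of mine: ZFCP_ROOT and AddDeviceDemo are hypothetical names, the directory tree is a fake built with mktemp, and the second WWPN and the device number 0999 are invented so that the error paths fire. The checks are the same as in AddDevice, written as guard clauses to keep the demo short.

```shell
# Build a throwaway directory that mimics the zfcp sysfs layout.
root=$(mktemp -d)
LOGFILE="$root/tapeadd.log"
mkdir -p "$root/0.0.0402/0x500507630f594801"
: > "$root/0.0.0402/0x500507630f594801/unit_add"

Log()   { echo "$@" >> "$LOGFILE"; }
Error() { Log ERROR: "$@"; exit 1; }

# Same checks as AddDevice, but rooted at $ZFCP_ROOT (an assumption made
# for this demo) instead of /sys/bus/ccw/drivers/zfcp.
ZFCP_ROOT="$root"
AddDeviceDemo()
{
    local dev unit_add
    dev="$ZFCP_ROOT/0.0.$1"
    [ -e "$dev" ] || Error Device $1 is not present
    unit_add="$dev/0x$2/unit_add"
    [ -e "$unit_add" ] || Error Device $1 does not have WWPN 0x$2
    echo "0x$3" > "$unit_add" || Error Could not add device $1
    Log Drive on device $1 added successfully
}

# Run each case in a subshell so that Error's exit does not end the demo.
(AddDeviceDemo 0402 500507630f594801 0000000000000000) || true
(AddDeviceDemo 0402 ffffffffffffffff 0000000000000000) || true
(AddDeviceDemo 0999 500507630f594801 0000000000000000) || true
cat "$LOGFILE"
# Remove the scratch directory afterwards with: rm -rf "$root"
```

The log should end up with one success line followed by the "does not have WWPN" and "is not present" errors, which are the messages you would look for in /root/tapeadd.log after a boot.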
Using functions like those above should make it easier to make sure the script
works consistently, and to tell what is wrong when it does not.
You mentioned two problems that occurred when this is run as an rc-script at
boot time. The first is that the second drive does not get added yet the
message claiming that it did get added is logged. I'm not sure why the
device is not getting added (perhaps that WWPN isn't known?), but using the
AddDevice() function above should tell you what went wrong. I can tell you
why the message was logged: your test [ $rv2 -lt 2 ] failed, possibly because
$rv2 is empty. If you do a numeric comparison and one operand is not a
number, the test prints an error and returns a non-zero status, which "if"
treats as false. So it didn't write out your error message and exit, it fell
through to writing out your success message.
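Here is a small sketch of that fall-through, assuming the check in your script has roughly the shape shown, which I am inferring from the symptoms. IsNumber is a hypothetical helper name. It shows one way to reject an empty or non-numeric value before comparing it.

```shell
rv2=""    # what $(...) yields when the command inside printed nothing

# Assumed shape of the original check: complain when fewer than 2 drives.
# With an empty operand the test itself errors out, so the else branch runs.
if [ "$rv2" -lt 2 ] 2>/dev/null
then result="error branch"
else result="fell through to success message"
fi
echo "$result"
# prints: fell through to success message

# Hypothetical helper: succeed only for a non-empty string of digits.
IsNumber()
{
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

if IsNumber "$rv2" && [ "$rv2" -ge 2 ]
then echo "enough drives"
else echo "count missing or too low"
fi
# prints: count missing or too low
```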
Your second problem is that the numbers from the fgreps are not showing up in
the log file. That's most likely because fgrep got an error, so it wrote nothing
to its standard output and your variables got set to an empty string. You're
not capturing the error output from fgrep so you can't tell why it failed.
We should change those commands to be like this:
    e1b="$(fgrep -c 3590 /proc/scsi/IBMtape 2>>"$LOGFILE")"
so you will collect the fgrep errors in the log. I suspect the problem is
that /proc/scsi/IBMtape doesn't exist. Perhaps your rc-script is running
before /proc gets mounted? I doubt that, but you might want to explicitly
check for the existence of that pseudo-file before reading it. Here's
another place we should use a function:
    # Count the tape devices of the type specified by the argument.
    CountTapes()
    {
        local num rc
        if [ -e /proc/scsi/IBMtape ]
        then num="$(fgrep -c "$1" /proc/scsi/IBMtape 2>>"$LOGFILE")"
             rc=$?
             # fgrep exits with 1 when it finds no matches, which is not an error.
             if [ $rc -le 1 -a -n "$num" ]
             then Log $num $1 drives detected
             else Error Failed to count $1 tape devices
             fi
        else Error No tape devices known
        fi
        echo "$num"
    }
The main code of your script would then start out something like this (but
with comments):
    # Error exits only the $(...) subshell, so check the status here too.
    e1b=$(CountTapes 3590) || exit 1
    ts1120=$(CountTapes 3592) || exit 1
    if [ "$ts1120" -eq 0 ]
    then AddDevice 0402 500507630f594801 0000000000000000
    ...
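One thing to watch when a function that can call exit is invoked through $(...): command substitution runs in a subshell, so the exit ends only that subshell and the calling script keeps going. The sketch below uses a stand-in function, Fails, a hypothetical name, to show this.

```shell
# Stand-in for a function that calls Error: print nothing, exit non-zero.
Fails()
{
    exit 1
}

# The exit inside $(...) ends only the subshell; this script continues.
if value=$(Fails)
then status=0
else status=$?
fi
echo "still running, status=$status value=[$value]"
# prints: still running, status=1 value=[]

# Stopping the whole script needs an explicit check by the caller, e.g.:
#   e1b=$(CountTapes 3590) || exit 1
```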
Actually, AddDevice() really should be checking to be sure the device appears
in /proc/scsi/IBMtape, but I don't know the format of that file off-hand so I
can't write the code to check for that.
Hopefully, all this will help you get more information about what is happening
during boot-time, so that you can find out exactly what is going wrong. I'll
stop now because this has gotten way too long.
- MacK.
-----
Edmund R. MacKenty
Software Architect
Rocket Software
275 Grove Street · Newton, MA 02466-2272 · USA
Tel: +1.617.614.4321
Email: [email protected]
Web: www.rocketsoftware.com