Re: [ast-developers] Best way(s) to track down a sporadic bug in ksh?

Rocky Bernstein Sun, 01 Mar 2009 09:38:39 -0800

Since there don't seem to be any other takers, I'll mention general ksh
debugging. Some assembly will be required before this might be useful for
your situation.

You mention set -x tracing, so you may want to set PS4 to something nicer
than the default. For ksh I set mine to:

  (${.sh.file}:${LINENO}): ${.sh.fun} - [${.sh.subshell}]

As mentioned back in
comp.unix.shell<http://groups.google.com/group/comp.unix.shell/browse_frm/thread/87c3f728476ed29d>,
one can turn set -x tracing on and off by putting it in a signal handler, e.g.:

  PS4='(${.sh.file}:${LINENO}): ${.sh.fun} - [${.sh.subshell}] '
  trap 'set -x' USR1 ; trap 'set +x'  USR2
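
Here is a minimal, self-contained sketch of the same signal toggle, written in portable shell so that it also runs outside ksh. The simplified PS4, the work function and the sum variable are assumptions made up for illustration; in ksh you would keep the PS4 shown above.

```shell
#!/bin/sh
# Sketch only: toggle xtrace from outside the script by sending signals.
# This PS4 is deliberately portable; the ${.sh.*} variables are ksh93-only.
PS4='+ (line ${LINENO}): '

trap 'set -x' USR1   # kill -USR1 <pid> turns tracing on
trap 'set +x' USR2   # kill -USR2 <pid> turns tracing off

# Hypothetical stand-in for the real work of the script.
work() {
    sum=$((sum + $1))
}

sum=0
work 1            # runs untraced
kill -USR1 $$     # same effect as "kill -USR1 <pid>" from another terminal
work 2            # traced on stderr
kill -USR2 $$
work 3            # untraced again
echo "sum=$sum"   # prints sum=6
```

In a long-running script you would drop the two kill lines and send the signals from another shell only when the process looks stuck, so the servers that never hang pay nothing for the tracing.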

A more sophisticated variant of this is to install the ksh
debugger<http://github.com/rocky/kshdb/>
and then on interrupt you can go into that or use routines from that to
customize whatever you want to do in your trap handler. For example if you
put something like this at the top of the file:

  trap '_Dbg_debugger' USR1
  # assuming /usr/local/share/kshdb is where this is installed
  . /usr/local/share/kshdb/dbg-trace.sh

sending the process a USR1 signal will have it go into the debugger.
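
If the same script runs on machines where the debugger may not be installed, the trap can be guarded. This is only a sketch: the dbg_dir and usr1_action variables are hypothetical names, and the fallback to plain set -x is my assumption about what you would want instead.

```shell
#!/bin/sh
# Sketch only: use the debugger when kshdb is present, otherwise fall back
# to plain xtrace. dbg_dir is an assumed install location; adjust as needed.
dbg_dir=/usr/local/share/kshdb

if [ -r "$dbg_dir/dbg-trace.sh" ]; then
    # Same two steps as above: install the trap, then load the debugger code.
    trap '_Dbg_debugger' USR1
    . "$dbg_dir/dbg-trace.sh"
    usr1_action=debugger
else
    # No debugger available: USR1 just switches tracing on.
    trap 'set -x' USR1
    usr1_action=xtrace
fi

echo "USR1 handler: $usr1_action (send with: kill -USR1 $$)"
```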

I tried:
   trap '. /usr/local/share/kshdb/dbg-trace.sh; _Dbg_debugger' USR1

but for reasons I don't understand this doesn't work. Actually, it seemed to
crash ksh with:

  /tmp/kread.sh[5]: _Dbg_hook[155]: _Dbg_process_commands[91]: eval[1]: rerun: not found [No such file or directory]
  /tmp/kread.sh: error: cannot find myself; rerun with an absolute file name
  Memory fault

Debugging doesn't slow the program down until you run _Dbg_debugger, which
installs a DEBUG trap hook.

The above makes sense only if you have access to a tty where you can then give
interactive commands and see the results. There is a debugger command "set
inferior tty" which allows you to specify a tty for debugger input and output.
(The bash debugger has a command that lets you redirect output to a file, but
I haven't ported that over.)

If you want to run a one-shot list of debugger commands, create a file with
debugger commands and then pass it to dbg-trace.sh with option -x. For
example, in file mykshdb.cmds put:

 list
 where
 continue # or perhaps "quit" or "kill"

And then source dbg-trace.sh as was done above:
 . /usr/local/share/kshdb/dbg-trace.sh -x mykshdb.cmds
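
Put together, the one-shot setup might look like the sketch below. The temporary-directory location and the cmds and dbg_trace variable names are assumptions for illustration; only the three debugger commands and the -x option come from the description above.

```shell
#!/bin/sh
# Sketch only: write the debugger command file, then hand it to dbg-trace.sh.
# Paths and variable names are assumed; adjust them to your installation.
cmds=${TMPDIR:-/tmp}/mykshdb.cmds
dbg_trace=/usr/local/share/kshdb/dbg-trace.sh

# The quoted 'EOF' keeps the shell from expanding anything in the commands.
cat > "$cmds" <<'EOF'
list
where
continue
EOF

if [ -r "$dbg_trace" ]; then
    # Arguments after the file name are passed to the sourced script in ksh.
    . "$dbg_trace" -x "$cmds"
else
    echo "kshdb not found at $dbg_trace; wrote $cmds only" >&2
fi
```

Using an absolute path for the command file avoids surprises when the script changes directory before the debugger is entered.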

Disclaimer: I don't use the ksh debugger that much and really don't know
anyone who does. The above is more or less the extent of what I understand, so
beyond that you are in uncharted waters.

Good luck!

. /usr/local/share/kshdb/dbg-trace.sh -x /tmp/kread.cmd

On Thu, Feb 26, 2009 at 4:19 PM, <[email protected]> wrote:

>  I have encountered a sporadic bug which I believe is in ksh (not in my
> ksh scripts).  I'm trying to track this bug down so I can make a proper bug
> report.  I believe it's caused by using a construct like this in a "filter"
> script:
>
> /bin/cat FILE | while read a b c ;do
>    blah && break
>    echo $a $b $c
> done
>
> What I see in the process list is that the cat is hung.  What I believe is
> happening is that some parent process which is reading the stdout of this
> process is somehow blocking, which is causing this process to block, which
> causes /bin/cat to block.  I can see in the lsof output that there is a
> single reader of the stdout of /bin/cat.
>
> This is all complicated by several layers of dotted (sourced) scripts,
> which don't show up in the process list.  This bug does not occur in ksh88
> running on AIX or Solaris, only in ksh93 running on Linux.  It doesn't
> always occur in the same place (the bottom level script is the same, but the
> caller can be different from occurrence to occurrence).  Additionally, it
> occurs only sometimes when it is run, making me think it involves some kind
> of resource issue or race condition.
>
> I have not yet had the chance of viewing this live while it's hung (it
> hangs in the middle of the night and it gets killed before I'm able to see
> what's going on).  If I can see it live, I plan to run strace -p PID on
> the parent process (the reader of the stdin/stdout pipeline) to see if
> anything interesting shows up, but I don't know what to do after that.
>
> I'd like to avoid putting "set -x" in the code, if possible (because the
> code is quite messy and gets run on many servers which don't experience the
> hang).
>
> Any thoughts on how I can tell what's actually happening to cause this
> hang?  Or what would be useful for a bug report?  Any particular tools which
> are useful other than ps/lsof/strace/vim?
>
> Thanks!
> -- John Wiersba
>
> _______________________________________________
> ast-developers mailing list
> [email protected]
> https://mailman.research.att.com/mailman/listinfo/ast-developers
>
>

