*Synopsis*: ksh93 signal handling is broken

CR 6782948 changed on Jan 16 2009 by <User 1-5Q-1267>

=== Field ============ === New Value ============= === Old Value =============

Introduced in Release  solaris_nevada                                         
====================== =========================== ===========================

     
*Change Request ID*: 6782948

*Synopsis*: ksh93 signal handling is broken

  Product: solaris
  Category: shell
  Subcategory: korn93
  Type: Defect
  Subtype: 
  Status: 1-Dispatched
  Substatus: 
  Priority: 2-High
  Introduced In Release: solaris_nevada
  Introduced In Build: 
  Responsible Engineer: 
  Keywords: 

=== *Description* ============================================================
i'm currently running opensolaris snv_99.
i noticed some of my shell scripts hanging when run under ksh93.
the problem seems to be centered around signal trap and child process
handling within ksh93.  i managed to boil my scripts down to an easily
reproducible test case.

you can reproduce the problem by running the following script:
---8<---
#!/bin/ksh

foo() { exit 0; }
trap foo EXIT

/bin/yes | while read yes; do
        (/bin/date)
done

---8<---

then in another window run the following command:
---8<---
while :; do kill -WINCH <script pid>; done
---8<---

this will quickly result in a failure.  i've seen the shell
process emit error messages, hard hang (requiring a kill -9),
and/or core dump.

here's the stack trace from my originally hung shell script:
---8<---
<email address omitted>$ pstack 15252
15252:  /bin/ksh -p /home/edp/work/bin/explorer_extract
 ff2c5910 waitid   (7, 0, ffbfdad8, f)
 0002993c job_wait (3ba2, 57400, 57800, 0, 1, f0) + 1b0
 00038090 sh_exec  (0, 57800, 0, 57800, 57800, 0) + 2198
 000371fc sh_exec  (8000, 58e538, 4, 57400, ffbfe0ac, 1) + 1304
 0003722c sh_exec  (58e0e0, 4, 9004, 3, 8004, 58e280) + 1334
 000324cc sh_funct (58e0e0, 56b56c, 4, 0, 0, 56b120) + 144
 00036450 sh_exec  (56b120, 4, 509208, 4e97f8, 509208, 57800) + 558
 0003722c sh_exec  (56b460, 4, 57400, 3, 57400, 56b410) + 1334
 00037708 sh_exec  (56acd8, 57800, 0, 57800, 57800, 57800) + 1810
 000371fc sh_exec  (8000, 56b480, 14, 57400, ffbff154, 1) + 1304
 0002d87c exfile   (80000, 57800, 57800, 100000, 57800, 57800) + 73c
 0002d110 main     (2f400, 57400, 4fd948, 57400, 57800, 57800) + a1c
 00019450 _start   (0, 0, 0, 0, 0, 0) + 108
---8<---

in the cases when ksh93 core dumps, the stack traces vary, but the
thing that they all have in common is a call to _ast_[cm]alloc(),
which seems to indicate some kind of memory corruption.  here's one
example stack trace, that shows the memory allocation function
being called from a signal handler.  (normally memory allocation
within signal handlers immediately indicates an application bug
since the libc memory allocation interfaces are not async-signal
safe, but ksh93 seems to provide its own memory allocation
interfaces, and i don't know if they are async-signal safe):
---8<---
core 'core.ksh93.348315.89769' of 348315:       /bin/ksh ./bug.sh
 fffffd7ffefc18db bestsearch () + 1ab
 fffffd7ffefc22a3 bestalloc () + 263
 fffffd7ffefc102a _ast_malloc () + 9a
 fffffd7fff0dc7b3 nv_putval () + 8b3
 fffffd7fff0c1468 sh_fault () + 88
 fffffd7fff2750b6 __sighndlr () + 6
 fffffd7fff2698ef call_user_handler () + 2ff
 fffffd7fff269af9 sigacthandler (14, 0, fffffd7fffdfae30) + c9
 --- called from signal handler with signal 20 (SIGWINCH) ---
 fffffd7ffef3404a dthash () + 7a
 fffffd7fff0e0b9b nv_search () + 6b
 fffffd7fff0e9931 path_spawn () + 91
 fffffd7fff0f98ed sh_ntfork () + 3bd
 fffffd7fff0f7127 sh_exec () + 2ff7
 fffffd7fff0f03d6 sh_subshell () + 596
 fffffd7fff0f5dfb sh_exec () + 1ccb
 fffffd7fff0f536c sh_exec () + 123c
 fffffd7fff0f7589 sh_exec () + 3459
 fffffd7fff0f5c44 sh_exec () + 1b14
 fffffd7fff0d98e6 exfile () + 766
 fffffd7fff0d910b sh_main () + 7ab
 0000000000400db1 main () + 21
 0000000000400c3c ???????? ()
---8<---

*** (#1 of 1): 2008-12-09 20:56:00 GMT+00:00 <User 1-5Q-4162>


=== *Public Comments* ========================================================

=== *Workaround* =============================================================

=== *Additional Details* =====================================================
        Targeted Release: 
        Commit To Fix In Build: 
        Fixed In Build: 
        Integrated In Build: 
        Verified In Build: 
  See Also: 
  Duplicate of: 
  Hooks:
        Hook1: 
        Hook2: 
        Hook3: 
        Hook4: 
        Hook5: 
        Hook6: 
  Program Management: 
  Root Cause: 
  Fix Affects Documentation: No
  Fix Affects Localization: No

=== *History* ================================================================
        Date Submitted: 2008-12-09 20:56:00 GMT+00:00
        Submitted By: <User 1-5Q-4162>

        Status Changed    Date Updated                  Updated By


=== *Service Request* ========================================================
        Impact: Significant
        Functionality: Secondary
        Severity: 3
        Product Name: solaris
        Product Release: solaris_nevada
        Product Build: 
        Operating System: snv_99
        Hardware: generic
        Submitted Date: 2008-12-09 20:56:00 GMT+00:00


=== *Multiple Release (MR) Cluster* - 0 ======================================

