*Synopsis*: ksh93 hangs in situations that ksh handles okay

CR 6631006 changed on Sep 17 2009 by <User 1-5Q-6276>

=== Field ============ === New Value ============= === Old Value =============

Public Comments        New Note                                               
====================== =========================== ===========================

     
*Change Request ID*: 6631006

*Synopsis*: ksh93 hangs in situations that ksh handles okay

  Product: solaris
  Category: shell
  Subcategory: korn93
  Type: Defect
  Subtype: 
  Status: 3-Accepted
  Substatus: 
  Priority: 2-High
  Introduced In Release: 
  Introduced In Build: 
  Responsible Engineer: <User 1-GN0KC>
  Keywords: 

=== *Description* ============================================================
This morning some of the elements in my $PATH were inaccessible due to
an offline NFS server.  When I logged in, my GNOME terminal window
didn't give me a shell prompt.  When I entered a ^C, the window went
away.

I then logged in as root and hid /usr/bin/ksh93, so that my login
scripts would use /usr/bin/ksh instead of /usr/bin/ksh93.  I then
logged in as myself, and my GNOME terminal window gave me the
expected prompt.

I then tried running ksh93 by hand; it was indeed stuck trying to
access one of the inaccessible directories:

    athyra$ truss -p 8666
    stat("/ws/onnv-tools/onbld/bin", 0xFFFFFFFF7FFFE588) (sleeping...)

This appears to be hard to recover from, since one usually needs a
functional shell before one can change one's shell.

ksh93 needs to be at least as robust as the Solaris ksh in
circumstances like this before it can replace the Solaris ksh.

And there's some question in my mind whether ksh93 should be the
default root shell if it hangs in situations like this.  (Though I
suppose it's questionable practice for root to have NFS directories in
its PATH.  So maybe this isn't a critical issue.)

*** (#1 of 2): 2007-11-16 18:04:11 GMT+00:00 <User 1-5Q-12482>

[dep, 15Apr2009]

  This is especially bad considering ksh93 is installed as /bin/sh,
  which means every system(3C) call will hang on startup, even when
  the command needs no PATH resolution beyond known local entries
  (which usually come first in one's PATH for this reason).

*** (#2 of 2): 2009-04-16 00:44:28 GMT+00:00 <User 1-5Q-4224>


=== *Public Comments* ========================================================
Are you able to reproduce it with build 111?

*** (#1 of 4): 2009-04-16 07:32:20 GMT+00:00 <User 1-1SURPB>

[dep, 16Apr2009]

  ksh93 appears to have the same behavior on build 112.

*** (#2 of 4): 2009-04-16 20:14:59 GMT+00:00 <User 1-5Q-4224>

[dep, 13Aug2009]

  (In response to an unnecessarily non-public comment claiming this has
  something to do with fancy stuck filesystem detection in Sun's ksh88,
  and that somehow caching file descriptors and using openat will
  magically solve the problem.)

  This is *NOT* a matter of Sun's ksh detecting stuck filesystems.
  This is a matter of ksh93 scanning your entire path on startup,
  whereas Sun's ksh (and more importantly, sh) simply did not.
  Period.

  My PATH:

    ; echo $PATH
    /home/dep/private/bin:/home/dep/bin/i386:/home/dep/bin:/usr/bin:/usr/sbin:/usr/openwin/bin:/usr/sfw/bin:/ws/onnv-tools/SUNWspro/SS11/bin:/ws/onnv-tools/SUNWspro/SOS8/bin:/ws/onnv-tools/onbld/bin:/ws/onnv-tools/onbld/bin/i386:/usr/ccs/bin:/usr/java/bin

  Eliminate effect of dot files:

    ; mkdir /tmp/foo
    ; HOME=/tmp/foo

  stats and opens from ksh (or /usr/xpg4/bin/sh):

    ; truss -t stat,open ksh
    stat64("/usr/bin/ksh", 0x08047608)              = 0
    open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
    stat64("/lib/libc.so.1", 0x08046E08)            = 0
    open("/lib/libc.so.1", O_RDONLY)                = 3
    stat64("/home/dep", 0x080476F0)                 = 0
    stat64(".", 0x08047780)                         = 0
    stat64("/home/dep", 0x08047720)                 = 0
    stat64(".", 0x080477B0)                         = 0
    stat64("/home/dep", 0x08047720)                 = 0
    stat64(".", 0x080477B0)                         = 0
    open64("", O_RDWR|O_APPEND|O_CREAT, 0600)       Err#2 ENOENT
    open64("/tmp/sh827332.1", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
    $

  stats and opens from sh:

    ; truss -t stat,open sh
    stat64("/sbin/sh", 0x08047610)                  = 0
    open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
    stat64("/lib/libc.so.1", 0x08046E10)            = 0
    open("/lib/libc.so.1", O_RDONLY)                = 3
    $

  stats and opens from ksh93:

    ; truss -t stat,open ksh93
    stat64("/usr/bin/ksh93", 0x08047604)            = 0
    open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
    stat64("/lib/libc.so.1", 0x08046E04)            = 0
    open("/lib/libc.so.1", O_RDONLY)                = 3
    open("/proc/self/auxv", O_RDONLY)               = 3
    stat("/usr/bin/amd64/ksh93", 0xFFFFFD7FFFDFF550) = 0
    open("/var/ld/64/ld.config", O_RDONLY)          Err#2 ENOENT
    stat("/lib/64/libc.so.1", 0xFFFFFD7FFFDFE9F0)   = 0
    open("/lib/64/libc.so.1", O_RDONLY)             = 3
    stat("/lib/64/libshell.so.1", 0xFFFFFD7FFFDFEBC0) Err#2 ENOENT
    stat("/usr/lib/64/libshell.so.1", 0xFFFFFD7FFFDFEBC0) = 0
    open("/usr/lib/64/libshell.so.1", O_RDONLY)     = 3
    stat("/lib/64/libcmd.so.1", 0xFFFFFD7FFFDFE730) Err#2 ENOENT
    stat("/usr/lib/64/libcmd.so.1", 0xFFFFFD7FFFDFE730) = 0
    open("/usr/lib/64/libcmd.so.1", O_RDONLY)       = 3
    stat("/lib/64/libast.so.1", 0xFFFFFD7FFFDFE2A0) Err#2 ENOENT
    stat("/usr/lib/64/libast.so.1", 0xFFFFFD7FFFDFE2A0) = 0
    open("/usr/lib/64/libast.so.1", O_RDONLY)       = 3
    stat("/lib/64/libm.so.2", 0xFFFFFD7FFFDFE730)   = 0
    open("/lib/64/libm.so.2", O_RDONLY)             = 3
    stat("/dev/null", 0xFFFFFD7FFFDFF180)           = 0
    stat("/home/dep", 0xFFFFFD7FFFDFF110)           = 0
    stat(".", 0xFFFFFD7FFFDFF190)                   = 0
    stat("/home/dep/private/bin", 0xFFFFFD7FFFDFF4C0) = 0
    open("/home/dep/private/bin/.paths", O_RDONLY)  Err#2 ENOENT
    stat("/home/dep/bin/i386", 0xFFFFFD7FFFDFF4C0)  = 0
    open("/home/dep/bin/i386/.paths", O_RDONLY)     Err#2 ENOENT
    stat("/home/dep/bin", 0xFFFFFD7FFFDFF4C0)       = 0
    open("/home/dep/bin/.paths", O_RDONLY)          Err#2 ENOENT
    stat("/usr/bin", 0xFFFFFD7FFFDFF4C0)            = 0
    open("/usr/bin/.paths", O_RDONLY)               Err#2 ENOENT
    stat("/usr/sbin", 0xFFFFFD7FFFDFF4C0)           = 0
    open("/usr/sbin/.paths", O_RDONLY)              Err#2 ENOENT
    stat("/usr/openwin/bin", 0xFFFFFD7FFFDFF4C0)    = 0
    open("/usr/openwin/bin/.paths", O_RDONLY)       Err#2 ENOENT
    stat("/usr/sfw/bin", 0xFFFFFD7FFFDFF4C0)        = 0
    open("/usr/sfw/bin/.paths", O_RDONLY)           Err#2 ENOENT
    stat("/ws/onnv-tools/SUNWspro/SS11/bin", 0xFFFFFD7FFFDFF4C0) = 0
    open("/ws/onnv-tools/SUNWspro/SS11/bin/.paths", O_RDONLY) Err#2 ENOENT
    stat("/ws/onnv-tools/SUNWspro/SOS8/bin", 0xFFFFFD7FFFDFF4C0) = 0
    open("/ws/onnv-tools/SUNWspro/SOS8/bin/.paths", O_RDONLY) Err#2 ENOENT
    stat("/ws/onnv-tools/onbld/bin", 0xFFFFFD7FFFDFF4C0) = 0
    open("/ws/onnv-tools/onbld/bin/.paths", O_RDONLY) Err#2 ENOENT
    stat("/ws/onnv-tools/onbld/bin/i386", 0xFFFFFD7FFFDFF4C0) = 0
    open("/ws/onnv-tools/onbld/bin/i386/.paths", O_RDONLY) Err#2 ENOENT
    stat("/usr/ccs/bin", 0xFFFFFD7FFFDFF4C0)        = 0
    open("/usr/ccs/bin/.paths", O_RDONLY)           Err#2 ENOENT
    stat("/usr/java/bin", 0xFFFFFD7FFFDFF4C0)       = 0
    open("/usr/java/bin/.paths", O_RDONLY)          Err#2 ENOENT
    open("/etc/ksh.kshrc", O_RDONLY)                = 3
    open("/tmp/foo/.kshrc", O_RDONLY)               Err#2 ENOENT
    open("", O_RDWR|O_APPEND|O_CREAT, 0600)         Err#2 ENOENT
    open("/tmp/astv6s.919", O_RDWR|O_APPEND|O_CREAT, 0600) = 3
        Received signal #18, SIGCLD, in waitid() [caught]
          siginfo: SIGCLD CLD_EXITED pid=827335 status=0x0000
    <email address omitted>:/home/dep$

  As you can see, even though nothing actually made use of my PATH,
  ksh93 performed a stat and open for each PATH element.  This
  preliminary scan of the path is costly and unnecessary, and makes
  ksh93 unusable in many situations.
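
  To make the cost concrete, here is a minimal sketch that lists the
  accesses the trace above shows for each PATH element (one stat of the
  directory, one open of its .paths file).  The function name
  scan_targets is hypothetical; it only prints what an eager scan would
  touch and is not ksh93's actual code.

```shell
# scan_targets PATHLIST   (hypothetical helper, for illustration only)
# Print the two filesystem accesses per PATH element seen in the ksh93
# trace: a stat of the directory and an open of its .paths file.
# Nothing is accessed on disk; the function only lists the targets.
scan_targets() {
    _rest=$1:
    while [ -n "$_rest" ]; do
        _dir=${_rest%%:*}           # first remaining element
        _rest=${_rest#*:}           # drop it from the list
        [ -n "$_dir" ] || _dir=.    # an empty element means the current directory
        echo "stat $_dir"
        echo "open $_dir/.paths"
    done
}

# Usage: scan_targets "$PATH"
```

  With the 13-element PATH shown earlier this lists 26 accesses, and
  any one of them can block if its directory is on an unreachable NFS
  server.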

  Even bash doesn't do this (it searches PATH to find itself, but only
  uses as much as it needs):

    ; truss -t open,stat bash
    stat64("/usr/bin/bash", 0x08047608)             = 0
    open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
    stat64("/lib/libcurses.so.1", 0x08046E08)       = 0
    open("/lib/libcurses.so.1", O_RDONLY)           = 3
    stat64("/lib/libsocket.so.1", 0x08046E08)       = 0
    open("/lib/libsocket.so.1", O_RDONLY)           = 3
    stat64("/lib/libnsl.so.1", 0x08046E08)          = 0
    open("/lib/libnsl.so.1", O_RDONLY)              = 3
    stat64("/lib/libdl.so.1", 0x08046E08)           = 0
    open("/lib/libdl.so.1", O_RDONLY)               = 3
    stat64("/lib/libc.so.1", 0x08046E08)            = 0
    open("/lib/libc.so.1", O_RDONLY)                = 3
    open64("/dev/tty", O_RDWR|O_NONBLOCK)           = 3
    stat64("/dev/pts/0", 0x08047810)                = 0
    open64("/var/run/name_service_door", O_RDONLY)  = 3
    stat64("/home/dep", 0x08047690)                 = 0
    stat64(".", 0x08047720)                         = 0
    stat64(".", 0x080476C0)                         = 0
    stat64("/home/dep/private/bin/bash", 0x080475C0) Err#2 ENOENT
    stat64("/home/dep/bin/i386/bash", 0x080475C0)   Err#2 ENOENT
    stat64("/home/dep/bin/bash", 0x080475C0)        Err#2 ENOENT
    stat64("/usr/bin/bash", 0x080475C0)             = 0
    stat64("/usr/bin/bash", 0x080475E0)             = 0
    open64("/tmp/foo/.bashrc", O_RDONLY)            Err#2 ENOENT
    open64("", O_RDONLY)                            Err#2 ENOENT
    open("/home/dep/.terminfo/x/xterm", O_RDONLY)   Err#2 ENOENT
    open("/usr/share/lib/terminfo//x/xterm", O_RDONLY) = 4
        Received signal #20, SIGWINCH [caught]
    stat64("/tmp/foo/.inputrc", 0x08046ED0)         Err#2 ENOENT
    stat64("/etc/inputrc", 0x08046ED0)              Err#2 ENOENT
        Received signal #20, SIGWINCH [caught]
    bash-3.2$

  Moreover, zsh, csh, and tcsh do not scan the PATH on startup
  either.  This is a ksh93-only phenomenon.
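
  For contrast, a lazy lookup in the spirit of the bash trace can be
  sketched as follows.  This is an illustration under assumptions, not
  code from bash or any other shell; lookup_lazy is a hypothetical
  name.  It probes PATH elements in order and stops at the first
  executable match, so later elements are never touched.

```shell
# lookup_lazy CMD PATHLIST   (hypothetical helper, for illustration only)
# Probe each PATH element in order and stop at the first executable match.
# Prints one "probe DIR" line per directory examined, then "found FILE".
# Returns 0 on success, 127 if no element contains CMD.
lookup_lazy() {
    _cmd=$1
    _rest=$2:
    while [ -n "$_rest" ]; do
        _dir=${_rest%%:*}           # first remaining element
        _rest=${_rest#*:}           # drop it from the list
        [ -n "$_dir" ] || _dir=.    # an empty element means the current directory
        echo "probe $_dir"
        if [ -f "$_dir/$_cmd" ] && [ -x "$_dir/$_cmd" ]; then
            echo "found $_dir/$_cmd"
            return 0
        fi
    done
    return 127
}

# Usage: lookup_lazy ls "$PATH"
```

  With local directories first in PATH, a lookup like this for a
  command in /usr/bin never reaches the /ws/onnv-tools entries.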

*** (#3 of 4): 2009-08-13 21:07:10 GMT+00:00 <User 1-5Q-4224>

Update from Roland:


1. The original ksh88i build from the AT&T sources behaves the same way as 
ksh93 version 's' and scans the PATH at startup. That's why I _guessed_ that 
someone had modified Solaris's ksh88 to behave differently (as a side effect, 
neither Solaris ksh88 nor the derived /usr/xpg4/bin/sh conforms to POSIX/SUS 
if they no longer check for this (see [2])).


2. The POSIX/SUS standard _requires_ that shells scan all elements of PATH when 
they try to find a command. This even happens for builtin commands when they 
are bound to a specific PATH, since such bound builtins are only allowed to be 
executed if there is a matching file in the filesystem.


3. The results of the PATH scan are allowed to be cached. That's why, in one 
of the next ksh93 versions, we're going to switch to |openat()| on the 
directories in PATH at the time when PATH is set/changed (but first we need to 
complete ksh93-integration update2). If there is a way to detect stuck NFS 
filesystems, we're going to add the matching code in that version.
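
One possible way to approximate stuck-filesystem detection from a script is 
to probe a directory under a deadline. The sketch below is only an 
assumption about how such a check could look, not the planned ksh93 
mechanism; run_with_deadline and the exit status 124 are hypothetical 
choices made for this example.

```shell
# run_with_deadline SECONDS COMMAND [ARG...]   (hypothetical helper)
# Run COMMAND in the background and give up after roughly SECONDS.
# Returns COMMAND's exit status, or 124 if the deadline passed first.
run_with_deadline() {
    _left=$1
    shift
    "$@" &
    _pid=$!
    # Poll once per second while the background command is still alive.
    while kill -0 "$_pid" 2>/dev/null; do
        if [ "$_left" -le 0 ]; then
            # Best effort only: a process blocked inside the kernel on a
            # hard NFS mount may not react to the signal, so do not wait
            # for it; just stop waiting and report the timeout.
            kill "$_pid" 2>/dev/null
            return 124
        fi
        sleep 1
        _left=$((_left - 1))
    done
    wait "$_pid"    # collect the real exit status
}

# Usage: run_with_deadline 2 test -d /ws/onnv-tools/onbld/bin ||
#            echo "directory missing or not answering"
```

The polling loop keeps the caller responsive even when the probe itself 
cannot be interrupted, at the cost of up to a second of latency per probe.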

*** (#4 of 4): 2009-09-17 10:54:08 GMT+00:00 <User 1-5Q-6276>


=== *Workaround* =============================================================

=== *Additional Details* =====================================================
        Targeted Release: 
        Commit To Fix In Build: 
        Fixed In Build: 
        Integrated In Build: 
        Verified In Build: 
  See Also: 
  Duplicate of: 
  Hooks:
        Hook1: 
        Hook2: 
        Hook3: 
        Hook4: 
        Hook5: 
        Hook6: 
  Program Management: New Defect
  Root Cause: 
  Fix Affects Documentation: No
  Fix Affects Localization: No

=== *History* ================================================================
        Date Submitted: 2007-11-16 18:04:11 GMT+00:00
        Submitted By: <User 1-5Q-12482>

        Status Changed    Date Updated                  Updated By
        3-Accepted        2008-08-20 22:57:39 GMT+00:00 <User 1-5Q-5151>
        2-Incomplete      2009-04-16 07:32:19 GMT+00:00 <User 1-1SURPB>
        3-Accepted        2009-04-16 20:14:59 GMT+00:00 <User 1-5Q-4224>


=== *Service Request* ========================================================
        Impact: Significant
        Functionality: Secondary
        Severity: 3
        Product Name: solaris
        Product Release: solaris_nevada
        Product Build: 
        Operating System: snv_77
        Hardware: ultrasparc
        Submitted Date: 2007-11-16 18:04:11 GMT+00:00


=== *Service Request* ========================================================
        Impact: Significant
        Functionality: Primary
        Severity: 2
        Product Name: solaris
        Product Release: solaris_nevada
        Product Build: snv_110
        Operating System: snv_110
        Hardware: generic
        Submitted Date: 2009-04-16 00:44:28 GMT+00:00


=== *Service Request* ========================================================
        Impact: Critical
        Functionality: Primary
        Severity: 1
        Product Name: solaris
        Product Release: solaris_nevada
        Product Build: snv_122
        Operating System: snv_122
        Hardware: generic
        Submitted Date: 2009-09-15 17:34:12 GMT+00:00


=== *Multiple Release (MR) Cluster* - 0 ======================================
