I managed to reproduce the stuck process problem on our test system (E280) now as well.
The process that gets stuck is the JRunProxyServer:
/opt/CSCOpx/bin/cwjava -cw /opt/CSCOpx -Xmx196m -cp /opt/CSCOpx/objects/jrun/li

The machine is running ipfilter 3.4.32, compiled with gcc 3.2.1 and with the
Solaris patch applied.

ipf -V:

# ipf -V
ipf: IP Filter: v3.4.32 (496)
Kernel: IP Filter: v3.4.32              
Running: yes
Log Flags: 0 = none set
Default: pass all, Logging: available
Active list: 1


uname -a:

SunOS hagu01 5.8 Generic_108528-11 sun4u sparc SUNW,Sun-Fire-280R


ipfstat:

# ipfstat
dropped packets:        in 0    out 0
non-data packets:       in 0    out 0
no-data packets:        in 0    out 0
non-ip packets:         in 0    out 0
   bad packets:         in 0    out 0
copied messages:        in 0    out 0
 IPv6 packets:          in 0 out 0
 input packets:         blocked 68817 passed 3133469 nomatch 952 counted 0 short 0
output packets:         blocked 3197 passed 23557014 nomatch 234 counted 0 short 0
 input packets logged:  blocked 0 passed 229
output packets logged:  blocked 0 passed 0
 packets logged:        input 0 output 0
 log failures:          input 0 output 0
fragment state(in):     kept 0  lost 0
fragment state(out):    kept 0  lost 0
packet state(in):       kept 0  lost 0
packet state(out):      kept 10366      lost 3712
ICMP replies:   0       TCP RSTs sent:  0
Invalid source(in):     0
Result cache hits(in):  233885  (out):  3484
IN Pullups succeeded:   0       failed: 0
OUT Pullups succeeded:  221     failed: 0
Fastroute successes:    0       failures:       0
TCP cksum fails(in):    0       (out):  0
Packet log flags set: (0)
        none

mdb stack traces:

procname=YOURPROCNAME
procaddr=$(ps -o addr -p `pgrep $procname` | grep -v ADDR)
echo "$procaddr::walk thread | ::findstack" | mdb -k

The full output is attached. These are the entries that didn't show stop:

stack pointer for thread 3000c28ba80: 2a101e2b0d1
[ 000002a101e2b0d1 cv_wait+0x38() ]
  000002a101e2b181 holdlwps+0xb8()
  000002a101e2b231 cfork+0x34()
  000002a101e2b2f1 syscall_trap32+0xa8()

stack pointer for thread 30006a7e540: 2a101c30f01
[ 000002a101c30f01 cv_wait+0x38() ]
  000002a101c30fb1 sowaitack+0x60()
  000002a101c31071 sowaitprim+0xc()
  000002a101c31131 sosetsockopt+0x188()
  000002a101c31231 setsockopt+0xbc()
  000002a101c312f1 syscall_trap32+0xa8()
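
Picking the entries without a stop frame out of the full output can be
scripted. The following is only a minimal sketch: filter_nonstopped is a
hypothetical helper name, and it assumes that every entry starts with a
"stack pointer for thread" line, that a stopped thread shows a frame named
stop, and that a POSIX awk is available.

```shell
# Hypothetical helper: print only the ::findstack entries that contain
# no "stop" frame. Reads the files given as arguments, or stdin.
filter_nonstopped() {
    awk '
        # A new entry begins: emit the previous one unless it was stopped.
        /^stack pointer for thread/ {
            if (entry != "" && !stopped) printf "%s\n", entry
            entry = ""; stopped = 0
        }
        # A frame such as "stop+0x..." or "stop(" marks the entry as stopped.
        /[ \t]stop[+(]/ { stopped = 1 }
        # Collect the non-blank lines of the current entry.
        NF { entry = entry $0 "\n" }
        END { if (entry != "" && !stopped) printf "%s\n", entry }
    ' "$@"
}
```

It can be placed at the end of the pipeline used above, for example:
echo "$procaddr::walk thread | ::findstack" | mdb -k | filter_nonstopped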

Niels de Carpentier



-----Original Message-----
From: Frank Hofmann - European Solaris CTE-Sustaining Engineering
[mailto:[EMAIL PROTECTED]
Sent: Wednesday 28 May 2003 16:45
To: De Carpentier, Niels N SITI-ITDTS
Subject: RE: ipfilter causes unkillable processes ?



> Truss doesn't report any data for this process.

Try:

procname=YOURPROCNAME
procaddr=$(ps -o addr -p `pgrep $procname` | grep -v ADDR)
echo "$procaddr::walk thread | ::findstack" | mdb -k

Replace YOURPROCNAME with the name of your process.

This gives kernel stack traces of the threads of your process, so you'll
see where they are stuck. The output is somewhat unreadable compared
to "nice" tools like act, but at least it's all built in; no outside
packages are needed.
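
If pgrep matches more than one process, the one-liner above hands several
PIDs to ps at once. A slightly more defensive variant might look like the
sketch below. The function names addrs_to_dcmds and kstacks are hypothetical,
and the sketch assumes a shell with $(...) support (such as ksh) and that
mdb -k accepts one dcmd pipeline per input line.

```shell
# Turn proc addresses (one per line on stdin) into mdb dcmd pipelines,
# skipping blank lines and stripping surrounding whitespace.
addrs_to_dcmds() {
    while read -r addr; do
        if [ -n "$addr" ]; then
            printf '%s::walk thread | ::findstack\n' "$addr"
        fi
    done
}

# Hypothetical wrapper: kernel stacks for every process matching a name.
kstacks() {
    procname=$1
    pids=$(pgrep "$procname") || {
        echo "no process matches $procname" >&2
        return 1
    }
    for pid in $pids; do
        # Same ps call as above, one PID at a time, header line removed.
        ps -o addr -p "$pid" | grep -v ADDR
    done | addrs_to_dcmds | mdb -k
}
```

Usage would be: kstacks YOURPROCNAME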


Bye,
FrankH.


> 
> Niels
> 
> 
> 
> -----Original Message-----
> From: Jefferson Ogata [mailto:[EMAIL PROTECTED]
> Sent: Wednesday 28 May 2003 15:47
> To: [EMAIL PROTECTED]
> Subject: Re: ipfilter causes unkillable processes ?
> 
> 
> De Carpentier, Niels N SITI-ITDTS wrote:
> > I've just installed ipfilter on 2 Ciscoworks machines (running on an
> > E220 with Solaris 8), and it seems this has caused one of the java
> > processes to fail (the exact same process on both machines).
> > The process has a lot of connections in CLOSE_WAIT state, and cannot
> > be killed with kill -9. The process is running in the S state.
> > 
> > I've tried the following to get the process running again:
> > 
> > - put an empty ruleset in, and clear the state table
> > - remove ipfilter from the machine (including kernel module)
> > 
> > This didn't work, so it looks like I need a reboot to get rid of the
> > process.
> > I've also got ipfilter running on another CiscoWorks box (an E280 with
> > Solaris 8), which doesn't seem to have this problem.
> > 
> > Is this a known problem ?
> > If not, is there a way to debug this ?
> > (The logs show nothing unusual)
> 
> Try using truss to attach to the process to see what syscall it's hung in. 
> That may be useful information...
> 
> # truss -v all -p xxx
> 
> -- 
> Jefferson Ogata <[EMAIL PROTECTED]>
> NOAA Computer Incident Response Team (N-CIRT) <[EMAIL PROTECTED]>
> 
> 


Attachment: stacktrace.ZIP