Now I suspect ptrace. Our kernel version is 2.6.32-279. Maybe it doesn't resume the threads correctly once the tool is done with the process. Could it be related to http://kernel.opensuse.org/cgit/kernel/commit/?h=openSUSE-13.1&id=d1f26676dad578a65c94782f0c2bd00b7aa68f1b ?
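To make the before/after check from the report quoted below repeatable, here is a minimal sketch. The helper names is_stopped and proc_stat are hypothetical, and the SIGCONT step in the usage comment is only my assumption about what might resume a process left in the stopped state; nothing in this thread confirms that it works on our kernel.

```shell
# Sketch: classify a ps STAT value such as "Sl" or "Tl".
# The first letter is the process state. "T" means stopped; newer
# kernels (2.6.33 onward) report a tracing stop as lowercase "t".
is_stopped() {
    case "$1" in
        T*|t*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: print the STAT column for one pid (procps ps).
proc_stat() {
    ps -o stat= -p "$1" | tr -d ' '
}

# Usage (36663 is the RegionServer pid from the report):
#   if is_stopped "$(proc_stat 36663)"; then
#       echo "still stopped after jinfo"
#       kill -CONT 36663   # assumption: SIGCONT may resume it
#   fi
```

Running the check once before jinfo and once after gives the same "Sl" versus "Tl" comparison as the ps output in the report, without grepping the full command line.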
On Tue, Sep 2, 2014 at 8:03 PM, tobe <[email protected]> wrote:
> Just like what @mikael said, running jstack -F has the same behaviour
> while jstack doesn't. But our processes have been suspended for several
> days and it's quite abnormal. I think there's something preventing the
> processes from recovering. Is it related to our running environment or
> jdk1.6?
>
>
> On Tue, Sep 2, 2014 at 6:05 PM, tobe <[email protected]> wrote:
>
>> Hi @martijn. Do you mean you can run jmap and jinfo on a Java process
>> which has run for over 25 days? Have you checked the status of that
>> process? Our 1.6 JVMs were suspended but had not exited.
>>
>> If it's an issue in 1.6, can anyone help to find that issue and
>> patch it?
>>
>>
>> On Tue, Sep 2, 2014 at 5:38 PM, tobe <[email protected]> wrote:
>>
>>> Thanks @mikael for replying. But I can see the complete message "Server
>>> compiler detected" and expect the JVM to continue. It's weird that this
>>> doesn't happen when running jinfo on new processes.
>>>
>>>
>>>
>>> On Tue, Sep 2, 2014 at 5:28 PM, Staffan Larsen <[email protected]> wrote:
>>>
>>>>
>>>> On 2 sep 2014, at 11:15, Mikael Gerdin <[email protected]> wrote:
>>>>
>>>> > Hi,
>>>> >
>>>> > This is the expected behavior for jmap and jinfo. If you call jstack
>>>> > with the "-F" flag you will see the same behavior.
>>>> >
>>>> > The reason for this is that jmap, jinfo and jstack -F all attach to
>>>> > your target JVM as a debugger and read the memory from the process.
>>>> > That needs to be done when the target process is in a frozen state.
>>>>
>>>> But when jinfo/jmap/jstack is done with the process it should continue
>>>> execution.
>>>>
>>>> Is this reproducible with JDK 8?
>>>>
>>>> /Staffan
>>>>
>>>>
>>>> >
>>>> > /Mikael
>>>> >
>>>> > On 2014-09-02 11:08, tobe wrote:
>>>> >> When I run jinfo or jmap on any Java process, it will "suspend" the
>>>> >> Java process. It's 100% reproducible for the long running processes.
>>>> >>
>>>> >> Here're the detailed steps:
>>>> >>
>>>> >> 1. Pick a Java process which has been running for over 25 days (it's
>>>> >>    weird because this doesn't work for new processes).
>>>> >> 2. Run ps to check the state of the process; it should be "Sl", which
>>>> >>    is expected.
>>>> >> 3. Run jinfo or jmap on this process (BTW, jstack doesn't have this
>>>> >>    issue).
>>>> >> 4. Run ps to check the state of the process. This time it changes to
>>>> >>    "Tl", which means STOPPED, and the process doesn't respond to any
>>>> >>    requests.
>>>> >>
>>>> >> Here's the output of our process:
>>>> >>
>>>> >> [work@hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
>>>> >> work 36663 0.1 1.7 24157828 1150820 ? Sl Aug06 72:54
>>>> >> /opt/soft/jdk/bin/java -cp
>>>> >> /home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
>>>> >> -Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
>>>> >> -Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
>>>> >> -Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
>>>> >> -XX:MaxPermSize=512m
>>>> >> -Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
>>>> >> -Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
>>>> >> -XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
>>>> >> -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -verbose:gc
>>>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
>>>> >> -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=75
>>>> >> -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
>>>> >> -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
>>>> >> -XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
>>>> >> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m
>>>> >> -XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
>>>> >> -XX:+PrintPromotionFailure -XX:ConcGCThreads=16 -XX:ParallelGCThreads=16
>>>> >> -XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
>>>> >> -XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
>>>> >> -XX:MonitorBound=16384 -XX:-UseBiasedLocking -XX:MaxTenuringThreshold=3
>>>> >> -Dproc_regionserver
>>>> >> -Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
>>>> >> -Djava.net.preferIPv4Stack=true
>>>> >> -Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
>>>> >> -Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
>>>> >> -Dhbase.policy.file=hbase-policy.xml
>>>> >> -Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
>>>> >> -Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
>>>> >> -Dhbase.id.str=work org.apache.hadoop.hbase.regionserver.HRegionServer start
>>>> >> [work@hadoop ~]$ jinfo 36663 > tobe.jinfo
>>>> >> Attaching to process ID 36663, please wait...
>>>> >> Debugger attached successfully.
>>>> >> Server compiler detected.
>>>> >> JVM version is 20.12-b01
>>>> >> [work@hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
>>>> >> work 36663 0.1 1.7 24157828 1151008 ? Tl Aug06 72:54
>>>> >> /opt/soft/jdk/bin/java -cp
>>>> >> /home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
>>>> >> -Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
>>>> >> -Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
>>>> >> -Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
>>>> >> -XX:MaxPermSize=512m
>>>> >> -Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
>>>> >> -Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
>>>> >> -XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
>>>> >> -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -verbose:gc
>>>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
>>>> >> -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=75
>>>> >> -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
>>>> >> -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
>>>> >> -XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
>>>> >> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m
>>>> >> -XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
>>>> >> -XX:+PrintPromotionFailure -XX:ConcGCThreads=16 -XX:ParallelGCThreads=16
>>>> >> -XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
>>>> >> -XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
>>>> >> -XX:MonitorBound=16384 -XX:-UseBiasedLocking -XX:MaxTenuringThreshold=3
>>>> >> -Dproc_regionserver
>>>> >> -Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
>>>> >> -Djava.net.preferIPv4Stack=true
>>>> >> -Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
>>>> >> -Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
>>>> >> -Dhbase.policy.file=hbase-policy.xml
>>>> >> -Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
>>>> >> -Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
>>>> >> -Dhbase.id.str=work org.apache.hadoop.hbase.regionserver.HRegionServer start
>>>> >>
>>>> >>
>>>> >> I hope some JVM experts here could help.
>>>> >>
>>>> >> $ java -version
>>>> >> java version "1.6.0_37"
>>>> >> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
>>>> >> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>>>> >>
>>>>
>>>>
>>>
>>
>
