That jstack just looks like the trace of the maven process - there should be another JVM which is actually running the tests.
-Todd

On Sat, Oct 8, 2011 at 10:14 AM, Li Pi <l...@ucsd.edu> wrote:
> I got the thing to fail on my vmware box. Heres the stack trace.
>
> Doesn't look like the cache itself is hanging. The 4 runnable threads:
>
> "Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac
> waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable
> [0x00007fb720a1e000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.FileInputStream.readBytes(Native Method)
>         at java.io.FileInputStream.read(FileInputStream.java:236)
>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>         - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>         at java.io.BufferedReader.fill(BufferedReader.java:153)
>         at java.io.BufferedReader.readLine(BufferedReader.java:316)
>         - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>         at java.io.BufferedReader.readLine(BufferedReader.java:379)
>         at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>
> "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable
> [0x00007fb720e36000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.FileInputStream.readBytes(Native Method)
>         at java.io.FileInputStream.read(FileInputStream.java:236)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>         - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream)
>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>         - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>         at java.io.BufferedReader.fill(BufferedReader.java:153)
>         at java.io.BufferedReader.readLine(BufferedReader.java:316)
>         - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>         at java.io.BufferedReader.readLine(BufferedReader.java:379)
>         at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>
> "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe
> runnable [0x00007fb720c34000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.UNIXProcess.waitForProcessExit(Native Method)
>         at java.lang.UNIXProcess.access$900(UNIXProcess.java:36)
>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148)
>
>
> Looks like fileInputStream.readBytes() is blocking.
>
>
> On Sat, Oct 8, 2011 at 10:04 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>> Scott:
>> Do you have time to write a script for analyzing output of Jenkins and put
>> it on HBASE-4480 ?
>> Here is some idea from Ramkrishna:
>>
>> All statements that has Running in it can be parsed to see if the every next
>> Running happens after one hop.
>> Like if the first Running happens to be in 11th line the next Running should
>> be in 13th.
>> If this breaks some where then that test is hanging.
>> This is just one idea. If we can figure out something better we can take it
>> up.
>>
>> Cheers
>>
>> On Sat, Oct 8, 2011 at 9:53 AM, Jesse Yates <jesse.k.ya...@gmail.com> wrote:
>>
>>> The script to do this was written in 4480. Just needs some +1s a
>>> - It works pretty well.
>>>
>>> We might want to also mod it to take in a file that is the output of a run
>>> and analyze that.
>>>
>>> - Jesse Yates
>>>
>>> Sent from my iPhone.
>>>
>>> On Oct 8, 2011, at 2:51 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>> > Parsing test output will do.
>>> >
>>> >
>>> >
>>> > On Oct 7, 2011, at 11:44 PM, Akash Ashok <thehellma...@gmail.com> wrote:
>>> >
>>> >> Hi Ted & Ram
>>> >>
>>> >> Just Figured out the hung test case both in
>>> >>
>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2304/console
>>> >>
>>> >> Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache
>>> >> Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
>>> >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 sec
>>> >>
>>> >> TestSlabCache is the culprit
>>> >>
>>> >> Just copied into noteped++ and searched for running and it highlighted it
>>> >> and it was easier to find :)
>>> >>
>>> >> And about the script. Is the idea to parse this output and figure out the
>>> >> hung test case or is there a plan to parse the surefire reports xml?
>>> >>
>>> >> Cheers,
>>> >> Akash A
>>> >>
>>> >> On Sat, Oct 8, 2011 at 11:13 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> >>
>>> >>> Yeah we need such script.
>>> >>> I went over the tests in
>>> >>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>> >>> and couldn't find out the hanging test.
>>> >>>
>>> >>> Cheers
>>> >>>
>>> >>> On Fri, Oct 7, 2011 at 10:33 PM, Ramakrishna S Vasudevan 00902313 <
>>> >>> ramakrish...@huawei.com> wrote:
>>> >>>
>>> >>>> Ted
>>> >>>>
>>> >>>> Once we were already discussing regarding some script to find out some hung
>>> >>>> tests?
>>> >>>>
>>> >>>> Regards
>>> >>>> Ram
>>> >>>>
>>> >>>>
>>> >>>> ----- Original Message -----
>>> >>>> From: Ted Yu <yuzhih...@gmail.com>
>>> >>>> Date: Saturday, October 8, 2011 10:58 am
>>> >>>> Subject: Re: Jenkins build is back to normal : HBase-TRUNK #2304
>>> >>>> To: dev@hbase.apache.org
>>> >>>>
>>> >>>>> From
>>> >>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console,
>>> >>>>> it wasn't obvious which test(s) hung.
>>> >>>>> But the following error clearly indicated there was some hanging Java
>>> >>>>> process:
>>> >>>>>
>>> >>>>> [ERROR] Failed to execute goal
>>> >>>>> org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test)
>>> >>>>> on project hbase: Failure or timeout -> [Help 1]
>>> >>>>> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to
>>> >>>>> execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test
>>> >>>>> (default-test) on project hbase: Failure or timeout
>>> >>>>>
>>> >>>>> Unluckily we don't have access to the build machine.
>>> >>>>>
>>> >>>>> On Fri, Oct 7, 2011 at 10:14 PM, Akash Ashok
>>> >>>>> <thehellma...@gmail.com> wrote:
>>> >>>>>
>>> >>>>>> Oh cool. Build is back to normal. Could someone tell me what the issue was.
>>> >>>>>> Why was it failing even though there were no failures ?
>>> >>>>>>
>>> >>>>>> On Sat, Oct 8, 2011 at 4:45 AM, Apache Jenkins Server <
>>> >>>>>> jenk...@builds.apache.org> wrote:
>>> >>>>>>
>>> >>>>>>> See <https://builds.apache.org/job/HBase-TRUNK/2304/>

--
Todd Lipcon
Software Engineer, Cloudera
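
The "Running" heuristic discussed in this thread (a "Running <class>" line should be followed by its own "Tests run:" line before the next "Running" appears) can be sketched roughly as below. This is only an illustration, not the script attached to HBASE-4480: the function name find_unfinished_tests and both regular expressions are assumptions made for this example, and it assumes the tests run one at a time so that console lines from different tests do not interleave.

```python
import re

# Assumed shapes of the two surefire console lines quoted in the thread.
RUNNING = re.compile(r"^Running\s+(\S+)\s*$")
RESULT = re.compile(r"^Tests run:\s*\d+")


def find_unfinished_tests(lines):
    """Return test classes whose 'Running' line never got a 'Tests run:' line.

    Hypothetical helper: a class is reported when another 'Running' line (or
    the end of the output) arrives before its result line does.
    """
    unfinished = []
    pending = None  # class named by the most recent unmatched 'Running' line
    for raw in lines:
        line = raw.strip()
        match = RUNNING.match(line)
        if match:
            if pending is not None:
                # The previous test never reported a result: likely hung.
                unfinished.append(pending)
            pending = match.group(1)
        elif RESULT.match(line) and pending is not None:
            # Result line closes the currently pending test.
            pending = None
    if pending is not None:
        # Output ended while a test was still waiting for its result line.
        unfinished.append(pending)
    return unfinished


# The three console lines quoted earlier in the thread.
console = [
    "Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache",
    "Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer",
    "Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 sec",
]
print(find_unfinished_tests(console))
```

On the three lines quoted above this reports only TestSlabCache, which matches what was found by hand. To analyze a saved console log instead, the same function could be given an open file object, since it only iterates over lines.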