Top showed only one java process. I'll try this again once I get back.
On Sun, Oct 9, 2011 at 9:24 PM, Todd Lipcon <[email protected]> wrote:
> That jstack just looks like the trace of the maven process - there
> should be another JVM which is actually running the tests.
>
> -Todd
>
> On Sat, Oct 8, 2011 at 10:14 AM, Li Pi <[email protected]> wrote:
>> I got the thing to fail on my vmware box. Here's the stack trace.
>>
>> Doesn't look like the cache itself is hanging. The 4 runnable threads:
>>
>> "Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac
>> waiting on condition [0x0000000000000000]
>>    java.lang.Thread.State: RUNNABLE
>>
>> "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable
>> [0x00007fb720a1e000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.io.FileInputStream.readBytes(Native Method)
>>         at java.io.FileInputStream.read(FileInputStream.java:236)
>>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>>         - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>         at java.io.BufferedReader.fill(BufferedReader.java:153)
>>         at java.io.BufferedReader.readLine(BufferedReader.java:316)
>>         - locked <0x00000000f20403b0> (a java.io.InputStreamReader)
>>         at java.io.BufferedReader.readLine(BufferedReader.java:379)
>>         at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>>
>> "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable
>> [0x00007fb720e36000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.io.FileInputStream.readBytes(Native Method)
>>         at java.io.FileInputStream.read(FileInputStream.java:236)
>>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>>         - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream)
>>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
>>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
>>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
>>         - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>         at java.io.BufferedReader.fill(BufferedReader.java:153)
>>         at java.io.BufferedReader.readLine(BufferedReader.java:316)
>>         - locked <0x00000000f203d858> (a java.io.InputStreamReader)
>>         at java.io.BufferedReader.readLine(BufferedReader.java:379)
>>         at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129)
>>
>> "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe
>> runnable [0x00007fb720c34000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.lang.UNIXProcess.waitForProcessExit(Native Method)
>>         at java.lang.UNIXProcess.access$900(UNIXProcess.java:36)
>>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148)
>>
>>
>> Looks like FileInputStream.readBytes() is blocking.
>>
>>
>> On Sat, Oct 8, 2011 at 10:04 AM, Ted Yu <[email protected]> wrote:
>>> Scott:
>>> Do you have time to write a script for analyzing output of Jenkins and put
>>> it on HBASE-4480 ?
>>> Here is some idea from Ramkrishna:
>>>
>>> All statements that have Running in them can be parsed to see if every next
>>> Running happens after one hop.
>>> Like if the first Running happens to be in 11th line the next Running should
>>> be in 13th.
>>> If this breaks somewhere then that test is hanging.
>>> This is just one idea. If we can figure out something better we can take it
>>> up.
>>>
>>> Cheers
>>>
>>> On Sat, Oct 8, 2011 at 9:53 AM, Jesse Yates <[email protected]> wrote:
>>>
>>>> The script to do this was written in 4480. Just needs some +1s a
>>>> - It works pretty well.
>>>>
>>>> We might want to also mod it to take in a file that is the output of a run
>>>> and analyze that.
>>>>
>>>> - Jesse Yates
>>>>
>>>> Sent from my iPhone.
>>>>
>>>> On Oct 8, 2011, at 2:51 AM, Ted Yu <[email protected]> wrote:
>>>>
>>>> > Parsing test output will do.
>>>> >
>>>> >
>>>> >
>>>> > On Oct 7, 2011, at 11:44 PM, Akash Ashok <[email protected]> wrote:
>>>> >
>>>> >> Hi Ted & Ram
>>>> >>
>>>> >> Just figured out the hung test case both in
>>>> >>
>>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2304/console
>>>> >>
>>>> >> Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache
>>>> >> Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer
>>>> >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 sec
>>>> >>
>>>> >> TestSlabCache is the culprit
>>>> >>
>>>> >> Just copied into Notepad++ and searched for running and it highlighted it
>>>> >> and it was easier to find :)
>>>> >>
>>>> >> And about the script. Is the idea to parse this output and figure out the
>>>> >> hung test case or is there a plan to parse the surefire reports xml?
>>>> >>
>>>> >> Cheers,
>>>> >> Akash A
>>>> >>
>>>> >> On Sat, Oct 8, 2011 at 11:13 AM, Ted Yu <[email protected]> wrote:
>>>> >>
>>>> >>> Yeah we need such script.
>>>> >>> I went over the tests in
>>>> >>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console
>>>> >>> and couldn't find out the hanging test.
>>>> >>>
>>>> >>> Cheers
>>>> >>>
>>>> >>> On Fri, Oct 7, 2011 at 10:33 PM, Ramakrishna S Vasudevan 00902313 <
>>>> >>> [email protected]> wrote:
>>>> >>>
>>>> >>>> Ted
>>>> >>>>
>>>> >>>> Once we were already discussing regarding some script to find out some
>>>> >>>> hung tests?
>>>> >>>>
>>>> >>>> Regards
>>>> >>>> Ram
>>>> >>>>
>>>> >>>>
>>>> >>>> ----- Original Message -----
>>>> >>>> From: Ted Yu <[email protected]>
>>>> >>>> Date: Saturday, October 8, 2011 10:58 am
>>>> >>>> Subject: Re: Jenkins build is back to normal : HBase-TRUNK #2304
>>>> >>>> To: [email protected]
>>>> >>>>
>>>> >>>>> From
>>>> >>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console,
>>>> >>>>> it wasn't obvious which test(s) hung.
>>>> >>>>> But the following error clearly indicated there was some hanging Java
>>>> >>>>> process:
>>>> >>>>>
>>>> >>>>> [ERROR] Failed to execute goal
>>>> >>>>> org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test)
>>>> >>>>> on project hbase: Failure or timeout -> [Help 1]
>>>> >>>>> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to
>>>> >>>>> execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test
>>>> >>>>> (default-test) on project hbase: Failure or timeout
>>>> >>>>>
>>>> >>>>> Unluckily we don't have access to the build machine.
>>>> >>>>>
>>>> >>>>> On Fri, Oct 7, 2011 at 10:14 PM, Akash Ashok
>>>> >>>>> <[email protected]> wrote:
>>>> >>>>>
>>>> >>>>>> Oh cool. Build is back to normal. Could someone tell me what the
>>>> >>>>>> issue was.
>>>> >>>>>> Why was it failing even though there were no failures ?
>>>> >>>>>>
>>>> >>>>>> On Sat, Oct 8, 2011 at 4:45 AM, Apache Jenkins Server <
>>>> >>>>>> [email protected]> wrote:
>>>> >>>>>>
>>>> >>>>>>> See <https://builds.apache.org/job/HBase-TRUNK/2304/>
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
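
For reference, the idea from Ramkrishna quoted above (every "Running" line should be followed by its "Tests run:" summary before the next "Running" line shows up) could be sketched roughly like this. This is only an illustration and not the script attached to HBASE-4480. The function name find_hung_tests is made up here, it matches "Running"/"Tests run:" pairs instead of counting line offsets, and it assumes the test classes run one at a time so the console output is not interleaved.

```python
import re

# Surefire prints "Running <class>" when a test class starts and a
# "Tests run: ..." summary line when that class finishes.
RUNNING = re.compile(r"^Running (\S+)")
RESULT = re.compile(r"^Tests run: \d+")


def find_hung_tests(lines):
    """Return test classes that started but never reported a result.

    Hypothetical helper: assumes sequential (non-interleaved) test output.
    """
    hung = []
    pending = None  # class that has started but not yet reported
    for raw in lines:
        line = raw.strip()
        started = RUNNING.match(line)
        if started:
            # A new class started while the previous one never finished.
            if pending is not None:
                hung.append(pending)
            pending = started.group(1)
        elif RESULT.match(line) and pending is not None:
            pending = None
    # The log ended while a class was still running.
    if pending is not None:
        hung.append(pending)
    return hung


if __name__ == "__main__":
    import sys

    # Usage (assumed): python find_hung.py < console.txt
    for name in find_hung_tests(sys.stdin):
        print("possibly hung:", name)
```

Fed the three console lines Akash pasted, this reports TestSlabCache, since its "Running" line is followed directly by another "Running" line. If the build runs test classes in parallel the pairing no longer holds and the surefire report XML would probably be a better input.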
