I got the thing to fail on my vmware box. Heres the stack trace. Doesn't look like the cache itself is hanging. The 4 runnable threads:
"Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable [0x00007fb720a1e000] java.lang.Thread.State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:236) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176) - locked <0x00000000f20403b0> (a java.io.InputStreamReader) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.BufferedReader.fill(BufferedReader.java:153) at java.io.BufferedReader.readLine(BufferedReader.java:316) - locked <0x00000000f20403b0> (a java.io.InputStreamReader) at java.io.BufferedReader.readLine(BufferedReader.java:379) at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129) "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable [0x00007fb720e36000] java.lang.Thread.State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:236) at java.io.BufferedInputStream.read1(BufferedInputStream.java:273) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176) - locked <0x00000000f203d858> (a java.io.InputStreamReader) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.BufferedReader.fill(BufferedReader.java:153) at java.io.BufferedReader.readLine(BufferedReader.java:316) - locked <0x00000000f203d858> (a java.io.InputStreamReader) at java.io.BufferedReader.readLine(BufferedReader.java:379) at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129) "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe runnable [0x00007fb720c34000] java.lang.Thread.State: RUNNABLE at java.lang.UNIXProcess.waitForProcessExit(Native Method) at java.lang.UNIXProcess.access$900(UNIXProcess.java:36) at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148) Looks like fileInputStream.readBytes() is blocking. On Sat, Oct 8, 2011 at 10:04 AM, Ted Yu <yuzhih...@gmail.com> wrote: > Scott: > Do you have time to write a script for analyzing output of Jenkins and put > it on HBASE-4480 ? > Here is some idea from Ramkrishna: > > All statements that has Running in it can be parsed to see if the every next > Running happens after one hop. > Like if the first Running happens to be in 11th line the next Running should > be in 13th. > If this breaks some where then that test is hanging. > This is just one idea. If we can figure out something better we can take it > up. > > Cheers > > On Sat, Oct 8, 2011 at 9:53 AM, Jesse Yates <jesse.k.ya...@gmail.com> wrote: > >> The script to do this was written in 4480. Just needs some +1s a >> - It works pretty well. >> >> We might want to also mod it to take in a file that is the output of a run >> and analyze that. >> >> - Jesse Yates >> >> Sent from my iPhone. >> >> On Oct 8, 2011, at 2:51 AM, Ted Yu <yuzhih...@gmail.com> wrote: >> >> > Parsing test output will do. >> > >> > >> > >> > On Oct 7, 2011, at 11:44 PM, Akash Ashok <thehellma...@gmail.com> wrote: >> > >> >> Hi Ted & Ram >> >> >> >> Just Figured out the hung test case both in >> >> >> >> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console >> >> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2304/console >> >> >> >> Running org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache >> >> Running org.apache.hadoop.hbase.io.hfile.TestFixedFileTrailer >> >> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.858 >> sec >> >> >> >> TestSlabCache is the culprit >> >> >> >> Just copied into noteped++ and searched for running and it highlighted >> it >> >> and it was easier to find :) >> >> >> >> And about the script. Is the idea to parse this output and figure out >> the >> >> hung test case or is there a plan to parse the surefire reports xml? >> >> >> >> Cheers, >> >> Akash A >> >> >> >> On Sat, Oct 8, 2011 at 11:13 AM, Ted Yu <yuzhih...@gmail.com> wrote: >> >> >> >>> Yeah we need such script. >> >>> I went over the tests in >> >>> >> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2303/console >> >>> and couldn't find out the hanging test. >> >>> >> >>> Cheers >> >>> >> >>> On Fri, Oct 7, 2011 at 10:33 PM, Ramakrishna S Vasudevan 00902313 < >> >>> ramakrish...@huawei.com> wrote: >> >>> >> >>>> Ted >> >>>> >> >>>> Once we were already discussing regarding some script to find out some >> >>> hung >> >>>> tests? >> >>>> >> >>>> Regards >> >>>> Ram >> >>>> >> >>>> >> >>>> ----- Original Message ----- >> >>>> From: Ted Yu <yuzhih...@gmail.com> >> >>>> Date: Saturday, October 8, 2011 10:58 am >> >>>> Subject: Re: Jenkins build is back to normal : HBase-TRUNK #2304 >> >>>> To: dev@hbase.apache.org >> >>>> >> >>>>> From >> >>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase- >> >>>>> TRUNK/2303/console,it wasn't obvious which test(s) hung. >> >>>>> But the following error clearly indicated there was some hanging Java >> >>>>> process: >> >>>>> >> >>>>> [ERROR] Failed to execute goal >> >>>>> org.apache.maven.plugins:maven-surefire-plugin:2.9:test >> (default-test) >> >>>>> on project hbase: Failure or timeout -> [Help >> >>>>> 1]org.apache.maven.lifecycle.LifecycleExecutionException: Failed to >> >>>>> execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test >> >>>>> (default-test) on project hbase: Failure or timeout >> >>>>> >> >>>>> Unluckily we don't have access to the build machine. >> >>>>> >> >>>>> On Fri, Oct 7, 2011 at 10:14 PM, Akash Ashok >> >>>>> <thehellma...@gmail.com> wrote: >> >>>>> >> >>>>>> Oh cool. Build is back to normal. Could someone tell me what the >> >>>>> issue was. >> >>>>>> Why was it failing even though there were no failures ? >> >>>>>> >> >>>>>> On Sat, Oct 8, 2011 at 4:45 AM, Apache Jenkins Server < >> >>>>>> jenk...@builds.apache.org> wrote: >> >>>>>> >> >>>>>>> See <https://builds.apache.org/job/HBase-TRUNK/2304/> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>> >> >>>>> >> >>>> >> >>> >> >
2011-10-08 10:12:58 Full thread dump OpenJDK 64-Bit Server VM (20.0-b11 mixed mode): "Attach Listener" daemon prio=10 tid=0x0000000001c48000 nid=0x4cac waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Thread-5" prio=10 tid=0x00007fb714117800 nid=0x4c03 runnable [0x00007fb720a1e000] java.lang.Thread.State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:236) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176) - locked <0x00000000f20403b0> (a java.io.InputStreamReader) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.BufferedReader.fill(BufferedReader.java:153) at java.io.BufferedReader.readLine(BufferedReader.java:316) - locked <0x00000000f20403b0> (a java.io.InputStreamReader) at java.io.BufferedReader.readLine(BufferedReader.java:379) at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129) "Thread-4" prio=10 tid=0x00007fb714114800 nid=0x4c01 runnable [0x00007fb720e36000] java.lang.Thread.State: RUNNABLE at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:236) at java.io.BufferedInputStream.read1(BufferedInputStream.java:273) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0x00000000f25c6ce8> (a java.io.BufferedInputStream) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176) - locked <0x00000000f203d858> (a java.io.InputStreamReader) at java.io.InputStreamReader.read(InputStreamReader.java:184) at java.io.BufferedReader.fill(BufferedReader.java:153) at java.io.BufferedReader.readLine(BufferedReader.java:316) - locked <0x00000000f203d858> (a java.io.InputStreamReader) at java.io.BufferedReader.readLine(BufferedReader.java:379) at org.codehaus.plexus.util.cli.StreamPumper.run(StreamPumper.java:129) "process reaper" daemon prio=10 tid=0x00007fb71401e800 nid=0x4bfe runnable [0x00007fb720c34000] java.lang.Thread.State: RUNNABLE at java.lang.UNIXProcess.waitForProcessExit(Native Method) at java.lang.UNIXProcess.access$900(UNIXProcess.java:36) at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:148) "ThreadedStreamConsumer" prio=10 tid=0x00007fb714100000 nid=0x4bfd waiting on condition [0x00007fb720d35000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000f2021410> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:386) at org.apache.maven.surefire.util.internal.Java15BlockingQueue.take(Java15BlockingQueue.java:40) at org.apache.maven.plugin.surefire.booterclient.output.ThreadedStreamConsumer$Pumper.run(ThreadedStreamConsumer.java:68) at java.lang.Thread.run(Thread.java:679) "pool-1-thread-5" prio=10 tid=0x0000000001661000 nid=0x4beb in Object.wait() [0x00007fb7211b2000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at java.lang.Object.wait(Object.java:502) at hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:316) - locked <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:994) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1054) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575) at java.lang.Thread.run(Thread.java:679) "pool-1-thread-4" prio=10 tid=0x00007fb714004000 nid=0x4bea in Object.wait() [0x00007fb7212b3000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at java.lang.Object.wait(Object.java:502) at hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:316) - locked <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:994) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1054) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575) at java.lang.Thread.run(Thread.java:679) "pool-1-thread-3" prio=10 tid=0x00007fb714002800 nid=0x4be9 in Object.wait() [0x00007fb7213b4000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at java.lang.Object.wait(Object.java:502) at hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:316) - locked <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:994) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1054) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575) at java.lang.Thread.run(Thread.java:679) "pool-1-thread-2" prio=10 tid=0x0000000001754800 nid=0x4be8 in Object.wait() [0x00007fb7214b5000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at java.lang.Object.wait(Object.java:502) at hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:316) - locked <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:994) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1054) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575) at java.lang.Thread.run(Thread.java:679) "pool-1-thread-1" prio=10 tid=0x0000000001340000 nid=0x4be7 in Object.wait() [0x00007fb7215cd000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at java.lang.Object.wait(Object.java:502) at hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:316) - locked <0x00000000c2cf78b8> (a hidden.edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:994) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1054) at hidden.edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:575) at java.lang.Thread.run(Thread.java:679) "Low Memory Detector" daemon prio=10 tid=0x00007fb71c001800 nid=0x4be1 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread1" daemon prio=10 tid=0x0000000000c36000 nid=0x4be0 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread0" daemon prio=10 tid=0x0000000000c33000 nid=0x4bdf waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" daemon prio=10 tid=0x0000000000c31800 nid=0x4bde runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Finalizer" daemon prio=10 tid=0x0000000000c12800 nid=0x4bdd in Object.wait() [0x00007fb7222af000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c234c128> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133) - locked <0x00000000c234c128> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177) "Reference Handler" daemon prio=10 tid=0x0000000000c0b000 nid=0x4bdc in Object.wait() [0x00007fb7223b0000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000c234c0c0> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133) - locked <0x00000000c234c0c0> (a java.lang.ref.Reference$Lock) "main" prio=10 tid=0x0000000000b9c800 nid=0x4bd6 waiting on condition [0x00007fb729580000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.codehaus.plexus.util.cli.CommandLineUtils.executeCommandLine(CommandLineUtils.java:160) at org.codehaus.plexus.util.cli.CommandLineUtils.executeCommandLine(CommandLineUtils.java:88) at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:191) at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:128) at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:102) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:199) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAllProviders(AbstractSurefireMojo.java:176) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:134) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:97) at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:535) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348) at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138) at org.apache.maven.cli.MavenCli.main(MavenCli.java:362) at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315) at org.codehaus.classworlds.Launcher.launch(Launcher.java:255) at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430) at org.codehaus.classworlds.Launcher.main(Launcher.java:375) "VM Thread" prio=10 tid=0x0000000000c04000 nid=0x4bdb runnable "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000000ba7800 nid=0x4bd7 runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000000ba9800 nid=0x4bd8 runnable "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000000bab800 nid=0x4bd9 runnable "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000000bad000 nid=0x4bda runnable "VM Periodic Task Thread" prio=10 tid=0x00007fb71c004800 nid=0x4be2 waiting on condition JNI global references: 1538