[ https://issues.apache.org/jira/browse/HBASE-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844754#comment-13844754 ]
Elliott Clark edited comment on HBASE-10103 at 12/10/13 10:29 PM:
------------------------------------------------------------------
I'm seeing TestNodeHealthCheckChore hang (or run slower than the timeout) on
trunk on my Jenkins box (twice in a row). Running just
TestNodeHealthCheckChore locally passes, though.
Here's the stack trace that I saw one of the two times:
{code}
"pool-1-thread-1" prio=10 tid=0x00007f1f106d1800 nid=0x57b1 runnable [0x00007f1f14676000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:220)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0x00000000fbac8c28> (a java.io.BufferedInputStream)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0x00000000fb8a6e00> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.read1(BufferedReader.java:187)
at java.io.BufferedReader.read(BufferedReader.java:261)
- locked <0x00000000fb8a6e00> (a java.io.InputStreamReader)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:602)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:446)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.hbase.HealthChecker.checkHealth(HealthChecker.java:76)
at org.apache.hadoop.hbase.TestNodeHealthCheckChore.healthCheckerTest(TestNodeHealthCheckChore.java:88)
at org.apache.hadoop.hbase.TestNodeHealthCheckChore.testHealthCheckerTimeout(TestNodeHealthCheckChore.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
{code}
When I reverted this change, I got past the test. Does the timeout need to be
adjusted, or does this test need to be moved into the medium tests?
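For illustration only: in the stack above the test thread appears to be blocked reading the health script's output (Shell.runCommand -> parseExecResult), so a slow or stuck script stalls the caller until the pipe closes. Below is a minimal sketch of bounding that read from the caller's side using only the JDK. BoundedShellRun and runWithTimeout are hypothetical names, not HBase or Hadoop APIs, and this is not a description of what the patch does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
final class BoundedShellRun {

    // Runs the command and returns its combined stdout/stderr, or throws
    // TimeoutException if the output is not fully read within timeoutMs.
    static String runWithTimeout(long timeoutMs, String... command)
            throws IOException, InterruptedException, TimeoutException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        ExecutorService reader = Executors.newSingleThreadExecutor();
        try {
            // The blocking read happens on a separate thread so the caller
            // can give up waiting instead of sitting in read().
            Future<String> output = reader.submit(() -> {
                StringBuilder sb = new StringBuilder();
                try (BufferedReader in = new BufferedReader(new InputStreamReader(
                        process.getInputStream(), StandardCharsets.UTF_8))) {
                    char[] buf = new char[512];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        sb.append(buf, 0, n);
                    }
                }
                return sb.toString();
            });
            try {
                return output.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (ExecutionException e) {
                throw new IOException(e.getCause());
            }
        } finally {
            // Kill the child (a no-op if it already exited) so the pipe
            // closes and the reader thread can finish.
            process.destroyForcibly();
            reader.shutdownNow();
        }
    }
}
```

With something like this, a call such as runWithTimeout(200, "sleep", "2") fails fast with a TimeoutException rather than holding the test thread for the full run of the script. Whether that is preferable to raising the test timeout or recategorizing the test is a separate question.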
> TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped
> ----------------------------------------------------------------------------
>
> Key: HBASE-10103
> URL: https://issues.apache.org/jira/browse/HBASE-10103
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0, 0.99.0
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: 10103.patch, 10103.patch, 10103.patch
>
>
> {noformat}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 623.639 sec <<< FAILURE!
> testRSHealthChore(org.apache.hadoop.hbase.TestNodeHealthCheckChore) Time elapsed: 0.001 sec <<< FAILURE!
> java.lang.AssertionError: Stoppable must have been stopped.
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.apache.hadoop.hbase.TestNodeHealthCheckChore.testRSHealthChore(TestNodeHealthCheckChore.java:108)
> {noformat}