[ 
https://issues.apache.org/jira/browse/HDFS-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234823#comment-15234823
 ] 

Gergely Novák commented on HDFS-10270:
--------------------------------------

We observed that the failure is caused by this assertion: 
{code}
DFSTestUtil.waitForMetric(jmx, "NumOpenConnections", numDatanodes);
{code}
This checks whether the number of open connections equals the number of data 
nodes. But the number of open connections does not depend on the number of 
data nodes at all: it is either 0, 1 (DataNodeProtocol) or 2 (DataNodeProtocol 
and ClientProtocol). The test passes only in those rare cases when the 
ClientProtocol connection has not yet timed out when {{TestNameNode}} runs 
(this can only happen if the tests are run separately). If we increase the 
number of data nodes (to 3, say), the test will always fail. Conversely, if we 
increase the client idle timeout ({{ipc.client.connection.maxidletime}}), the 
test will always pass.

Our suggestion is to remove this assert.
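
To illustrate why the wait can only end in a timeout, here is a minimal, 
self-contained sketch of a polling wait. It is not the real 
{{DFSTestUtil.waitForMetric}}: the class {{MetricWaitSketch}}, the helper 
{{waitForValue}}, the fixed metric value and the short timeouts are all 
assumptions made for illustration only.

```java
import java.util.function.IntSupplier;

public class MetricWaitSketch {

    // Hypothetical stand-in for a polling wait; NOT the real
    // DFSTestUtil.waitForMetric. Returns true if the metric reaches the
    // expected value before the timeout expires, false otherwise.
    static boolean waitForValue(IntSupplier metric, int expected,
                                long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (metric.getAsInt() == expected) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: the ClientProtocol connection has already timed out,
        // so only the DataNodeProtocol connection is still open.
        IntSupplier openConnections = () -> 1;
        int numDatanodes = 2; // assumed cluster size, as in the report

        // The metric never reaches numDatanodes, so the wait times out.
        System.out.println(waitForValue(openConnections, numDatanodes, 200, 20));

        // Waiting for the value the metric actually has succeeds at once.
        System.out.println(waitForValue(openConnections, 1, 200, 20));
    }
}
```

Under these assumptions the program prints {{false}} and then {{true}}: 
no amount of polling helps when the expected value is derived from a 
quantity the metric does not track.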

> TestJMXGet:testNameNode() fails
> -------------------------------
>
>                 Key: HDFS-10270
>                 URL: https://issues.apache.org/jira/browse/HDFS-10270
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Andras Bokor
>            Priority: Minor
>         Attachments: TestJMXGet.log, TestJMXGetFails.log
>
>
> It fails with java.util.concurrent.TimeoutException. The actual problem 
> here is that we expect the NumOpenConnections metric to be 2, but it is only 
> 1, so the test waits 60 seconds and then fails.
> Please find the Maven output with the stack trace attached ([^TestJMXGetFails.log]).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
