[
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
HaiBin Huang updated HDFS-14789:
--------------------------------
Description:
With HDFS-11194 and HDFS-11551, we can find slow nodes through the namenode's JMX.
So I think the namenode should check for these slow nodes when assigning a node
for writing a block. When the namenode chooses a node in
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
it should check whether that node is a slow node, because writing data to a slow
node may take a long time, which can make a client write very slowly or even
hit a socket timeout exception like this:
{code:java}
2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
java.net.SocketTimeoutException: 495000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653)
{code}
I use *maxChosenCount* to keep the datanode selection from taking too long. It
is calculated from the logarithm of a probability, and it also guarantees that
the probability of choosing a slow node to write a block is less than 0.01%.
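One way to read the "logarithm of probability" remark is sketched below. This is an interpretation, not code from the patch: it assumes every draw is independent and uniform over all datanodes, so that if a fraction r of the nodes is slow, k consecutive draws all land on slow nodes with probability r^k, and r^k <= p holds once k >= ln(p) / ln(r). The class name SlowNodeRetryBound and the method signature are hypothetical.

```java
/**
 * Hypothetical helper; the class and method names are illustrative and are
 * not taken from the HDFS-14789 patch.
 */
final class SlowNodeRetryBound {

  private SlowNodeRetryBound() {}

  /**
   * Returns the number of random draws k after which the chance that every
   * draw hit a slow node is at most targetProbability, i.e. the smallest k
   * with slowRatio^k <= targetProbability (up to floating-point rounding).
   *
   * Assumption: draws are independent and uniform over all datanodes.
   *
   * @param slowRatio fraction of datanodes currently reported slow, in [0, 1)
   * @param targetProbability acceptable chance of ending on a slow node, in (0, 1)
   */
  static int maxChosenCount(double slowRatio, double targetProbability) {
    if (targetProbability <= 0.0 || targetProbability >= 1.0) {
      throw new IllegalArgumentException("targetProbability must be in (0, 1)");
    }
    if (slowRatio < 0.0 || slowRatio >= 1.0) {
      throw new IllegalArgumentException("slowRatio must be in [0, 1)");
    }
    if (slowRatio == 0.0) {
      return 1; // no slow nodes reported, so the first draw is always fine
    }
    // Both logarithms are negative, so the quotient is positive.
    int k = (int) Math.ceil(Math.log(targetProbability) / Math.log(slowRatio));
    return Math.max(1, k);
  }
}
```

Under these assumptions, with 20% of the datanodes reported slow and a 0.01% target, the bound works out to 6 draws, since 0.2^6 = 0.000064. The bound grows quickly as the slow fraction approaches 1, so a real implementation would probably also want a hard upper limit.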
Finally, I use an expiry time so that the namenode avoids these slow nodes only
for a specified period, because they may have returned to normal after that
period and can then be used to write blocks again.
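The expiry idea, combined with a bounded number of random draws, could look roughly like the sketch below. It is only an illustration under stated assumptions: nodes are identified by plain strings, the caller supplies the current time, and the class ExpiringSlowNodes with its methods reportSlow, isSlow and pickNode is hypothetical rather than part of BlockPlacementPolicyDefault or the patch.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names come from the HDFS code base. */
final class ExpiringSlowNodes {

  // Node id -> timestamp (ms) at which the slow report stops being honored.
  private final Map<String, Long> slowUntilMillis = new ConcurrentHashMap<>();
  private final long expireMillis;

  ExpiringSlowNodes(long expireMillis) {
    if (expireMillis <= 0) {
      throw new IllegalArgumentException("expireMillis must be positive");
    }
    this.expireMillis = expireMillis;
  }

  /** Records (or refreshes) a slow report for the given node. */
  void reportSlow(String nodeId, long nowMillis) {
    slowUntilMillis.put(nodeId, nowMillis + expireMillis);
  }

  /** True only while the node's most recent slow report has not expired. */
  boolean isSlow(String nodeId, long nowMillis) {
    Long until = slowUntilMillis.get(nodeId);
    if (until == null) {
      return false;
    }
    if (nowMillis >= until) {
      // Expired: forget the report so the node can be chosen again.
      slowUntilMillis.remove(nodeId, until);
      return false;
    }
    return true;
  }

  /**
   * Draws up to maxChosenCount random candidates and returns the first one
   * that is not currently slow. If every draw hits a slow node, the last
   * draw is returned so that the write is not blocked indefinitely.
   */
  String pickNode(List<String> candidates, Random random,
                  int maxChosenCount, long nowMillis) {
    if (candidates.isEmpty() || maxChosenCount < 1) {
      throw new IllegalArgumentException("need candidates and at least one draw");
    }
    String picked = null;
    for (int attempt = 0; attempt < maxChosenCount; attempt++) {
      picked = candidates.get(random.nextInt(candidates.size()));
      if (!isSlow(picked, nowMillis)) {
        return picked;
      }
    }
    return picked;
  }
}
```

Falling back to the last draw when every attempt hits a slow node is a design choice made for this sketch: it favors making progress over strictly excluding slow nodes.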
was:With[ HDFS-11194|https://issues.apache.org/jira/browse/HDFS-11194] and
[HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551], we can find slow
node through namenode's jmx. So i think namenode should check these slow node
when
> namenode should check slow node when assigning a node for writing block
> ------------------------------------------------------------------------
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: HaiBin Huang
> Assignee: HaiBin Huang
> Priority: Major