[
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haibin Huang updated HDFS-14789:
--------------------------------
Description:
With HDFS-11194 and HDFS-11551, the namenode can show SlowPeersReport and
SlowDisksReport in jmx. I think the namenode can avoid these slow nodes while
running chooseTarget in BlockPlacementPolicyDefault, because if there is a
slow node in the pipeline, the client might write very slowly.
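For reference, here is a minimal sketch of reading those reports over JMX with
the standard javax.management client API. It assumes the usual
Hadoop:service=NameNode,name=NameNodeInfo bean and the
SlowPeersReport/SlowDisksReport attribute names from HDFS-11194 and
HDFS-11551; the host/port is a placeholder:
{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SlowNodeReportReader {
  public static void main(String[] args) throws Exception {
    // Assumption: the namenode exposes JMX on localhost:8004; adjust as needed.
    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://localhost:8004/jmxrmi");
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
      MBeanServerConnection mbs = connector.getMBeanServerConnection();
      ObjectName nnInfo =
          new ObjectName("Hadoop:service=NameNode,name=NameNodeInfo");
      // Both attributes are JSON strings naming the currently slow nodes/disks.
      System.out.println(mbs.getAttribute(nnInfo, "SlowPeersReport"));
      System.out.println(mbs.getAttribute(nnInfo, "SlowDisksReport"));
    }
  }
}
{code}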
I use an invalidityTime to keep the namenode from choosing a slow node until
the invalidity period has passed. After the invalidityTime, if the slow node
has returned to normal, the namenode can choose it again; if it is still very
slow, the invalidityTime will be updated and the node will keep being excluded.
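A minimal sketch of that expiry bookkeeping (the class and method names here
are illustrative, not the actual patch):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch: tracks slow datanodes with a per-node invalidity deadline. */
public class SlowNodeTracker {
  private final Map<String, Long> invalidUntil = new ConcurrentHashMap<>();
  private final long invalidityTimeMs;

  public SlowNodeTracker(long invalidityTimeMs) {
    this.invalidityTimeMs = invalidityTimeMs;
  }

  /** Called whenever a node shows up in SlowPeersReport; refreshes the deadline. */
  public void markSlow(String datanodeUuid) {
    invalidUntil.put(datanodeUuid, System.currentTimeMillis() + invalidityTimeMs);
  }

  /** True while the node's invalidity window has not yet expired. */
  public boolean shouldAvoid(String datanodeUuid) {
    Long deadline = invalidUntil.get(datanodeUuid);
    if (deadline == null) {
      return false;
    }
    if (System.currentTimeMillis() >= deadline) {
      invalidUntil.remove(datanodeUuid); // window elapsed; node is eligible again
      return false;
    }
    return true;
  }
}
{code}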
I also considered the fallback: if the namenode cannot choose any normal node,
chooseTarget will throw NotEnoughReplicasException and retry, this time
without avoiding slow nodes.
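The fallback could look roughly like this. NotEnoughReplicasException is the
real BlockPlacementPolicy inner class, but everything else here is an
illustrative stand-in, not the actual patch:
{code:java}
import java.util.Collections;
import java.util.List;

/** Illustrative sketch of the fallback shape, not the actual patch. */
public class ChooseTargetFallback {

  /** Stand-in for BlockPlacementPolicy.NotEnoughReplicasException. */
  static class NotEnoughReplicasException extends Exception {
  }

  /** Hypothetical hook: one placement attempt, optionally excluding slow nodes. */
  interface Chooser {
    List<String> choose(int numOfReplicas, boolean avoidSlowNodes)
        throws NotEnoughReplicasException;
  }

  /** First pass excludes slow nodes; on failure, retry with the restriction lifted. */
  static List<String> chooseTarget(Chooser chooser, int numOfReplicas) {
    try {
      return chooser.choose(numOfReplicas, true);
    } catch (NotEnoughReplicasException e) {
      try {
        // Fallback: not enough normal nodes, so allow slow nodes rather than fail.
        return chooser.choose(numOfReplicas, false);
      } catch (NotEnoughReplicasException e2) {
        return Collections.emptyList(); // genuinely not enough nodes at all
      }
    }
  }
}
{code}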
was:
With HDFS-11194 and HDFS-11551, the namenode can show SlowPeersReport and
SlowDisksReport in jmx, so we can find slow nodes through the namenode's jmx.
I think the namenode should check for these slow nodes when assigning a node
for writing a block. When the namenode chooses a node in
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
we should check whether it belongs to the slow nodes, because choosing a slow
one to write data may take a long time, which can cause a client to write data
very slowly and even hit a socket timeout exception like this:
{code:java}
2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
java.net.SocketTimeoutException: 495000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx]
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653)
{code}
I use *maxChosenCount* to prevent the datanode-choosing task from taking too
long. It is derived from the logarithm of the slow-node probability, which
also guarantees that the probability of choosing a slow node to write a block
stays below 0.01%.
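As a worked example of that bound (my reading of the idea, with an assumed
independent slow-node fraction p): after k random picks, the chance that every
pick landed on a slow node is p^k, so taking k = ceil(log(0.0001) / log(p))
keeps that chance below 0.01%:
{code:java}
public class MaxChosenCount {
  /**
   * Smallest k with p^k <= failureBound, i.e. k = ceil(ln(bound) / ln(p)).
   * Assumes each random pick lands on a slow node independently with probability p.
   */
  static int maxChosenCount(double slowNodeFraction, double failureBound) {
    return (int) Math.ceil(Math.log(failureBound) / Math.log(slowNodeFraction));
  }

  public static void main(String[] args) {
    // If 20% of nodes are slow, 6 picks suffice: 0.2^6 = 6.4e-5 < 0.01%.
    System.out.println(maxChosenCount(0.20, 0.0001)); // 6
    // If half the nodes are slow, we need 14 picks: 0.5^14 ~ 6.1e-5 < 0.01%.
    System.out.println(maxChosenCount(0.50, 0.0001)); // 14
  }
}
{code}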
Finally, I use an expiry time so the namenode does not choose these slow nodes
within a specified period; after that period the slow nodes may have returned
to normal and can be used to write blocks again.
> namenode should avoid slow node when choose target in
> BlockPlacementPolicyDefault
> ---------------------------------------------------------------------------------
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14789
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show SlowPeersReport and
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes
> while running chooseTarget in BlockPlacementPolicyDefault, because if there
> is a slow node in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node until
> the invalidity period has passed. After the invalidityTime, if the slow node
> has returned to normal, the namenode can choose it again; if it is still
> very slow, the invalidityTime will be updated and the node will keep being
> excluded.
> I also considered the fallback: if the namenode cannot choose any normal
> node, chooseTarget will throw NotEnoughReplicasException and retry, this
> time without avoiding slow nodes.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]