[
https://issues.apache.org/jira/browse/HDFS-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978356#comment-16978356
]
Yiqun Lin commented on HDFS-14651:
----------------------------------
Review comments for the latest patch:
*DeadNodeDetector.java*
1.
{code:java}
LOG.debug("Schedule probe type: {}.", type);
{code}
to
{code:java}
LOG.debug("Schedule probe datanode for probe type: {}.", type);
{code}
2.Can we add a debug log here?
{code:java}
if (probeInProg.containsKey(datanodeInfo.getDatanodeUuid())) {
LOG.debug("The datanode {} already contained in probe queue, skip to
add it.", datanodeInfo);
continue;
}
{code}
3. We don't need these over wrap method and directly use the queue operation
there.
{code:java}
+ private boolean addToDeadNodesProbeQueue(DatanodeInfo datanodeInfo) {
+ return deadNodesProbeQueue.offer(datanodeInfo);
+ }
+
+ private DatanodeInfo pollFromDeadNodesProbeQueue() {
+ return deadNodesProbeQueue.poll();
+ }
{code}
4. Can we rename {{dfs.client.deadnode.detection.dead.queue.max}} to
{{dfs.client.deadnode.detection.deadnode.queue.max}}?
5.
{quote}i think that remove from dead node from DFSInputStream local dead nodes
should not DeadNodeDetector's work. DeadNodeDetector's purpose is to handler
sharing deadnode in same client.
{quote}
If we don't remove the deadnode of DFSInputStream local dead nodes in detector,
then it will still regard the updated node as dead node. Here we judge the node
from the combined result from detection dead list and DFSInputStream local dead
list.
{code:java}
if (clientContext.isDeadNodeDetectionEnabled()) {
ConcurrentHashMap<DatanodeInfo, DatanodeInfo> deadNodes =
new ConcurrentHashMap<DatanodeInfo, DatanodeInfo>();
if (dfsInputStream != null) {
deadNodes.putAll(dfsInputStream.getLocalDeadNodes());
}
Set<DatanodeInfo> detectDeadNodes =
clientContext.getDeadNodeDetector().clearAndGetDetectedDeadNodes();
for (DatanodeInfo detectDeadNode : detectDeadNodes) {
deadNodes.put(detectDeadNode, detectDeadNode);
}
return deadNodes;
{code}
6. Can we simplified this comment?
{code:java}
+ /**
+ * The thread pool of probing datanodes' rpc request.
+ * To define additional rpcThreadPool's purpose is to deal this situation
+ * as follow.
+ * That the data node hangs and cannot respond to the client. Release the
+ * thread until timeout.The timeout is too long. If use sync way, it is
likely
+ * that probeThreadPool are filled with the hanged node cannot detect
+ * other normal nodes.
+ */
{code}
to
{code:java}
+ /**
+ * The thread pool of probing datanodes' rpc request. Sometimes the data
node can hang and not respond to the client in a short time. And these node
will filled with probe thread pool and block other normal node probing.
+ */
{code}
*TestDeadNodeDetection.java*
1. We can use a more short time in test, like 1s.
{code:java}
conf.setLong(DFS_CLIENT_DEAD_NODE_DETECTION_PROBE_DEAD_NODE_INTERVAL_MS_KEY,
5000);
{code}
2. Following slepp is not needed.
{code:java}
ThreadUtil.sleepAtLeastIgnoreInterrupts(10 * 1000L);
{code}
3. Following lines are not used.
{code:java}
+ MiniDFSCluster.DataNodeProperties two = cluster.stopDataNode(0);
+ MiniDFSCluster.DataNodeProperties three = cluster.stopDataNode(0);
{code}
4. Can we remove this empty line?
{code:java}
+ } catch (BlockMissingException e) {
+ // delete this line
+ }
{code}
5. We can use {{GenericTestUtils#waitFor}} for the wait condition check, a
example of this {{TestPendingInvalidateBlock#waitForNumPendingDeletionBlocks}}.
{code:java}
while (dfsClient.getDeadNodes(din).size() != 2) {
+ ThreadUtil.sleepAtLeastIgnoreInterrupts(5 * 1000L);
+ }
{code}
> DeadNodeDetector checks dead node periodically
> ----------------------------------------------
>
> Key: HDFS-14651
> URL: https://issues.apache.org/jira/browse/HDFS-14651
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Lisheng Sun
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14651.001.patch, HDFS-14651.002.patch,
> HDFS-14651.003.patch, HDFS-14651.004.patch, HDFS-14651.005.patch
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]