[
https://issues.apache.org/jira/browse/HDDS-1787?focusedWorklogId=277119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277119
]
ASF GitHub Bot logged work on HDDS-1787:
----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Jul/19 02:33
Start Date: 16/Jul/19 02:33
Worklog Time Spent: 10m
Work Description: ChenSammi commented on pull request #1094: HDDS-1787.
NPE thrown while trying to find DN closest to client.
URL: https://github.com/apache/hadoop/pull/1094#discussion_r303705636
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMBlockProtocolServer.java
##########
@@ -290,7 +290,12 @@ public ScmInfo getScmInfo() throws IOException {
NodeManager nodeManager = scm.getScmNodeManager();
Node client = nodeManager.getNode(clientMachine);
List<Node> nodeList = new ArrayList();
- nodes.stream().forEach(path -> nodeList.add(nodeManager.getNode(path)));
+ nodes.stream().forEach(path -> {
+ DatanodeDetails node = nodeManager.getNode(path);
+ if (node != null) {
Review comment:
nodeManager.getNode returns null when it can't find the node in the network
topology, or when the node it finds is not a leaf node. The first case is
usually caused by a misconfigured network topology (for example, the hostname
is used as the network name while getNode is queried with the IP address). The
second case should not normally happen; if it does, it indicates a bug. I
created a unit test case that provides illegal inputs to reproduce this case.
The WARN logs for all of these cases are in the nodeManager.getNode function:
    if (node != null) {
      if (node instanceof InnerNode) {
        LOG.warn("Get node for {} return {}, it's an inner node, " +
            "not a datanode", address, node.getNetworkFullPath());
      } else {
        LOG.debug("Get node for {} return {}", address,
            node.getNetworkFullPath());
        return (DatanodeDetails)node;
      }
    } else {
      LOG.warn("Cannot find node for {}", address);
    }
    return null;
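
To illustrate the caller-side handling, here is a minimal, self-contained
sketch of the null-filtering idea. It is not the actual SCM code: the class
NullSafeNodeLookup, its register and resolve methods, and the use of plain
strings in place of DatanodeDetails are all hypothetical stand-ins. The only
thing it assumes from the comment above is that the lookup returns null for an
address that is not registered in the topology.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical stand-in for a node manager backed by a network topology.
class NullSafeNodeLookup {
  // Maps a network name (here an IP address) to a leaf node path.
  private final Map<String, String> leafNodes = new HashMap<>();

  void register(String networkName, String path) {
    leafNodes.put(networkName, path);
  }

  // Mirrors the contract described above: null when the address is unknown.
  String getNode(String address) {
    return leafNodes.get(address);
  }

  // Resolves each address and drops the ones the topology cannot find,
  // so that later sorting never sees a null entry.
  List<String> resolve(List<String> addresses) {
    return addresses.stream()
        .map(this::getNode)
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    NullSafeNodeLookup topology = new NullSafeNodeLookup();
    // The node is registered under its IP address only.
    topology.register("172.31.116.73", "/default-rack/172.31.116.73");

    // The second lookup uses the hostname, so it misses and is skipped.
    List<String> resolved = topology.resolve(
        Arrays.asList("172.31.116.73", "sid-minichaos.gce.cloudera.com"));
    System.out.println(resolved);
  }
}
```

Filtering in the stream rather than adding inside forEach keeps the null check
in one place. A real caller would probably also want to handle the case where
the resulting list is empty.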
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 277119)
Time Spent: 1h (was: 50m)
> NPE thrown while trying to find DN closest to client
> ----------------------------------------------------
>
> Key: HDDS-1787
> URL: https://issues.apache.org/jira/browse/HDDS-1787
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.5.0
> Reporter: Siddharth Wagle
> Assignee: Sammi Chen
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> cc: [~xyao] This seems related to the client side topology changes, not sure
> if some other Jira is already addressing this.
> {code}
> 2019-07-10 16:45:53,176 WARN ipc.Server (Server.java:logException(2724)) -
> IPC Server handler 14 on 35066, call Call#127037 Retry#0
> org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocol.send from
> 172.31.116.73:52540
> java.lang.NullPointerException
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.lambda$sortDatanodes$0(ScmBlockLocationProtocolServerSideTranslatorPB.java:215)
> at
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.sortDatanodes(ScmBlockLocationProtocolServerSideTranslatorPB.java:215)
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.send(ScmBlockLocationProtocolServerSideTranslatorPB.java:124)
> at
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:13157)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2019-07-10 16:45:53,176 WARN om.KeyManagerImpl
> (KeyManagerImpl.java:lambda$sortDatanodeInPipeline$7(2129)) - Unable to sort
> datanodes based on distance to client, volume=xqoyzocpse, bucket=vxwajaczqh,
> key=pool-444-thread-7-201077822, client=127.0.0.1,
> datanodes=[10f15723-45d7-4a0c-8f01-8b101744a110{ip: 172.31.116.73, host:
> sid-minichaos.gce.cloudera.com, networkLocation: /default-rack, certSerialId:
> null}, 7ac2777f-0a5c-4414-9e7f-bfbc47d696ea{ip: 172.31.116.73, host:
> sid-minichaos.gce.cloudera.com, networkLocation: /default-rack, certSerialId:
> null}], exception=java.lang.NullPointerException
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.lambda$sortDatanodes$0(ScmBlockLocationProtocolServerSideTranslatorPB.java:215)
> at
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at
> java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.sortDatanodes(ScmBlockLocationProtocolServerSideTranslatorPB.java:215)
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.send(ScmBlockLocationProtocolServerSideTranslatorPB.java:124)
> at
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:13157)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {code}