[ 
https://issues.apache.org/jira/browse/HDFS-13309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425731#comment-16425731
 ] 

Nanda kumar commented on HDFS-13309:
------------------------------------

Thanks [~elek] for the contribution & [~xyao] for the review. I have committed 
this to the feature branch.

> Ozone: Improve error message in case of missing nodes
> -----------------------------------------------------
>
>                 Key: HDFS-13309
>                 URL: https://issues.apache.org/jira/browse/HDFS-13309
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: HDFS-7240
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Minor
>         Attachments: HDFS-13309-HDFS-7240.001.patch, 
> HDFS-13309-HDFS-7240.002.patch
>
>
> While testing ozonefs with Spark, I found multiple error messages in the log:
> {code}
> scm_1              | java.lang.NullPointerException
> scm_1              |  at org.apache.hadoop.ozone.scm.container.ContainerStates.ContainerStateMap.addContainer(ContainerStateMap.java:129)
> scm_1              |  at org.apache.hadoop.ozone.scm.container.ContainerStateManager.allocateContainer(ContainerStateManager.java:308)
> scm_1              |  at org.apache.hadoop.ozone.scm.container.ContainerMapping.allocateContainer(ContainerMapping.java:244)
> scm_1              |  at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.preAllocateContainers(BlockManagerImpl.java:189)
> scm_1              |  at org.apache.hadoop.ozone.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:291)
> scm_1              |  at org.apache.hadoop.ozone.scm.StorageContainerManager.allocateBlock(StorageContainerManager.java:1131)
> scm_1              |  at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:109)
> scm_1              |  at org.apache.hadoop.hdsl.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:8038)
> scm_1              |  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> scm_1              |  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1007)
> scm_1              |  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:873)
> scm_1              |  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:819)
> scm_1              |  at java.security.AccessController.doPrivileged(Native Method)
> scm_1              |  at javax.security.auth.Subject.doAs(Subject.java:422)
> scm_1              |  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> scm_1              |  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2679)
> {code}
> The problem is that PipelineManager.getPipeline() may return null if the 
> pipeline could not be found or established (for example, if there are not 
> enough nodes for a Ratis ring).
> ContainerStateMap.addContainer expects this pipeline to be non-null.
> I suggest adding a check in ContainerStateManager.allocateContainer that 
> returns a more meaningful error message.
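
The check suggested in the description could look roughly like the sketch below. This is only an illustration and not the committed patch: AllocationSketch, Pipeline, PipelineSource and the wording of the error message are hypothetical stand-ins, not the real Ozone SCM classes. It assumes the pipeline lookup signals "no pipeline available" by returning null, as the description states.

```java
import java.io.IOException;

/** Hypothetical, simplified stand-ins; these are NOT the real Ozone SCM classes. */
final class AllocationSketch {

  /** Minimal placeholder for a pipeline: just a name. */
  static final class Pipeline {
    final String name;

    Pipeline(String name) {
      this.name = name;
    }
  }

  /** Stand-in for the pipeline lookup; returns null when no pipeline can be built. */
  interface PipelineSource {
    Pipeline getPipeline(String replicationType, int replicationFactor);
  }

  /**
   * Fails fast with a descriptive IOException instead of letting a null
   * pipeline reach later code and surface as a bare NullPointerException.
   */
  static Pipeline allocateContainer(PipelineSource source, String type,
      int factor, String containerName) throws IOException {
    Pipeline pipeline = source.getPipeline(type, factor);
    if (pipeline == null) {
      throw new IOException("Cannot allocate container " + containerName
          + ": no pipeline available for replication type " + type
          + " with factor " + factor
          + ". Are enough healthy datanodes registered?");
    }
    return pipeline;
  }

  public static void main(String[] args) {
    // Simulates a cluster with too few nodes: the lookup finds no pipeline.
    PipelineSource noNodes = (type, factor) -> null;
    try {
      allocateContainer(noNodes, "RATIS", 3, "container-1");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a check of this kind, the caller would see which container and replication settings could not be served, rather than a stack trace ending in ContainerStateMap.addContainer.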


