[ https://issues.apache.org/jira/browse/HDDS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661946#comment-16661946 ]
Mukul Kumar Singh commented on HDDS-721:
----------------------------------------
The cause of this issue is as follows: on an SCM restart, all the datanodes are
removed from the pipeline and are added back one by one as they are reported in
pipeline reports. In this case, some of the datanodes fail to come up because
of HDDS-722. Hence the node referenced by the leaderID field is not available
as one of the nodes in the pipeline, and the client hits the
NullPointerException below when it tries to connect.
{code}
2018-10-24 07:45:14,671 INFO scm.XceiverClientRatis: excpetion here
java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.ratis.RatisHelper.toRaftPeerIdString(RatisHelper.java:62)
at org.apache.ratis.RatisHelper.toRaftPeerId(RatisHelper.java:83)
at org.apache.hadoop.hdds.scm.XceiverClientRatis.connect(XceiverClientRatis.java:161)
at org.apache.hadoop.hdds.scm.XceiverClientManager$2.call(XceiverClientManager.java:166)
at org.apache.hadoop.hdds.scm.XceiverClientManager$2.call(XceiverClientManager.java:149)
at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
at org.apache.hadoop.hdds.scm.XceiverClientManager.getClient(XceiverClientManager.java:148)
at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:126)
at org.apache.hadoop.ozone.client.io.ChunkGroupInputStream.getFromOmKeyInfo(ChunkGroupInputStream.java:280)
at org.apache.hadoop.ozone.client.rpc.RpcClient.getKey(RpcClient.java:493)
at org.apache.hadoop.ozone.client.OzoneBucket.readKey(OzoneBucket.java:272)
at org.apache.hadoop.fs.ozone.OzoneFileSystem.open(OzoneFileSystem.java:178)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
at org.apache.hadoop.fs.shell.Display$Cat.getInputStream(Display.java:108)
at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
at org.apache.hadoop.fs.shell.Command.processPathInternal(Command.java:367)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:304)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:286)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:270)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:327)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
{code}
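To illustrate the failure mode, here is a minimal, self-contained Java sketch.
It is not the actual HDDS or Ratis code: the class PipelineLeaderSketch, its
nested Pipeline type, and the methods addReportedNode, leaderHostOrNull, and
requireLeaderHost are hypothetical names chosen for illustration. It assumes
that a pipeline keeps a leader ID next to a map of datanodes that is refilled
as pipeline reports arrive, so a lookup of the leader before its datanode has
reported returns null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PipelineLeaderSketch {

    /** Hypothetical stand-in for a pipeline; not the real HDDS Pipeline class. */
    static final class Pipeline {
        private final String leaderId;
        // Datanodes known so far, keyed by ID; refilled one by one after a restart.
        private final Map<String, String> reportedNodes = new LinkedHashMap<>();

        Pipeline(String leaderId) {
            this.leaderId = leaderId;
        }

        void addReportedNode(String id, String host) {
            reportedNodes.put(id, host);
        }

        /** Unguarded lookup: returns null while the leader has not reported. */
        String leaderHostOrNull() {
            return reportedNodes.get(leaderId);
        }

        /** Guarded lookup: fails with a descriptive message instead of returning null. */
        String requireLeaderHost() {
            String host = reportedNodes.get(leaderId);
            if (host == null) {
                throw new IllegalStateException("Leader " + leaderId
                    + " is not among the reported nodes " + reportedNodes.keySet());
            }
            return host;
        }
    }

    public static void main(String[] args) {
        // The leader is dn1, but only dn2 and dn3 have reported so far.
        Pipeline pipeline = new Pipeline("dn1");
        pipeline.addReportedNode("dn2", "host2");
        pipeline.addReportedNode("dn3", "host3");

        try {
            // Dereferencing the null result is what raises the NullPointerException.
            pipeline.leaderHostOrNull().length();
        } catch (NullPointerException e) {
            System.out.println("Unguarded lookup: NullPointerException");
        }

        try {
            pipeline.requireLeaderHost();
        } catch (IllegalStateException e) {
            System.out.println("Guarded lookup: " + e.getMessage());
        }

        // Once the leader's datanode reports, the lookup succeeds.
        pipeline.addReportedNode("dn1", "host1");
        System.out.println("After leader reports: " + pipeline.requireLeaderHost());
    }
}
```

A guard of this kind would only replace the NullPointerException with a
descriptive error; whether the real fix belongs in the client or in SCM is a
separate question that this sketch does not answer.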
> NullPointerException thrown while trying to read a file when datanode
> restarted
> -------------------------------------------------------------------------------
>
> Key: HDDS-721
> URL: https://issues.apache.org/jira/browse/HDDS-721
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nilotpal Nandi
> Priority: Critical
> Attachments: all-node-ozone-logs-1540356965.tar.gz
>
>
> Steps taken:
> -------------------
> # Put a few files and directories using ozonefs.
> # Stopped all services of the cluster.
> # Started the SCM, OM, and then the datanodes.
> While the datanodes were starting up, tried to read a file. A
> NullPointerException was thrown.
>
> {noformat}
> [root@ctr-e138-1518143905142-544443-01-000003 ~]# /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -ls -R /
> 2018-10-24 04:48:00,703 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> drwxrwxrwx - root root 0 2018-10-24 04:12 /testdir1
> -rw-rw-rw- 1 root root 5368709120 1970-02-25 15:29 /testdir1/5GB
> -rw-rw-rw- 1 root root 4798 1970-02-25 15:22 /testdir1/passwd
> drwxrwxrwx - root root 0 2018-10-24 04:46 /testdir3
> [root@ctr-e138-1518143905142-544443-01-000003 ~]# /root/hadoop_trunk/ozone-0.3.0-SNAPSHOT/bin/ozone fs -cat o3fs://fs-bucket.fs-volume/testdir1/passwd
> 2018-10-24 04:49:24,955 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> cat: Exception getting XceiverClient: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException
> {noformat}
>