[
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275901#comment-17275901
]
Stephen O'Donnell commented on HDFS-15792:
------------------------------------------
[~prasad-acit] I cannot get the line numbers in the stack trace to line up
with the 3.1.1 release branch. Do you have other image loading patches applied
to this release (e.g. HDFS-14617), such as any of the parallel image loading
patches, or internal patches that exist only in your build?
The stack trace includes the method:
org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes
But I don't see that method anywhere in FSImageFormatPBINode.java on 3.1.1 or
trunk. Running `grep -R "readPBINodes" *` over the source gives no results
either.
While moving to a ConcurrentHashMap may make sense, the structures in the NN
are generally protected by the global NN lock whenever concurrent operations
are happening. Therefore the NN tends not to use concurrent hash maps etc.
If this structure is being modified by two threads at once to cause this bug,
I would like to understand how it is happening, i.e. have we missed
synchronising around this structure somehow during image loading? Prior to
HDFS-14617 the image was loaded by a single thread, which makes concurrent
access less likely too.
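
To make the two options above concrete, here is a minimal sketch of a
reference-counting map done both ways: a plain HashMap guarded by one coarse
lock (the style the NN generally follows), and a ConcurrentHashMap with no
external lock. The names RefCountSketch, LockedCounts, ConcurrentCounts and
runWorkers are hypothetical and only for illustration; this is not Hadoop's
ReferenceCountMap or the NN lock implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration only; not Hadoop's ReferenceCountMap. */
class RefCountSketch {

  /** Plain HashMap guarded by one coarse lock, similar in spirit to a global lock. */
  static final class LockedCounts {
    private final Map<String, Integer> counts = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Increments the reference count for key and returns the new count. */
    int put(String key) {
      lock.writeLock().lock();
      try {
        return counts.merge(key, 1, Integer::sum);
      } finally {
        lock.writeLock().unlock();
      }
    }

    int get(String key) {
      lock.readLock().lock();
      try {
        return counts.getOrDefault(key, 0);
      } finally {
        lock.readLock().unlock();
      }
    }
  }

  /** Alternative: the map itself is thread-safe, so callers need no lock. */
  static final class ConcurrentCounts {
    private final Map<String, Integer> counts = new ConcurrentHashMap<>();

    /** ConcurrentHashMap.merge performs the read-modify-write atomically. */
    int put(String key) {
      return counts.merge(key, 1, Integer::sum);
    }

    int get(String key) {
      return counts.getOrDefault(key, 0);
    }
  }

  /** Starts the given number of threads, each running op perThread times, and waits. */
  static void runWorkers(int threads, int perThread, Runnable op) {
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          op.run();
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
  }

  public static void main(String[] args) {
    LockedCounts locked = new LockedCounts();
    ConcurrentCounts concurrent = new ConcurrentCounts();
    runWorkers(4, 10_000, () -> locked.put("acl"));
    runWorkers(4, 10_000, () -> concurrent.put("acl"));
    // Both variants count every increment: prints "40000 40000".
    System.out.println(locked.get("acl") + " " + concurrent.get("acl"));
  }
}
```

Either variant is safe under concurrent writers. The open question for this
issue is which code path reaches the unprotected HashMap from more than one
thread without holding a lock.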
> ClasscastException while loading FSImage
> ----------------------------------------
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: nn
> Reporter: Renukaprasad C
> Assignee: Renukaprasad C
> Priority: Major
> Attachments: HDFS-15792.001.patch, HDFS-15792.002.patch,
> image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading failed with a ClassCastException:
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue was reported against Java and closed as a usage issue:
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to
> java.util.HashMap$TreeNode
> at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
> at java.util.HashMap.treeifyBin(HashMap.java:772)
> at java.util.HashMap.putVal(HashMap.java:644)
> at java.util.HashMap.put(HashMap.java:612)
> at
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
> at
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
> at
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_0000000000198227480,
> cpktTxId=0000000000198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node
> cannot be cast to java.util.HashMap$TreeNode
> at
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1665)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1735)
> Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be
> cast to java.util.HashMap$TreeNode
> at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
> at java.util.HashMap.treeifyBin(HashMap.java:772)
> at java.util.HashMap.putVal(HashMap.java:644)
> at java.util.HashMap.put(HashMap.java:612)
> at
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
> at
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
> at
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)