[
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275944#comment-17275944
]
Stephen O'Donnell commented on HDFS-15792:
------------------------------------------
Thanks for the clarifications. I had no idea this static
AclStorage.UNIQUE_ACL_FEATURES cache existed. While building a new file or
directory inode, this cache can be accessed and modified concurrently.
My initial concern was that ReferenceCountMap may be used in many places, so
this change could affect other callers, but it seems to be used only in
AclStorage.
Therefore the change here makes sense to me.
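To illustrate why a thread-safe map matters for this cache, here is a minimal sketch of a reference-counted interning map built on ConcurrentHashMap. It is not the actual ReferenceCountMap or the attached patch; the names Counted, ConcurrentRefCountMap and DemoAcl are hypothetical stand-ins invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the ReferenceCounter interface mentioned in this thread. */
interface Counted {
  int incrementAndGetRefCount();
  int decrementAndGetRefCount();
}

/** Illustrative sketch only; not the real org.apache.hadoop.hdfs.util.ReferenceCountMap. */
final class ConcurrentRefCountMap<E extends Counted> {
  private final ConcurrentMap<E, E> map = new ConcurrentHashMap<>();

  /** Returns the canonical instance equal to value and bumps its reference count. */
  E put(E value) {
    // compute() is atomic per key, so "insert or reuse" and the increment
    // happen as a single step even when many loader threads call put().
    return map.compute(value, (key, existing) -> {
      E canonical = (existing == null) ? value : existing;
      canonical.incrementAndGetRefCount();
      return canonical;
    });
  }

  /** Drops one reference; the entry is removed once its count reaches zero. */
  void remove(E value) {
    // Returning null from the remapping function removes the mapping.
    map.computeIfPresent(value, (key, existing) ->
        existing.decrementAndGetRefCount() == 0 ? null : existing);
  }

  int size() {
    return map.size();
  }
}

/** Hypothetical value type: equal names share one canonical instance. */
final class DemoAcl implements Counted {
  private final String name;
  private final AtomicInteger refs = new AtomicInteger();

  DemoAcl(String name) {
    this.name = name;
  }

  @Override
  public int incrementAndGetRefCount() {
    return refs.incrementAndGet();
  }

  @Override
  public int decrementAndGetRefCount() {
    // Never drop below zero, mirroring the guard in the quoted AclFeature code.
    return refs.updateAndGet(c -> c > 0 ? c - 1 : 0);
  }

  int refCount() {
    return refs.get();
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof DemoAcl && ((DemoAcl) o).name.equals(name);
  }

  @Override
  public int hashCode() {
    return name.hashCode();
  }
}
```

With a plain HashMap in place of the concurrent map, parallel put() calls like the ones in the reported stack trace can corrupt the table while a bin is being treeified.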
I am surprised we have not seen this reported since HDFS-14617 ([~prasad-acit]
is not using HDFS-14617, but something similar).
One other point:
In AclStorage, we define the ReferenceCountMap as:
{code}
private final static ReferenceCountMap<AclFeature> UNIQUE_ACL_FEATURES =
    new ReferenceCountMap<AclFeature>();
{code}
In AclFeature, we have:
{code}
public class AclFeature implements INode.Feature, ReferenceCounter {
  public static final ImmutableList<AclEntry> EMPTY_ENTRY_LIST =
      ImmutableList.of();
  private int refCount = 0;
  ...
  @Override
  public int incrementAndGetRefCount() {
    return ++refCount;
  }

  @Override
  public int decrementAndGetRefCount() {
    return (refCount > 0) ? --refCount : 0;
  }
{code}
And in ReferenceCountMap we call:
{code}
value.incrementAndGetRefCount();
{code}
Where value is an instance of AclFeature.
Do we need to change `private int refCount = 0;` in AclFeature to an
AtomicInteger, since it can be modified concurrently too, or make the
increment-and-get and decrement-and-get methods synchronized on AclFeature?
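The two options could look roughly like the sketch below. It is only an illustration under assumptions: AtomicRefCount and SynchronizedRefCount are hypothetical classes that isolate the counter logic, not the real AclFeature, and getRefCount() is added here just so the value can be inspected.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Option 1 (hypothetical class): keep the counter in an AtomicInteger. */
final class AtomicRefCount {
  private final AtomicInteger refCount = new AtomicInteger();

  public int incrementAndGetRefCount() {
    return refCount.incrementAndGet();
  }

  public int decrementAndGetRefCount() {
    // updateAndGet retries on contention, so the "never below zero" guard
    // and the decrement are applied as one atomic update.
    return refCount.updateAndGet(c -> c > 0 ? c - 1 : 0);
  }

  public int getRefCount() {
    return refCount.get();
  }
}

/** Option 2 (hypothetical class): keep the plain int and synchronize every access. */
final class SynchronizedRefCount {
  private int refCount = 0;

  public synchronized int incrementAndGetRefCount() {
    return ++refCount;
  }

  public synchronized int decrementAndGetRefCount() {
    return (refCount > 0) ? --refCount : 0;
  }

  public synchronized int getRefCount() {
    return refCount;
  }
}
```

Both variants avoid lost updates from the unsynchronized `++refCount`. The atomic version avoids taking a monitor on each AclFeature, while the synchronized version keeps the original expressions unchanged.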
> ClasscastException while loading FSImage
> ----------------------------------------
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: nn
> Reporter: Renukaprasad C
> Assignee: Renukaprasad C
> Priority: Major
> Attachments: HDFS-15792.001.patch, HDFS-15792.002.patch,
> image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading failed with a ClassCastException:
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to
> java.util.HashMap$TreeNode.
> This is a usage issue: HashMap is being accessed concurrently.
> The same issue was reported against Java and closed as a usage issue:
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be
> cast to java.util.HashMap$TreeNode
> at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
> at java.util.HashMap.treeifyBin(HashMap.java:772)
> at java.util.HashMap.putVal(HashMap.java:644)
> at java.util.HashMap.put(HashMap.java:612)
> at
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
> at
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
> at
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_0000000000198227480,
> cpktTxId=0000000000198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node
> cannot be cast to java.util.HashMap$TreeNode
> at
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1665)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1735)
> Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be
> cast to java.util.HashMap$TreeNode
> at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
> at java.util.HashMap.treeifyBin(HashMap.java:772)
> at java.util.HashMap.putVal(HashMap.java:644)
> at java.util.HashMap.put(HashMap.java:612)
> at
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
> at
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
> at
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
> at
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)