[
https://issues.apache.org/jira/browse/HDFS-16855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641116#comment-17641116
]
dingshun edited comment on HDFS-16855 at 11/30/22 7:39 AM:
-----------------------------------------------------------
[~hexiaoqiao] Thanks for your reply.
We found that when the datanode starts,
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#addBlockPool(String
bpid, Configuration conf) is called and acquires a BLOCK_POOl-level write
lock.
Inside
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList#addBlockPool(final
String bpid, final Configuration conf), multiple threads are started to
initialize the BlockPoolSlice instances, and each BlockPoolSlice needs to
obtain the dfsUsage value during initialization.
Because our fs.getspaceused.classname is configured with
ReplicaCachingGetSpaceUsed, obtaining dfsUsage calls
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#deepCopyReplica(String
bpid), which in turn calls
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap#replicas(String
bpid, Consumer<Iterator<ReplicaInfo>> consumer), and #replicas acquires a read
lock at the BLOCK_POOl level.
Since the initialization threads are not the thread holding the write lock,
yet they use the read-write lock of the same ReentrantReadWriteLock instance,
the write lock cannot be downgraded to a read lock, and the read-lock
acquisition blocks.
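The behavior described above can be reproduced outside HDFS. The following is a minimal, self-contained sketch (not HDFS code; class and variable names are illustrative) showing that a ReentrantReadWriteLock write lock held by one thread blocks a read-lock request from a different thread, because lock downgrading only works within the thread that holds the write lock:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDowngradeDemo {
  public static void main(String[] args) throws InterruptedException {
    final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Thread A (like the thread running addBlockPool) takes the write lock.
    rwLock.writeLock().lock();

    // Thread B (like a BlockPoolSlice-initialization thread calling
    // #replicas) tries to take the read lock on the SAME lock instance.
    final boolean[] acquired = {true};
    Thread b = new Thread(() -> {
      try {
        // tryLock with a timeout so the demo terminates instead of hanging;
        // a plain readLock().lock() here would block forever.
        acquired[0] = rwLock.readLock().tryLock(1, TimeUnit.SECONDS);
      } catch (InterruptedException ignored) {
      }
    });
    b.start();
    b.join();

    // The other thread did NOT get the read lock while A holds the write
    // lock. If A were instead waiting on B (as addBlockPool waits for the
    // initialization threads), the two would deadlock.
    System.out.println("read lock acquired by other thread: " + acquired[0]);

    // By contrast, downgrading inside the SAME thread is allowed.
    boolean sameThread = rwLock.readLock().tryLock();
    System.out.println("read lock acquired by writer thread: " + sameThread);

    rwLock.readLock().unlock();
    rwLock.writeLock().unlock();
  }
}
```

Running it prints `false` for the other thread and `true` for the writer thread itself, which matches the deadlock seen when addBlockPool holds the write lock and waits for threads that need the read lock.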
> Remove the redundant write lock in addBlockPool
> -----------------------------------------------
>
> Key: HDFS-16855
> URL: https://issues.apache.org/jira/browse/HDFS-16855
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: dingshun
> Priority: Major
> Labels: pull-request-available
>
> When patching the datanode's fine-grained lock, we found that the datanode
> couldn't start; a deadlock can occur during addBlockPool, so we can remove
> the redundant write lock.
> {code:java}
> // fs.getspaceused.classname configuration
> <property>
>   <name>fs.getspaceused.classname</name>
>   <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed</value>
> </property> {code}
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl#addBlockPool
> // acquires the BLOCK_POOl-level write lock
> @Override
> public void addBlockPool(String bpid, Configuration conf)
>     throws IOException {
>   LOG.info("Adding block pool " + bpid);
>   AddBlockPoolException volumeExceptions = new AddBlockPoolException();
>   try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl,
>       bpid)) {
>     try {
>       volumes.addBlockPool(bpid, conf);
>     } catch (AddBlockPoolException e) {
>       volumeExceptions.mergeException(e);
>     }
>     volumeMap.initBlockPool(bpid);
>     Set<String> vols = storageMap.keySet();
>     for (String v : vols) {
>       lockManager.addLock(LockLevel.VOLUME, bpid, v);
>     }
>   }
> } {code}
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap#replicas
> // (called from FsDatasetImpl#deepCopyReplica); needs the read lock
> void replicas(String bpid, Consumer<Iterator<ReplicaInfo>> consumer) {
>   LightWeightResizableGSet<Block, ReplicaInfo> m = null;
>   try (AutoCloseDataSetLock l = lockManager.readLock(LockLevel.BLOCK_POOl,
>       bpid)) {
>     m = map.get(bpid);
>     if (m != null) {
>       m.getIterator(consumer);
>     }
>   }
> } {code}
>
> Because the read lock is requested from a different thread, the write lock
> cannot be downgraded to a read lock, and the threads deadlock.
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList#addBlockPool
> void addBlockPool(final String bpid, final Configuration conf)
>     throws IOException {
>   long totalStartTime = Time.monotonicNow();
>   final Map<FsVolumeSpi, IOException> unhealthyDataDirs =
>       new ConcurrentHashMap<FsVolumeSpi, IOException>();
>   List<Thread> blockPoolAddingThreads = new ArrayList<Thread>();
>   for (final FsVolumeImpl v : volumes) {
>     Thread t = new Thread() {
>       public void run() {
>         try (FsVolumeReference ref = v.obtainReference()) {
>           FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
>               " on volume " + v + "...");
>           long startTime = Time.monotonicNow();
>           v.addBlockPool(bpid, conf);
>           long timeTaken = Time.monotonicNow() - startTime;
>           FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid +
>               " on " + v + ": " + timeTaken + "ms");
>         } catch (IOException ioe) {
>           FsDatasetImpl.LOG.info("Caught exception while scanning " + v +
>               ". Will throw later.", ioe);
>           unhealthyDataDirs.put(v, ioe);
>         }
>       }
>     };
>     blockPoolAddingThreads.add(t);
>     t.start();
>   }
>   for (Thread t : blockPoolAddingThreads) {
>     try {
>       t.join();
>     } catch (InterruptedException ie) {
>       throw new IOException(ie);
>     }
>   }
> } {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]