[
https://issues.apache.org/jira/browse/HDFS-15963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoqiao He reassigned HDFS-15963:
----------------------------------
Assignee: Shuyan Zhang
> Unreleased volume references cause an infinite loop
> ---------------------------------------------------
>
> Key: HDFS-15963
> URL: https://issues.apache.org/jira/browse/HDFS-15963
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Shuyan Zhang
> Assignee: Shuyan Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS-15963.001.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> When BlockSender throws an exception because the metadata cannot be found,
> the volume reference obtained by the thread is not released. The thread trying
> to remove the volume then waits for the reference count to reach zero, which
> never happens, so it falls into an infinite loop.
> {code:java}
> boolean checkVolumesRemoved() {
>   Iterator<FsVolumeImpl> it = volumesBeingRemoved.iterator();
>   while (it.hasNext()) {
>     FsVolumeImpl volume = it.next();
>     if (!volume.checkClosed()) {
>       return false;
>     }
>     it.remove();
>   }
>   return true;
> }
>
> boolean checkClosed() {
>   // Once a reference has been leaked, this condition is always true.
>   if (this.reference.getReferenceCount() > 0) {
>     FsDatasetImpl.LOG.debug("The reference count for {} is {}, wait to be 0.",
>         this, reference.getReferenceCount());
>     return false;
>   }
>   return true;
> }
> {code}
> At the same time, because the thread holds checkDirsLock while it is
> removing the volume, other threads trying to acquire the same lock are
> blocked permanently.
> Similar problems also occur in RamDiskAsyncLazyPersistService and
> FsDatasetAsyncDiskService.
> This patch releases the three previously unreleased volume references.
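
The leak described above can be illustrated with a minimal, self-contained
sketch. It is not the HDFS code or the attached patch: the Volume class,
openMeta, leakyRead and safeRead are hypothetical names made up for this
example, and the reference is modelled as a plain java.io.Closeable on the
assumption that releasing it decrements a counter. It only shows why a
reference taken before a failing metadata lookup must be released on the
exception path as well.

```java
import java.io.Closeable;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class VolumeReferenceSketch {

    /** Hypothetical stand-in for a volume that counts outstanding references. */
    static class Volume {
        private final AtomicInteger refCount = new AtomicInteger();

        /** Takes a reference; closing the returned handle gives it back. */
        Closeable obtainReference() {
            refCount.incrementAndGet();
            return () -> refCount.decrementAndGet();
        }

        /** Mirrors the idea of checkClosed(): true only when nothing is held. */
        boolean checkClosed() {
            return refCount.get() == 0;
        }
    }

    /** Simulates a metadata lookup that fails when the meta file is missing. */
    static void openMeta(boolean metaExists) throws IOException {
        if (!metaExists) {
            throw new FileNotFoundException("meta file not found");
        }
    }

    /** Leaky variant: the reference is released only on the success path. */
    static void leakyRead(Volume volume, boolean metaExists) throws IOException {
        Closeable ref = volume.obtainReference();
        openMeta(metaExists); // an exception here skips the close() below
        ref.close();
    }

    /** Safe variant: try-with-resources releases the reference on every path. */
    static void safeRead(Volume volume, boolean metaExists) throws IOException {
        try (Closeable ref = volume.obtainReference()) {
            openMeta(metaExists);
        }
    }

    public static void main(String[] args) {
        Volume leaky = new Volume();
        try {
            leakyRead(leaky, false);
        } catch (IOException expected) {
            // the reference taken in leakyRead is never given back
        }
        System.out.println("leaky closed: " + leaky.checkClosed()); // false

        Volume safe = new Volume();
        try {
            safeRead(safe, false);
        } catch (IOException expected) {
            // the reference was already released by try-with-resources
        }
        System.out.println("safe closed: " + safe.checkClosed()); // true
    }
}
```

In the leaky variant a caller polling checkClosed() would never see true,
which corresponds to the infinite wait in checkVolumesRemoved() quoted above.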
--
This message was sent by Atlassian Jira
(v8.3.4#803005)