[ https://issues.apache.org/jira/browse/HDFS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiaxin Li updated HDFS-4186:
----------------------------
Description:
As pointed out in HDFS-4183, when the lease monitor calls
internalReleaseLease(), it acquires the namespace write lock. Inside
internalReleaseLease(), if a block recovery is needed, the lease is reassigned
to the namenode itself and this is logged & synced in logReassignLease().
Since this is done while the write lock is held, log syncing is blocked. When a
large number of leases expire and blocks are recovered, the namenode can slow
down.
was:
As pointed out in HDFS-4138, when the lease monitor calls
internalReleaseLease(), it acquires the namespace write lock. Inside
internalReleaseLease(), if a block recovery is needed, the lease is reassigned
to the namenode itself and this is logged & synced in logReassignLease().
Since this is done while the write lock is held, log syncing is blocked. When a
large number of leases are expired and blocks are recovered, namenode can slow
down.
> logSync() is called with the write lock held while releasing lease
> ------------------------------------------------------------------
>
> Key: HDFS-4186
> URL: https://issues.apache.org/jira/browse/HDFS-4186
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 0.23.4, 2.0.2-alpha
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.5, 0.23.6
>
> Attachments:
> hdfs-4186-branch-0.23-inaccurate-batched-sync-count.patch,
> hdfs-4186-trunk-skip-standbyException.patch, hdfs-4186-trunk.patch
>
>
> As pointed out in HDFS-4183, when the lease monitor calls
> internalReleaseLease(), it acquires the namespace write lock. Inside
> internalReleaseLease(), if a block recovery is needed, the lease is
> reassigned to the namenode itself and this is logged & synced in
> logReassignLease().
> Since this is done while the write lock is held, log syncing is blocked. When
> a large number of leases expire and blocks are recovered, the namenode can
> slow down.
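
To make the locking pattern described above concrete, here is a minimal,
self-contained sketch. It is not HDFS code: the class LeaseSyncSketch, the
lock nsLock, and the logEdit()/logSync() stand-ins are all hypothetical names
chosen for illustration. It contrasts syncing while the write lock is held
with one common alternative, buffering the edit under the lock and syncing
after the lock is released. Whether the attached patches take that approach is
not stated in this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LeaseSyncSketch {
    // Hypothetical stand-in for the namespace lock.
    private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock();

    // Records, for each sync, whether the write lock was held at that moment.
    public final List<Boolean> lockHeldDuringSync = new ArrayList<>();

    // Hypothetical stand-in for buffering an edit-log record in memory (cheap).
    private void logEdit(String op) {
        // A real implementation would append to an in-memory edit buffer.
    }

    // Hypothetical stand-in for flushing buffered edits to storage (slow).
    private void logSync() {
        lockHeldDuringSync.add(nsLock.isWriteLockedByCurrentThread());
    }

    // The pattern the issue describes: the sync runs inside the write lock,
    // so the lock stays held for the whole duration of the flush.
    public void releaseLeaseSyncUnderLock(String holder) {
        nsLock.writeLock().lock();
        try {
            logEdit("reassign lease from " + holder);
            logSync();
        } finally {
            nsLock.writeLock().unlock();
        }
    }

    // One possible alternative (an assumption, not the confirmed fix):
    // buffer the edit under the lock, release the lock, then sync.
    public void releaseLeaseSyncAfterUnlock(String holder) {
        nsLock.writeLock().lock();
        try {
            logEdit("reassign lease from " + holder);
        } finally {
            nsLock.writeLock().unlock();
        }
        logSync();
    }

    public static void main(String[] args) {
        LeaseSyncSketch sketch = new LeaseSyncSketch();
        sketch.releaseLeaseSyncUnderLock("client-1");
        sketch.releaseLeaseSyncAfterUnlock("client-2");
        System.out.println(sketch.lockHeldDuringSync); // prints [true, false]
    }
}
```

In the first method every sync extends the time the write lock is held, which
is why many expired leases in a row could add up; in the second, the lock is
held only for the in-memory update.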