[
https://issues.apache.org/jira/browse/HDFS-16379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039267#comment-18039267
]
ASF GitHub Bot commented on HDFS-16379:
---------------------------------------
github-actions[bot] commented on PR #3787:
URL: https://github.com/apache/hadoop/pull/3787#issuecomment-3550016799
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> Reset fullBlockReportLeaseId after any exceptions
> -------------------------------------------------
>
> Key: HDFS-16379
> URL: https://issues.apache.org/jira/browse/HDFS-16379
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Recently we encountered full block report (FBR) problems in our production
> environment, which were solved by applying HDFS-12914 and HDFS-14314.
> But the following situation can still occur:
> 1. The DN gets a *fullBlockReportLeaseId* via heartbeat.
> 2. The DN triggers a blockReport, but an exception occurs (this may be rare,
> but it can happen), and the DN then retries multiple times *without resetting*
> *fullBlockReportLeaseId*, because fullBlockReportLeaseId is currently reset
> only when the report succeeds.
> 3. After a while the exception clears, but the fullBlockReportLeaseId has
> expired. *Since the NN does not throw an exception after the lease expires,
> the DN considers the blockReport successful.* So the blockReport is not
> actually processed this time and has to wait until the next one.
> Therefore, *should we consider resetting the fullBlockReportLeaseId in the
> finally block*? The advantage is that lease expiration can be avoided. The
> downside is that, while the exception persists, each heartbeat will request
> a new fullBlockReportLeaseId, but I think this cost is negligible.
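
The scenario and the proposed fix quoted above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the actual HDFS DataNode or NameNode code: FakeNameNode, FakeDataNode, LEASE_TTL_TICKS and run are hypothetical names, time is a logical counter advanced once per heartbeat, and the NameNode is assumed to ignore a report with an expired lease without throwing, as the description states.

```java
import java.io.IOException;

final class LeaseResetSketch {

    /** Hypothetical toy NameNode: a lease expires a fixed number of heartbeats after issue. */
    static final class FakeNameNode {
        static final int LEASE_TTL_TICKS = 2; // assumed lifetime, in heartbeats
        int now = 0;                          // logical clock
        int failuresLeft;                     // transient RPC failures still to inject
        long nextLeaseId = 1;
        long activeLeaseId = 0;
        int leaseIssuedAt = 0;
        int reportsProcessed = 0;

        FakeNameNode(int transientFailures) {
            this.failuresLeft = transientFailures;
        }

        /** Advances the clock; returns a fresh lease id if one was requested, else 0. */
        long heartbeat(boolean wantsLease) {
            now++;
            if (!wantsLease) {
                return 0;
            }
            activeLeaseId = nextLeaseId++;
            leaseIssuedAt = now;
            return activeLeaseId;
        }

        /** Processes the report only if the lease is still valid. */
        void blockReport(long leaseId) throws IOException {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IOException("transient failure");
            }
            boolean valid = leaseId == activeLeaseId
                    && now - leaseIssuedAt < LEASE_TTL_TICKS;
            if (valid) {
                reportsProcessed++;
            }
            // Assumption from the description: an expired lease is ignored silently.
        }
    }

    /** Hypothetical toy DataNode with a switch for the proposed behavior. */
    static final class FakeDataNode {
        final FakeNameNode nn;
        final boolean resetInFinally;
        long fullBlockReportLeaseId = 0; // 0 means "no lease held"
        boolean reportPending = true;

        FakeDataNode(FakeNameNode nn, boolean resetInFinally) {
            this.nn = nn;
            this.resetInFinally = resetInFinally;
        }

        void tick() {
            // Ask for a lease only when a report is pending and none is held.
            long granted = nn.heartbeat(reportPending && fullBlockReportLeaseId == 0);
            if (granted != 0) {
                fullBlockReportLeaseId = granted;
            }
            if (!reportPending) {
                return;
            }
            try {
                nn.blockReport(fullBlockReportLeaseId);
                reportPending = false;          // the DN believes the report succeeded
                fullBlockReportLeaseId = 0;     // existing behavior: reset on success only
            } catch (IOException e) {
                // Retry on the next tick.
            } finally {
                if (resetInFinally) {
                    fullBlockReportLeaseId = 0; // proposed: always drop the lease
                }
            }
        }
    }

    /** Runs the simulation and returns how many reports the NameNode processed. */
    static int run(boolean resetInFinally, int transientFailures, int ticks) {
        FakeNameNode nn = new FakeNameNode(transientFailures);
        FakeDataNode dn = new FakeDataNode(nn, resetInFinally);
        for (int i = 0; i < ticks; i++) {
            dn.tick();
        }
        return nn.reportsProcessed;
    }

    public static void main(String[] args) {
        System.out.println("reset on success only: " + run(false, 3, 6));
        System.out.println("reset in finally:      " + run(true, 3, 6));
    }
}
```

With three injected failures and a two-heartbeat lease lifetime, the first run retries with the same lease until it has expired, so the NameNode processes nothing while the DataNode stops retrying. The second run requests a fresh lease on every heartbeat during the failure window, which is the extra cost the description mentions, and the report goes through once the failure clears.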
--
This message was sent by Atlassian Jira
(v8.20.10#820010)