[
https://issues.apache.org/jira/browse/HDFS-16261?focusedWorklogId=670840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670840
]
ASF GitHub Bot logged work on HDFS-16261:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Oct/21 15:56
Start Date: 27/Oct/21 15:56
Worklog Time Spent: 10m
Work Description: bbeaudreault opened a new pull request #3594:
URL: https://github.com/apache/hadoop/pull/3594
### Description of PR
See https://issues.apache.org/jira/browse/HDFS-16261 for more details.
The most straightforward way to introduce a grace period for replaced blocks
is in the InvalidateBlocks data structure. Work is periodically pulled from
this class and distributed to DataNodes. The class is also exposed via JMX
metrics, so one can easily see how many blocks are currently pending deletion.
I implemented the grace period by adding a new `pollNWithFilter` method to
`LightWeightHashSet`. Blocks are added to the LightWeightHashSet with a
calculated `readyForDeleteAt` time. When `getBlocksToInvalidateByLimit` is
called, a filter is passed which includes only those blocks whose
`readyForDeleteAt` time has passed.
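As a rough illustration, here is a minimal, self-contained sketch of the
poll-with-filter idea. It is not the actual `LightWeightHashSet` change; apart
from `pollNWithFilter` and `readyForDeleteAt`, which are named above, the
class, field, and helper names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Predicate;

// Illustrative stand-in for the real set; not Hadoop's LightWeightHashSet.
class GracePeriodSet<T> {
  // Pairs an element with the earliest time it may be deleted.
  static final class Entry<T> {
    final T element;
    final long readyForDeleteAtMs;
    Entry(T element, long readyForDeleteAtMs) {
      this.element = element;
      this.readyForDeleteAtMs = readyForDeleteAtMs;
    }
  }

  private final LinkedHashSet<Entry<T>> entries = new LinkedHashSet<>();

  // Queue an element that becomes eligible for deletion after gracePeriodMs.
  void add(T element, long gracePeriodMs) {
    entries.add(new Entry<>(element, System.currentTimeMillis() + gracePeriodMs));
  }

  // Remove and return up to n elements matching the filter; entries still
  // inside their grace period are left in place for a later pass.
  List<T> pollNWithFilter(int n, Predicate<Entry<T>> filter) {
    List<T> polled = new ArrayList<>(n);
    Iterator<Entry<T>> it = entries.iterator();
    while (it.hasNext() && polled.size() < n) {
      Entry<T> e = it.next();
      if (filter.test(e)) {
        it.remove();
        polled.add(e.element);
      }
    }
    return polled;
  }
}
```

A caller in the spirit of `getBlocksToInvalidateByLimit` would pass a filter
such as `e -> e.readyForDeleteAtMs <= System.currentTimeMillis()`, so blocks
still inside their grace period simply stay queued.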
The default grace period for blocks added to InvalidateBlocks is 0, which
effectively makes all blocks immediately ready for deletion, matching existing
behavior. When a grace period is configured, it applies only to blocks deleted
through the `delHintNode` passed in RECEIVED_BLOCK messages. This minimizes the
impact of this feature on blocks deleted for other reasons (e.g. when a file is
deleted or through other ongoing NameNode auditing).
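A rough sketch of how the grace period might be chosen at enqueue time follows.
The configuration key and method names are hypothetical; only the default of 0
and the `delHintNode` condition come from the description above.

```java
// Hypothetical helper deciding which grace period an invalidation gets.
class InvalidationGracePolicy {
  // Hypothetical configuration key and default; not an actual HDFS property.
  static final String GRACE_PERIOD_KEY =
      "dfs.namenode.invalidate.replaced-block.grace.period.ms";
  static final long GRACE_PERIOD_DEFAULT_MS = 0L;

  private final long gracePeriodMs;

  InvalidationGracePolicy(long configuredGracePeriodMs) {
    this.gracePeriodMs = configuredGracePeriodMs;
  }

  // Blocks invalidated because they were replaced (signalled by a delHintNode
  // on a RECEIVED_BLOCK report) get the configured grace period; invalidations
  // for other reasons (file deletes, other NameNode housekeeping) stay immediate.
  long gracePeriodFor(boolean replacedViaDelHint) {
    return replacedViaDelHint ? gracePeriodMs : 0L;
  }
}
```

With the default of 0, every invalidation remains immediately eligible, so the
existing behavior is unchanged unless a grace period is explicitly configured.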
### How was this patch tested?
I've added tests for the relevant pieces, and have also been running this on
two clusters internally.
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
Issue Time Tracking
-------------------
Worklog Id: (was: 670840)
Remaining Estimate: 0h
Time Spent: 10m
> Configurable grace period around invalidation of replaced blocks
> ----------------------------------------------------------------
>
> Key: HDFS-16261
> URL: https://issues.apache.org/jira/browse/HDFS-16261
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a block is moved with REPLACE_BLOCK, the new location is recorded in the
> NameNode and the NameNode instructs the old host to invalidate the block
> using DNA_INVALIDATE. As it stands today, this invalidation is async but
> tends to happen relatively quickly.
> I'm working on a feature for HBase which enables efficient healing of
> locality through Balancer-style low-level block moves (HBASE-26250). One
> issue is that HBase tends to keep long-running DFSInputStreams open, and
> moving blocks out from under them causes many warnings in the RegionServer
> and increases long-tail latencies due to the necessary retries in the
> DFSClient.
> One way I'd like to fix this is to provide a configurable grace period on
> async invalidations. This would give the DFSClient enough time to refresh
> block locations before hitting any errors.