[ https://issues.apache.org/jira/browse/HDFS-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598172#comment-14598172 ]
Colin Patrick McCabe commented on HDFS-8646:
--------------------------------------------
Thanks, [~andrew.wang]. Looks great overall.
{code}
@@ -903,8 +907,21 @@ public void setCachedLocations(LocatedBlock block) {
       return;
     }
     List<DatanodeDescriptor> datanodes = cachedBlock.getDatanodes(Type.CACHED);
+    Set<DatanodeInfo> locations =
+        new HashSet<>(Arrays.asList(block.getLocations()));
     for (DatanodeDescriptor datanode : datanodes) {
-      block.addCachedLoc(datanode);
+      // Filter out cached blocks that do not have a backing replica.
+      //
+      // This should not happen since it means the CacheManager thinks
+      // something is cached that does not exist, but it's a safety
+      // measure.
+      if (locations.contains(datanode)) {
+        block.addCachedLoc(datanode);
+      } else {
+        LOG.warn("Datanode {} is not a valid cache location for block {} "
+            + "because that node does not have a backing replica!",
+            datanode, block.getBlock().getBlockName());
+      }
     }
{code}
Creating a new {{HashSet}} here feels very heavyweight. Why not simply have a
loop like this:
{code}
for all DN locations:
  if DN location is in the cached list:
    block.addCachedLoc(...)
{code}
This iterates over all DN locations (just like your hash table solution), but
it doesn't allocate a new set on every call, so it creates far less garbage in
memory. The cached locations list is going to be very short on average (1 to 3
elements), so a linear scan of it should perform well.
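
For illustration, here is a minimal, self-contained sketch of the loop suggested above. It is not Hadoop code: {{CachedLocationFilter}} and {{filterCachedLocations}} are hypothetical names, and plain strings stand in for the {{DatanodeInfo}} and {{DatanodeDescriptor}} objects. The real change would call {{block.addCachedLoc(...)}} where this sketch appends to a result list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the filtering step; not part of HDFS. */
final class CachedLocationFilter {

  /**
   * Returns the replica locations that also appear in the cached list.
   * Node IDs are plain strings here (an assumption made for this sketch).
   */
  static List<String> filterCachedLocations(String[] locations,
                                            List<String> cached) {
    List<String> result = new ArrayList<>();
    for (String location : locations) {
      // Linear scan of the cached list. No per-call HashSet is built;
      // the scan is cheap as long as the cached list stays short.
      if (cached.contains(location)) {
        result.add(location);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String[] locations = {"dn1", "dn2", "dn3"};
    // "dn4" is cached but has no backing replica, so it is dropped.
    List<String> cached = Arrays.asList("dn2", "dn4");
    System.out.println(filterCachedLocations(locations, cached)); // prints [dn2]
  }
}
```

Note that, like the pseudocode, this sketch silently skips cached entries that have no backing replica; it does not emit the warning that the patch logs in that case.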
> Prune cached replicas from DatanodeDescriptor state on replica invalidation
> ---------------------------------------------------------------------------
>
> Key: HDFS-8646
> URL: https://issues.apache.org/jira/browse/HDFS-8646
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching
> Affects Versions: 2.3.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: hdfs-8646.001.patch
>
>
> Currently we remove blocks from the DD's CachedBlockLists on node failure and
> on cache report, but not on replica invalidation. This can lead to an invalid
> situation where we return a LocatedBlock with cached locations that are not
> backed by an on-disk replica.