[ https://issues.apache.org/jira/browse/HDFS-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598172#comment-14598172 ]

Colin Patrick McCabe commented on HDFS-8646:
--------------------------------------------

Thanks, [~andrew.wang].  Looks great overall.

{code}
@@ -903,8 +907,21 @@ public void setCachedLocations(LocatedBlock block) {
       return;
     }
     List<DatanodeDescriptor> datanodes = cachedBlock.getDatanodes(Type.CACHED);
+    Set<DatanodeInfo> locations =
+        new HashSet<>(Arrays.asList(block.getLocations()));
     for (DatanodeDescriptor datanode : datanodes) {
-      block.addCachedLoc(datanode);
+      // Filter out cached blocks that do not have a backing replica.
+      //
+      // This should not happen since it means the CacheManager thinks
+      // something is cached that does not exist, but it's a safety
+      // measure.
+      if (locations.contains(datanode)) {
+        block.addCachedLoc(datanode);
+      } else {
+        LOG.warn("Datanode {} is not a valid cache location for block {} "
+            + "because that node does not have a backing replica!",
+            datanode, block.getBlock().getBlockName());
+      }
     }
{code}
Creating a new {{HashSet}} here feels very heavyweight.  Why not simply have a 
loop like this:
{code}
for all DN locations:
  if DN location is in the cached list:
    block.addCachedLoc(...)
{code}
This iterates over all DN locations (just like your hash table solution), but 
it doesn't create a lot of garbage in memory.  The cached locations list is 
going to be very short on average (1 to 3 elements), so the perf should be good.
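A self-contained sketch of the linear-scan idea (plain strings stand in for DatanodeInfo, and {{filterCachedLocs}} is a hypothetical stand-in for the addCachedLoc loop, so all names here are illustrative, not the actual HDFS API):
{code}
import java.util.ArrayList;
import java.util.List;

public class CachedLocScan {
  // Collects the locations that also appear in the cached list,
  // mimicking the block.addCachedLoc(...) calls in the patch.
  static List<String> filterCachedLocs(List<String> locations,
                                       List<String> cachedDatanodes) {
    List<String> cachedLocs = new ArrayList<>();
    for (String location : locations) {
      // Linear membership check: no HashSet allocation. The cached list
      // is short (1 to 3 elements on average), so this stays cheap.
      if (cachedDatanodes.contains(location)) {
        cachedLocs.add(location);
      }
    }
    return cachedLocs;
  }

  public static void main(String[] args) {
    List<String> locations = List.of("dn1", "dn2", "dn3");
    // "dn4" is cached but has no backing replica, so it is filtered out.
    List<String> cached = List.of("dn2", "dn4");
    System.out.println(filterCachedLocs(locations, cached)); // prints [dn2]
  }
}
{code}
The point of the sketch: membership in a 1-to-3-element list is cheaper to test by scanning than by building a throwaway {{HashSet}} on every call.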

> Prune cached replicas from DatanodeDescriptor state on replica invalidation
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8646
>                 URL: https://issues.apache.org/jira/browse/HDFS-8646
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: caching
>    Affects Versions: 2.3.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-8646.001.patch
>
>
> Currently we remove blocks from the DD's CachedBlockLists on node failure and 
> on cache report, but not on replica invalidation. This can lead to an invalid 
> situation where we return a LocatedBlock with cached locations that are not 
> backed by an on-disk replica.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
