sodonnel commented on a change in pull request #2838:
URL: https://github.com/apache/hadoop/pull/2838#discussion_r604765744



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
##########
@@ -1161,10 +1176,12 @@ public Void call() throws IOException {
        */
       @Override
       public int compare(LinkArgs a, LinkArgs b) {
+        File asrc = a.src();
+        File bsrc = b.src();
         return ComparisonChain.start().
-            compare(a.src.getName(), b.src.getName()).
-            compare(a.src, b.src).
-            compare(a.dst, b.dst).
+            compare(asrc.getName(), bsrc.getName()).
+            compare(asrc, bsrc).

Review comment:
       This method already had the two compares, but your comment caused me to 
look at it more closely.
   
   What this seems to be doing is sorting by the block filename, then by the 
full src path, then by the dst.
   
   E.g., if we had these paths as src:
   
   /a/blk1
   /b/blk2
   /c/blk1
   
   The natural sort order by path is as above. However, the comparator would 
return:
   
   /a/blk1
   /c/blk1
   /b/blk2
   
   This puts duplicate blocks beside each other in the list, even though they 
live under different paths. Later, the method relies on that adjacency to 
detect the duplicate blocks and handle them.
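
   To make the grouping behaviour concrete, here is a minimal, self-contained 
sketch (using a plain JDK `Comparator` in place of Guava's `ComparisonChain`, 
and `File` directly rather than the real `LinkArgs` class) showing how sorting 
by filename first pulls the duplicate `blk1` entries together:

   ```java
   import java.io.File;
   import java.util.Arrays;
   import java.util.Comparator;
   import java.util.List;

   public class DuplicateGroupingDemo {
     public static void main(String[] args) {
       // Hypothetical source paths; blk1 appears under two different dirs.
       List<File> srcs = Arrays.asList(
           new File("/a/blk1"), new File("/b/blk2"), new File("/c/blk1"));

       // Compare by block filename first, then by full path, mirroring the
       // chained compares in the DataStorage comparator.
       srcs.sort(Comparator.comparing(File::getName)
           .thenComparing(File::getPath));

       // Prints /a/blk1, /c/blk1, /b/blk2 -- the two copies of blk1 are now
       // adjacent, so a single pass over the list can spot the duplicate.
       srcs.forEach(f -> System.out.println(f.getPath()));
     }
   }
   ```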
   
   Therefore I think we need the chained compares as they are. However, rather 
than `a.src.getName()` I could use `a.blockFile()`, which should be slightly 
more efficient.
   
   I will push another commit with that change.
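
   As a rough sketch of what that follow-up might look like (this assumes 
`LinkArgs#blockFile()` returns the block filename directly as a `Comparable`, 
so the extra `getName()` call on the src `File` goes away; the exact `LinkArgs` 
API may differ in the final commit):

   ```java
   // Hedged sketch only -- assumes blockFile() yields the block filename
   // and that dst() exists alongside src(); not the committed implementation.
   @Override
   public int compare(LinkArgs a, LinkArgs b) {
     return ComparisonChain.start().
         compare(a.blockFile(), b.blockFile()). // filename first: groups duplicates
         compare(a.src(), b.src()).             // then the full source path
         compare(a.dst(), b.dst()).             // then the destination
         result();
   }
   ```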



