srowen commented on a change in pull request #26650: [CORE] Fix a bug in getBlockHosts
URL: https://github.com/apache/spark/pull/26650#discussion_r349927646
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/PartitionedFileUtil.scala
 ##########
 @@ -69,8 +69,8 @@ object PartitionedFileUtil {
         b.getHosts -> (b.getOffset + b.getLength - offset).min(length)
 
       // The fragment ends at a position within this block
-      case b if offset <= b.getOffset && offset + length < b.getLength =>
-        b.getHosts -> (offset + length - b.getOffset).min(length)
+      case b if offset <= b.getOffset && offset + length < b.getOffset + b.getLength =>
 
 Review comment:
   This change looks correct.
   
   I think there's another issue. This case checks where the end of the 
fragment (the `offset` and `length` arguments) falls, so it should test where 
`offset + length` is relative to b. Shouldn't the first condition be 
`offset + length >= b.getOffset`? As written, it also matches a fragment that 
doesn't overlap b at all: imagine the fragment ends well before b.getOffset. 
   
   The size computed here could be negative when the fragment ends before b 
starts. I think that's masked by the fact that the candidates are filtered for 
size > 0 below, so I guess the logic ends up correct, in that these blocks are 
not considered. It might be worth adjusting for clarity, as it took me a few 
minutes to reason about why this was different.
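
   To make the negative case concrete, here is a minimal sketch. `Block` is a 
hypothetical stand-in for Hadoop's `BlockLocation`, and the size expression is 
assumed to be the one on the removed line, since the new case body isn't shown 
in this hunk:

```scala
object NegativeSizeDemo {
  // Hypothetical stand-in for BlockLocation; only offset and length matter here.
  final case class Block(offset: Long, length: Long)

  // Guard from the "+" line of the diff. The size expression is an assumption,
  // copied from the removed line because the hunk does not show the new body.
  def endsWithin(b: Block, offset: Long, length: Long): Option[Long] =
    if (offset <= b.offset && offset + length < b.offset + b.length)
      Some((offset + length - b.offset).min(length))
    else
      None

  // The fragment [0, 10) ends long before the block [100, 200) starts,
  // yet the guard still matches.
  val size: Option[Long] = endsWithin(Block(100L, 100L), 0L, 10L) // Some(-90)
}
```

   The guard accepts the fragment even though it ends at byte 10 and b starts 
at byte 100, so the computed size is -90 and only the later `size > 0` filter 
discards it.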
   
   Anyway, in this case the fragment isn't fully contained in b (that is 
actually handled by the case above, via `.min(length)`, so the comment there 
might be updated to say so). So it's true that the `.min()` in this case is 
not needed.
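
   A rough sketch of how the cases could read with the suggested guard. This 
is not the actual patch: `Block` and `overlap` are hypothetical names, hosts 
are omitted, and the "fully contains" and "no intersection" branches are 
assumptions added to make the example complete:

```scala
object BlockOverlapSketch {
  // Hypothetical stand-in for BlockLocation (hosts omitted).
  final case class Block(offset: Long, length: Long)

  // Bytes of the fragment [offset, offset + length) that fall inside block b.
  // Assumes length >= 0.
  def overlap(b: Block, offset: Long, length: Long): Long = {
    val blockEnd = b.offset + b.length
    val fragmentEnd = offset + length
    if (b.offset <= offset && offset < blockEnd) {
      // The fragment starts within this block; .min(length) covers the
      // case where the fragment is fully contained in the block.
      (blockEnd - offset).min(length)
    } else if (fragmentEnd >= b.offset && fragmentEnd < blockEnd) {
      // The fragment ends within this block. The first branch did not match,
      // so offset < b.offset and the result is already below length:
      // no .min(length) is needed.
      fragmentEnd - b.offset
    } else if (offset <= b.offset && blockEnd <= fragmentEnd) {
      // The fragment fully contains this block.
      b.length
    } else {
      // The fragment does not intersect this block.
      0L
    }
  }
}
```

   With this guard a fragment that ends before b starts falls through to the 
last branch and yields 0 rather than a negative size, so the result no longer 
relies on the `size > 0` filter to be correct.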

