GitHub user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19222#discussion_r171822360
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/hash/Murmur3_x86_32.java ---
    @@ -52,13 +53,31 @@ public int hashUnsafeWords(Object base, long offset, int lengthInBytes) {
         return hashUnsafeWords(base, offset, lengthInBytes, seed);
       }
     
    +  public static int hashUnsafeWordsBlock(MemoryBlock base, long offset, int lengthInBytes, int seed) {
    +    return hashUnsafeWords(base.getBaseObject(), offset, lengthInBytes, seed);
    +  }
    +
       public static int hashUnsafeWords(Object base, long offset, int lengthInBytes, int seed) {
         // This is based on Guava's `Murmur32_Hasher.processRemaining(ByteBuffer)` method.
         assert (lengthInBytes % 8 == 0): "lengthInBytes must be a multiple of 8 (word-aligned)";
         int h1 = hashBytesByInt(base, offset, lengthInBytes, seed);
         return fmix(h1, lengthInBytes);
       }
     
    +  public static int hashUnsafeBytesBlock(MemoryBlock base, int seed) {
    +    long offset = base.getBaseOffset();
    --- End diff --
    
    I am considering changing `hashUnsafeBytes()` to call `hashUnsafeBytesBlock()`,
    which would also reduce the duplication. Does that make sense?
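
A minimal sketch of what that delegation could look like, under stated assumptions: `Murmur3Sketch` and `Block` are hypothetical stand-ins (a plain `byte[]` region instead of `MemoryBlock` and unsafe memory access), and the hashing body is a simplified Murmur3-style loop written only for illustration, not the code in this PR. The point is that the hashing logic lives once in the block-based method, and the legacy signature only wraps its arguments and forwards.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for illustration; not Spark's Murmur3_x86_32.
final class Murmur3Sketch {

  // Hypothetical stand-in for MemoryBlock: a byte[] region with an offset and a size.
  static final class Block {
    final byte[] base;
    final int offset;
    final int size;

    Block(byte[] base, int offset, int size) {
      this.base = base;
      this.offset = offset;
      this.size = size;
    }
  }

  // Legacy-style entry point: wraps its arguments in a Block and delegates,
  // so the hashing loop is not duplicated here.
  static int hashUnsafeBytes(byte[] base, int offset, int lengthInBytes, int seed) {
    return hashUnsafeBytesBlock(new Block(base, offset, lengthInBytes), seed);
  }

  // Single home for the hashing logic.
  static int hashUnsafeBytesBlock(Block block, int seed) {
    ByteBuffer buf = ByteBuffer.wrap(block.base, block.offset, block.size)
        .order(ByteOrder.LITTLE_ENDIAN);
    int lengthAligned = block.size - block.size % 4;
    int h1 = seed;
    // Mix the 4-byte-aligned prefix one int at a time.
    for (int i = 0; i < lengthAligned; i += 4) {
      h1 = mixH1(h1, mixK1(buf.getInt()));
    }
    // Mix any trailing bytes one at a time (a simplification for this sketch).
    while (buf.hasRemaining()) {
      h1 = mixH1(h1, mixK1(buf.get()));
    }
    return fmix(h1, block.size);
  }

  private static int mixK1(int k1) {
    k1 *= 0xcc9e2d51;
    k1 = Integer.rotateLeft(k1, 15);
    k1 *= 0x1b873593;
    return k1;
  }

  private static int mixH1(int h1, int k1) {
    h1 ^= k1;
    h1 = Integer.rotateLeft(h1, 13);
    return h1 * 5 + 0xe6546b64;
  }

  // Final avalanche step: folds in the length, then scrambles the bits.
  private static int fmix(int h1, int length) {
    h1 ^= length;
    h1 ^= h1 >>> 16;
    h1 *= 0x85ebca6b;
    h1 ^= h1 >>> 13;
    h1 *= 0xc2b2ae35;
    h1 ^= h1 >>> 16;
    return h1;
  }
}
```

With this shape, any fix to the hashing loop only needs to be made in `hashUnsafeBytesBlock()`. The cost is one small wrapper allocation per call on the legacy path, which may or may not matter for hot call sites.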

