Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19180#discussion_r138453195
  
    --- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
    @@ -1097,8 +1101,21 @@ public UTF8String copy() {
       @Override
       public int compareTo(@Nonnull final UTF8String other) {
         int len = Math.min(numBytes, other.numBytes);
    -    // TODO: compare 8 bytes as unsigned long
    -    for (int i = 0; i < len; i ++) {
    +    int words = len / Longs.BYTES;
    +    long roffset = other.getBaseOffset();
    +    Object rbase = other.getBaseObject();
    +    for (int i = 0; i < words * Longs.BYTES; i += Longs.BYTES) {
    +      long left = getLong(base, offset + i);
    +      long right = getLong(rbase, roffset + i);
    +      if (left != right) {
    +        if (!IS_LITTLE_ENDIAN) {
    --- End diff --
    
    (Nit: flip the condition to avoid the negation)
    Is the point that if JIT thinks the value can't change then either way it 
can snip out the branch that will never happen? then yes checking the condition 
in or outside the loop doesn't matter. If that not always true, then I'm pretty 
clear that checking the condition inside this hot loop is slower.
    
    My instinct would be to check it outside the loop just in case. However, I 
also note that it only matters once in the execution of the method, because as 
soon as `left != right`, the loop terminates. I think it's OK this way, 
therefore.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to