Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19222#discussion_r170819172

    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
    @@ -195,15 +205,15 @@ private static int numBytesForFirstByte(final byte b) {
        * Returns the number of bytes
        */
       public int numBytes() {
    -    return numBytes;
    +    return (int)base.size();
    --- End diff --

    Ah, now I see the point of having `UTF8String.numBytes`. `MemoryBlock.size` is a long and here we need an int, and `numBytes()` is called many times, so performance matters.
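To make the trade-off concrete, here is a minimal sketch of the two variants being compared. The classes `Block`, `CastingString`, and `CachedString` are hypothetical stand-ins invented for illustration; they are not Spark's `MemoryBlock` or `UTF8String`, and the use of `Math.toIntExact` for a one-time checked conversion is an assumption of this sketch, not something the PR does.

```java
// Hypothetical stand-in for a memory block whose size is reported as a long.
// This is NOT Spark's MemoryBlock; it only mirrors the long-typed size.
final class Block {
    private final byte[] data;

    Block(byte[] data) {
        this.data = data;
    }

    long size() {
        return data.length;
    }
}

// Variant A: narrow the long to an int on every call (as in the "+" line of the diff).
final class CastingString {
    private final Block base;

    CastingString(Block base) {
        this.base = base;
    }

    int numBytes() {
        return (int) base.size(); // unchecked narrowing, repeated on each call
    }
}

// Variant B: narrow once at construction and cache the int (as in the "-" line).
final class CachedString {
    private final Block base;
    private final int numBytes;

    CachedString(Block base) {
        this.base = base;
        // Math.toIntExact throws ArithmeticException if the value does not fit in an int.
        this.numBytes = Math.toIntExact(base.size());
    }

    int numBytes() {
        return numBytes; // plain field read on the hot path
    }
}

final class NumBytesDemo {
    public static void main(String[] args) {
        Block block = new Block(new byte[] {104, 105});
        System.out.println(new CastingString(block).numBytes());
        System.out.println(new CachedString(block).numBytes());
    }
}
```

Both variants return the same value for sizes that fit in an int. Whether the cached field is measurably faster depends on JIT inlining and would need a benchmark to confirm.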