GitHub user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/7197#discussion_r33841830
--- Diff: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -201,7 +234,7 @@ public int compare(final UTF8String other) {
   @Override
   public boolean equals(final Object other) {
     if (other instanceof UTF8String) {
-      return Arrays.equals(bytes, ((UTF8String) other).getBytes());
+      return compareTo((UTF8String) other) == 0;
--- End diff --
I suspect that string equality comparisons will be both frequent and
expensive, so this might be a case where it is worthwhile to use a fast byte
array equality method (see my suggestion upthread about taking the byte
comparison loop in `matches` and factoring it out into a static method in
`ByteArrayMethods`). It might also be faster to express this as a check that
the two strings have the same length and that one string matches at the start
of the other.
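
The second suggestion could look roughly like the sketch below. It is only
an illustration under assumptions: the class `ByteArrayEqualitySketch` and
the methods `matchAt` and `bytesEqual` are hypothetical names, not the actual
`UTF8String` or `ByteArrayMethods` code, and the loop compares one byte at a
time on plain `byte[]` arrays rather than using any word-at-a-time or
off-heap access.

```java
// Hypothetical helper, not part of Spark: shows "same length + prefix match".
final class ByteArrayEqualitySketch {
  private ByteArrayEqualitySketch() {}

  /** Returns true if every byte of `prefix` appears in `data` starting at `pos`. */
  static boolean matchAt(byte[] data, byte[] prefix, int pos) {
    // Reject positions where the prefix would run past the end of `data`.
    if (pos < 0 || prefix.length + pos > data.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (data[pos + i] != prefix[i]) {
        return false;  // stop at the first mismatching byte
      }
    }
    return true;
  }

  /** Equality as: identical lengths, and one array matches at the start of the other. */
  static boolean bytesEqual(byte[] a, byte[] b) {
    if (a == b) {
      return true;   // same reference (also covers both null)
    }
    if (a == null || b == null) {
      return false;
    }
    // The cheap length check runs first, so unequal-length strings never enter the loop.
    return a.length == b.length && matchAt(a, b, 0);
  }
}
```

With a helper of this shape, `equals` would only need to compare for equality
and stop at the first mismatch, instead of computing an ordering through
`compareTo`. Whether that is measurably faster here would need benchmarking.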