Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22227#discussion_r214562493
--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -952,6 +952,11 @@ public static UTF8String concatWs(UTF8String separator, UTF8String... inputs) {
}
public UTF8String[] split(UTF8String pattern, int limit) {
+ // Java String's split method supports "ignore empty string" behavior when the limit is 0.
+ // To avoid this, we fall back to -1 when the limit is 0.
--- End diff --
I would also leave a short justification for this here, given
https://github.com/apache/spark/pull/22227#issuecomment-417471241
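For reference, a minimal sketch of the `java.lang.String#split(String, int)` behavior the code comment refers to: with a limit of 0, trailing empty strings are dropped, whereas a negative limit keeps them. The class `SplitLimitDemo` and the helper `splitKeepingEmpties` are hypothetical names used only for illustration; this is not the `UTF8String` code from the PR.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper mirroring the fallback described in the diff:
    // a limit of 0 is replaced with -1 so trailing empty strings are kept.
    static String[] splitKeepingEmpties(String input, String regex, int limit) {
        if (limit == 0) {
            limit = -1;
        }
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,,";

        // limit == 0: String.split discards trailing empty strings.
        System.out.println(Arrays.toString(input.split(",", 0)));   // [a, b]

        // limit < 0: the pattern is applied as often as possible and
        // trailing empty strings are kept.
        System.out.println(Arrays.toString(input.split(",", -1)));  // [a, b, , ]

        // With the fallback, a caller passing 0 gets the -1 behavior.
        System.out.println(Arrays.toString(splitKeepingEmpties(input, ",", 0))); // [a, b, , ]
    }
}
```

The point of the sketch is only that 0 and -1 differ in how trailing empty strings are treated, which is why a one-line justification next to the fallback seems useful.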