Github user phegstrom commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22227#discussion_r214926165
  
    --- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
    @@ -394,12 +394,14 @@ public void substringSQL() {
     
       @Test
       public void split() {
    -    
assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), -1),
    -      new UTF8String[]{fromString("ab"), fromString("def"), 
fromString("ghi")}));
    -    
assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), 2),
    -      new UTF8String[]{fromString("ab"), fromString("def,ghi")}));
    -    
assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), 2),
    -      new UTF8String[]{fromString("ab"), fromString("def,ghi")}));
    +    UTF8String[] negativeAndZeroLimitCase =
    +            new UTF8String[]{fromString("ab"), fromString("def"), 
fromString("ghi"), fromString("")};
    +    
assertTrue(Arrays.equals(fromString("ab,def,ghi,").split(fromString(","), 0),
    +            negativeAndZeroLimitCase));
    +    
assertTrue(Arrays.equals(fromString("ab,def,ghi,").split(fromString(","), -1),
    --- End diff --
    
    @HyukjinKwon the last two were duplicates:
    ```
          new UTF8String[]{fromString("ab"), fromString("def,ghi")}));
        
assertTrue(Arrays.equals(fromString("ab,def,ghi").split(fromString(","), 2),
          new UTF8String[]{fromString("ab"), fromString("def,ghi")}));
    ```
    
    And I also thought it better to include the case where you do get an empty 
string (adding one more instance of the regex at the end). Want me to revert? 
My view is it's more exhaustive of the expected behavior, and also easier to 
see that limit = -1 should behave exactly like limit = 0.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to