Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22975#discussion_r231977392
  
    --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
    @@ -411,7 +412,7 @@ public UTF8String toUpperCase() {
       }
     
       private UTF8String toUpperCaseSlow() {
    -    return fromString(toString().toUpperCase());
    +    return fromString(toString().toUpperCase(Locale.ROOT));
    --- End diff --
    
    I think we deliberately left this one unchanged. The point of fixing 
Locale.ROOT is to make sure that strings that aren't really user data, for 
example internal identifiers for compression types or impurity types, don't 
vary with the locale. User data, by contrast, could well be locale-dependent.
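
To illustrate the distinction, here is a minimal sketch. The class and method names are hypothetical and are not part of Spark; it only assumes the standard `String.toUpperCase(Locale)` behavior of the JDK, where Turkish casing maps `i` to the dotted capital `İ` (U+0130).

```java
import java.util.Locale;

// Hypothetical demo class, not Spark code.
public class LocaleUpperCaseDemo {

    // Internal identifiers (e.g. a compression or impurity name) are
    // upper-cased with Locale.ROOT so the result never depends on the
    // JVM's default locale.
    static String normalizeIdentifier(String id) {
        return id.toUpperCase(Locale.ROOT);
    }

    // User data may legitimately follow locale-specific casing rules,
    // so the locale is passed in explicitly here.
    static String upperCaseUserData(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Locale-independent: plain ASCII capital I.
        System.out.println(normalizeIdentifier("gini"));

        // Locale-dependent: Turkish rules produce the dotted capital I (U+0130).
        System.out.println(upperCaseUserData("gini", turkish));
    }
}
```

With `Locale.ROOT` the identifier compares equal to `"GINI"` on any machine, while the Turkish result contains U+0130 and would not match an ASCII comparison.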


---

