GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20849#discussion_r175282994
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
 ---
    @@ -85,6 +85,12 @@ private[sql] class JSONOptions(
     
       val multiLine = 
parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
     
    +  /**
    +   * Standard charset name. For example UTF-8, UTF-16 and UTF-32.
    +   * If charset is not specified (None), it will be detected automatically.
    --- End diff --
    
    JSON's schema inference uses the text datasource to split the input into
lines before we go through the Jackson parser, so the charset needs to be
respected when detecting newlines at that stage. Shouldn't we rather fix the
text datasource (Hadoop's line reader) first?
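
    The concern about newlines and charsets can be illustrated with a small
JVM snippet (plain `java.nio.charset`, not Spark code): a byte-oriented line
splitter that looks for the single byte `0x0A` works for UTF-8 input but
mis-splits UTF-16, where the same newline occupies two bytes.

    ```java
    import java.nio.charset.StandardCharsets;

    public class NewlineBytes {
        public static void main(String[] args) {
            // In UTF-8, '\n' encodes as the single byte 0x0A, so splitting
            // the raw byte stream on 0x0A recovers line boundaries correctly.
            System.out.println("\n".getBytes(StandardCharsets.UTF_8).length);

            // In UTF-16LE the same newline encodes as two bytes (0x0A 0x00),
            // so a splitter that only looks for 0x0A would cut a record in
            // the middle of a code unit.
            System.out.println("\n".getBytes(StandardCharsets.UTF_16LE).length);
        }
    }
    ```

    This is why a line reader that is unaware of the configured charset cannot
simply be bypassed: the split has to happen before Jackson ever sees the bytes.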


---
