GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20937#discussion_r178427139
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala ---
    @@ -86,14 +86,30 @@ private[sql] class JSONOptions(
     
       val multiLine = parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
     
    -  val lineSeparator: Option[String] = parameters.get("lineSep").map { sep =>
    -    require(sep.nonEmpty, "'lineSep' cannot be an empty string.")
    -    sep
    +  /**
    +   * A sequence of bytes between two consecutive json records.
    +   */
    +  val lineSeparator: Option[String] = parameters.get("lineSep")
    +
    +  /**
    +   * Standard charset name. For example UTF-8, UTF-16 and UTF-32.
    +   * If charset is not specified (None), it will be detected automatically.
    +   */
    +  val charset: Option[String] = parameters.get("charset")
    +    .orElse(parameters.get("encoding")).map { cs =>
    +      if (multiLine == false && cs != "UTF-8" && lineSeparator.isEmpty) {
    +        throw new IllegalArgumentException(
    +          s"""Please, set the 'lineSep' option for the given charset $cs.
    +             |Example: .option("lineSep", "|^|")
    +             |Note: lineSep can be detected automatically for UTF-8 only.""".stripMargin
    +        )
    --- End diff --
    
    Let's move this closing parenthesis up and inline it at the end of the
    previous line, if you are also fine with that (just because that's the
    style I have seen more frequently).
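
For illustration, the check could be laid out roughly as follows, with the closing parenthesis of the `IllegalArgumentException` call placed right after `.stripMargin`. This is only a standalone sketch and has not been compiled against the PR: `CharsetCheck` and `validate` are hypothetical names introduced here so the snippet runs on its own, and `!multiLine` is used in place of `multiLine == false` purely as a matter of taste.

```scala
// Hypothetical standalone sketch; CharsetCheck and validate are not part of the PR.
object CharsetCheck {
  // Returns the charset name unchanged, or throws if a non-UTF-8 charset
  // is used in per-line mode without an explicit line separator.
  def validate(cs: String, multiLine: Boolean, lineSeparator: Option[String]): String = {
    if (!multiLine && cs != "UTF-8" && lineSeparator.isEmpty) {
      throw new IllegalArgumentException(
        s"""Please, set the 'lineSep' option for the given charset $cs.
           |Example: .option("lineSep", "|^|")
           |Note: lineSep can be detected automatically for UTF-8 only.""".stripMargin)
    }
    cs
  }
}
```

The only change of substance relative to the diff is where the closing `)` sits; the condition and the message text are the same.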

