Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/20937#discussion_r179954099
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
---
@@ -86,14 +85,34 @@ private[sql] class JSONOptions(
val multiLine = parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
- val lineSeparator: Option[String] = parameters.get("lineSep").map { sep =>
- require(sep.nonEmpty, "'lineSep' cannot be an empty string.")
- sep
+ /**
+ * A string between two consecutive JSON records.
+ */
+ val lineSeparator: Option[String] = parameters.get("lineSep")
+
+ /**
+ * Standard encoding (charset) name. For example UTF-8, UTF-16LE and UTF-32BE.
+ * If the encoding is not specified (None), it will be detected automatically.
+ */
+ val encoding: Option[String] = parameters.get("encoding")
+ .orElse(parameters.get("charset")).map { enc =>
+ val blacklist = List("UTF16", "UTF32")
--- End diff --
I hesitated between `List` and `Set`. `Set` fits because order is not
important here; `List` fits because `blacklist` is a commonly used name, and
`List` seems more appropriate for a value that has the word `list` in its
name. I don't see any reason to use `Seq`, but if you believe it is important
to have `Seq` instead of `List` here, I will replace it.
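
To make the trade-off concrete, here is a minimal sketch of the three options. The diff above is cut off right after `blacklist` is declared, so everything below is an assumption for illustration rather than the code in this PR: the object name `BlacklistSketch`, the `normalize` helper, and the way the lookup is done are all hypothetical. It only shows that `contains` gives the same answer for `List`, `Set` and `Seq` on a two-element collection.

```scala
import java.util.Locale

object BlacklistSketch {
  // The same two names from the diff, held in each candidate collection type.
  val asList: List[String] = List("UTF16", "UTF32")
  val asSet: Set[String] = Set("UTF16", "UTF32")
  val asSeq: Seq[String] = Seq("UTF16", "UTF32")

  // Hypothetical normalization (not taken from the PR): upper-case the name
  // and drop '-' and '_' so that "utf-16" and "UTF_16" both become "UTF16".
  def normalize(enc: String): String =
    enc.toUpperCase(Locale.ROOT).replaceAll("[-_]", "")

  // Membership checks; all three collection types provide `contains`.
  def inList(enc: String): Boolean = asList.contains(normalize(enc))
  def inSet(enc: String): Boolean = asSet.contains(normalize(enc))
  def inSeq(enc: String): Boolean = asSeq.contains(normalize(enc))
}
```

With only two elements the lookup cost is the same in practice, so the choice is mostly about what the declared type tells the reader: `Set` says that order and duplicates do not matter, while `List` and `Seq` say nothing about either.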