Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21247#discussion_r194117346
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
 ---
    @@ -138,3 +121,40 @@ private[sql] class JSONOptions(
         factory.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, 
allowUnquotedControlChars)
       }
     }
    +
    +private[sql] class JSONOptionsInRead(
    +    @transient override val parameters: CaseInsensitiveMap[String],
    +    defaultTimeZoneId: String,
    +    defaultColumnNameOfCorruptRecord: String)
    +  extends JSONOptions(parameters, defaultTimeZoneId, 
defaultColumnNameOfCorruptRecord) {
    +
    +  def this(
    +    parameters: Map[String, String],
    +    defaultTimeZoneId: String,
    +    defaultColumnNameOfCorruptRecord: String = "") = {
    +    this(
    +      CaseInsensitiveMap(parameters),
    +      defaultTimeZoneId,
    +      defaultColumnNameOfCorruptRecord)
    +  }
    +
    +  protected override def checkedEncoding(enc: String): String = {
    +    // The following encodings are not supported in per-line mode 
(multiline is false)
    +    // because they cause some problems in reading files with BOM which is 
supposed to
    +    // present in the files with such encodings. After splitting input 
files by lines,
    +    // only the first lines will have the BOM which leads to impossibility 
for reading
    +    // the rest lines. Besides of that, the lineSep option must have the 
BOM in such
    +    // encodings which can never present between lines.
    +    val blacklist = Seq(Charset.forName("UTF-16"), 
Charset.forName("UTF-32"))
    +    val isBlacklisted = blacklist.contains(Charset.forName(enc))
    +    require(multiLine || !isBlacklisted,
    --- End diff --
    
    So I could be missing something, but it seems like we might allow folks to 
write data they can't read back in the same mode as write, would it make sense 
to have an equivalent checkedEncoding on the write side that just logs a 
warning for folks? I could have also misunderstood.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to