Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20949#discussion_r197662087
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -513,6 +513,43 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
}
}
+ test("Save csv with custom charset") {
+ Seq("iso-8859-1", "utf-8", "windows-1250").foreach { encoding =>
+ withTempDir { dir =>
+ val csvDir = new File(dir, "csv").getCanonicalPath
+ // scalastyle:off
+ val originalDF = Seq("µß áâä ÃÄÅ").toDF("_c0")
+ // scalastyle:on
+ originalDF.write
+ .option("header", "false")
+ .option("encoding", encoding)
+ .csv(csvDir)
+
+ val df = spark
+ .read
+ .option("header", "false")
+ .option("encoding", encoding)
--- End diff --
Now it's fine. I think we decided to support encoding in CSV/JSON
datasources. Ignore the comment above. We can proceed separately.
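What the test above exercises is a charset round trip: a string is written out in a named encoding and must decode back to the same value. The core property can be checked without Spark at all. Below is a minimal, Spark-free sketch of that round trip using only `java.nio`; the object and method names (`CharsetRoundTrip`, `roundTrip`) are hypothetical, chosen here for illustration.

```scala
import java.nio.charset.Charset
import java.nio.file.Files

// Hypothetical helper: write `text` to a temp file in the given charset,
// read it back with the same charset, and return the decoded string.
// This mirrors the write/read symmetry the CSVSuite test asserts.
object CharsetRoundTrip {
  def roundTrip(text: String, charsetName: String): String = {
    val cs = Charset.forName(charsetName)
    val path = Files.createTempFile("csv-", ".csv")
    try {
      Files.write(path, text.getBytes(cs))
      new String(Files.readAllBytes(path), cs)
    } finally {
      Files.delete(path)
    }
  }
}
```

Note that the round trip only holds when every character is representable in the target charset; the accented Latin characters used in the test exist in all three of iso-8859-1, utf-8, and windows-1250, which is why the same input string can be reused across the `Seq(...).foreach` loop.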
---