GitHub user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20796#discussion_r175215287
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1279,4 +1280,57 @@ class CSVSuite extends QueryTe
GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20796#discussion_r174355311
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1279,4 +1279,22 @@ class CSVSuite extends Query
GitHub user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20796#discussion_r174293523
--- Diff: sql/core/src/test/resources/test-data/utf8xFF.csv ---
@@ -0,0 +1,3 @@
+channel,code
+United,123
+ABGUN�,456
--- End diff --
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/20796
[SPARK-23649][SQL] Prevent crashes on schema inferring of CSV containing wrong UTF-8 chars
## What changes were proposed in this pull request?
The mapping of UTF-8 char's first byte to cha