Josh Rosen created SPARK-3103:
---------------------------------

             Summary: Fix UTF8 encoding in PySpark saveAsTextFile().
                 Key: SPARK-3103
                 URL: https://issues.apache.org/jira/browse/SPARK-3103
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.0.2, 1.1.0
            Reporter: Josh Rosen


This is a follow-up JIRA for https://github.com/apache/spark/pull/1914, where 
Ahir and Davies identified a bug in Python JsonRDD when trying to encode 
non-ASCII unicode strings as UTF-8.

The same underlying issue affects saveAsTextFile, so we should apply the same 
fix there, too, and search for any other code that needs to be updated (and 
maybe refactor this out into a utility function).
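
As a rough illustration of the utility-function idea, here is a minimal 
sketch in Python 3 syntax. It is not the patch from the pull request; the 
names to_utf8_bytes and encode_records are hypothetical, and passing bytes 
through unchanged is an assumption about the desired behavior.

```python
def to_utf8_bytes(value):
    """Hypothetical helper: return `value` as UTF-8 encoded bytes."""
    if isinstance(value, bytes):
        # Assumption: bytes are already encoded, so pass them through.
        return value
    if not isinstance(value, str):
        # Non-string records (ints, tuples, ...) are stringified first.
        value = str(value)
    return value.encode("utf-8")


def encode_records(iterator):
    """Encode every record of a partition before it is written as text."""
    for record in iterator:
        yield to_utf8_bytes(record)


if __name__ == "__main__":
    # Mixed input: non-ASCII text, a non-string value, and raw bytes.
    for line in encode_records(["café", 42, b"raw"]):
        print(line)
```

A single helper like this could, in principle, be shared by saveAsTextFile 
and any other code path that writes strings out, so the encoding rule lives 
in one place.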



--
This message was sent by Atlassian JIRA
(v6.2#6252)
