Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20937#discussion_r178495813

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -237,6 +237,8 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
             :param allowUnquotedControlChars: allows JSON Strings to contain unquoted control
                                               characters (ASCII characters with value less than 32,
                                               including tab and line feed characters) or not.
    +        :param encoding: standard charset name, for example UTF-8, UTF-16 and UTF-32. If None is
    +                         set, the encoding of input json will be detected automatically.

    --- End diff --

    I would agree with you about the per-line mode, but not about the multiline mode. For example, the test checks that:
    https://github.com/MaxGekk/spark-1/blob/53834005ba22c0c3e7be883949ab01a5bf1a0b9a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala#L2148-L2156
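As a side note on the behavior the docstring describes: charset autodetection for whole-document (multiline) JSON is feasible because the byte pattern of the first characters reveals the encoding. A minimal sketch of that idea using Python's standard-library `json` module (this is an illustration of the general technique, not Spark's parser, and the sample data is made up):

```python
import json

# Hypothetical sample document; any small JSON object works.
payload = {"city": "Amsterdam", "id": 1}

for enc in ("utf-8", "utf-16", "utf-32"):
    raw = json.dumps(payload).encode(enc)
    # json.loads accepts bytes and detects UTF-8/UTF-16/UTF-32 automatically
    # from the leading byte pattern (see json.detect_encoding), which is the
    # same kind of autodetection the `encoding=None` default relies on.
    assert json.loads(raw) == payload
```

In per-line mode the situation differs, because the reader must first split the input into lines, and the line separator itself is encoding-dependent; that is why an explicit `encoding` can matter there.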