[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r636575493



##
File path: python/pyspark/sql/streaming.py
##
@@ -507,102 +479,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 schema : :class:`pyspark.sql.types.StructType` or str, optional
     an optional :class:`pyspark.sql.types.StructType` for the input schema
     or a DDL-formatted string (For example ``col0 INT, col1 DOUBLE``).
-primitivesAsString : str or bool, optional
-    infers all primitive values as a string type. If None is set,
-    it uses the default value, ``false``.
-prefersDecimal : str or bool, optional
-    infers all floating-point values as a decimal type. If the values
-    do not fit in decimal, then it infers them as doubles. If None is
-    set, it uses the default value, ``false``.
-allowComments : str or bool, optional
-    ignores Java/C++ style comment in JSON records. If None is set,
-    it uses the default value, ``false``.
-allowUnquotedFieldNames : str or bool, optional
-    allows unquoted JSON field names. If None is set,
-    it uses the default value, ``false``.
-allowSingleQuotes : str or bool, optional
-    allows single quotes in addition to double quotes. If None is
-    set, it uses the default value, ``true``.
-allowNumericLeadingZero : str or bool, optional
-    allows leading zeros in numbers (e.g. 00012). If None is
-    set, it uses the default value, ``false``.
-allowBackslashEscapingAnyCharacter : str or bool, optional
-    allows accepting quoting of all character
-    using backslash quoting mechanism. If None is
-    set, it uses the default value, ``false``.
-mode : str, optional
-    allows a mode for dealing with corrupt records during parsing. If None is
-    set, it uses the default value, ``PERMISSIVE``.
-
-    * ``PERMISSIVE``: when it meets a corrupted record, puts the malformed string \
-      into a field configured by ``columnNameOfCorruptRecord``, and sets malformed \
-      fields to ``null``. To keep corrupt records, a user can set a string type \
-      field named ``columnNameOfCorruptRecord`` in a user-defined schema. If a \
-      schema does not have the field, it drops corrupt records during parsing. \
-      When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
-      field in an output schema.
-    * ``DROPMALFORMED``: ignores the whole corrupted records.
-    * ``FAILFAST``: throws an exception when it meets corrupted records.
-
-columnNameOfCorruptRecord : str, optional
-    allows renaming the new field having malformed string
-    created by ``PERMISSIVE`` mode. This overrides
-    ``spark.sql.columnNameOfCorruptRecord``. If None is set,
-    it uses the value specified in
-    ``spark.sql.columnNameOfCorruptRecord``.
-dateFormat : str, optional
-    sets the string that indicates a date format. Custom date formats
-    follow the formats at
-    `datetime pattern <https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html>`_.  # noqa
-    This applies to date type. If None is set, it uses the
-    default value, ``yyyy-MM-dd``.
-timestampFormat : str, optional
-    sets the string that indicates a timestamp format.
-    Custom date formats follow the formats at
-    `datetime pattern <https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html>`_.  # noqa
-    This applies to timestamp type. If None is set, it uses the
-    default value, ``yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]``.
-multiLine : str or bool, optional
-    parse one record, which may span multiple lines, per file. If None is
-    set, it uses the default value, ``false``.
-allowUnquotedControlChars : str or bool, optional
-    allows JSON Strings to contain unquoted control
-    characters (ASCII characters with value less than 32,
-    including tab and line feed characters) or not.
-lineSep : str, optional
-    defines the line separator that should be used for parsing. If None is
-    set, it covers all ``\\r``, ``\\r\\n`` and ``\\n``.
-locale : str, optional
-    sets a locale as language tag in IETF BCP 47 format. If None is set,
-    it uses the default value, ``en-US``. For instance, ``locale`` is used while
-    parsing dates and timestamps.
-dropFieldIfAllNull : str or bool, optional
-    whether to ignore column of all null values or empty array/struct
-    during schema inference. If None is set, it uses the default value,
-    ``false``.
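A minimal sketch of how the ``mode`` and ``columnNameOfCorruptRecord`` options described in the docstring above are used in practice; the path, schema, and option values here are illustrative, not taken from the PR:

```
# Hypothetical example: keeping corrupt records in PERMISSIVE mode by
# declaring a string field with the name set via columnNameOfCorruptRecord.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("col0", IntegerType()),
    StructField("col1", StringType()),
    StructField("_corrupt_record", StringType()),  # catches malformed rows
])

df = (spark.read
      .schema(schema)
      .option("mode", "PERMISSIVE")  # the default; malformed fields become null
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .json("/tmp/example.json"))  # illustrative path
```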


[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r636008760



##
File path: docs/sql-data-sources-json.md
##
@@ -94,3 +94,171 @@ SELECT * FROM jsonTable
 
 
 
+
+## Data Source Option
+
+Data source options of JSON can be set via:
+* the `.option`/`.options` methods of `DataFrameReader` or `DataFrameWriter`

Review comment:
   > yes, or we can itemize them:
   > the `.option`/`.options` methods of
   > 
   > * DataFrameReader
   > * DataFrameWriter
   > * ...
   
   Sounds good!!
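   A small sketch of the two call styles being documented here, with made-up paths and option values:

```
# Hypothetical batch example of the .option/.options methods.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# DataFrameReader: set options one at a time ...
df = spark.read.option("multiLine", "true").json("/tmp/in.json")

# ... or several at once with .options.
df = spark.read.options(multiLine="true", allowComments="true").json("/tmp/in.json")

# DataFrameWriter accepts options the same way.
df.write.option("compression", "gzip").json("/tmp/out.json")
```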







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r636007949



##
File path: docs/sql-data-sources-json.md
##
@@ -94,3 +94,171 @@ SELECT * FROM jsonTable
 
 
 
+
+## Data Source Option
+
+Data source options of JSON can be set via:
+* the `.option`/`.options` methods of `DataFrameReader` or `DataFrameWriter`
+* the `.option`/`.options` methods of `DataStreamReader` or `DataStreamWriter`
+
+

Review comment:
   Sure, I'll add it. Thanks!
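   For the streaming half of that list, a hedged sketch with illustrative paths (streaming JSON sources require an explicit schema):

```
# Hypothetical streaming example of .option on DataStreamReader/DataStreamWriter.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.getOrCreate()
schema = StructType([StructField("value", StringType())])

stream_df = (spark.readStream
             .schema(schema)                  # required for streaming JSON
             .option("maxFilesPerTrigger", 1)
             .json("/tmp/stream-in"))

query = (stream_df.writeStream
         .format("json")
         .option("path", "/tmp/stream-out")
         .option("checkpointLocation", "/tmp/checkpoint")
         .start())
```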







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r635906645



##
File path: docs/sql-data-sources-json.md
##
@@ -94,3 +94,171 @@ SELECT * FROM jsonTable
 
 
 
+
+## Data Source Option
+
+Data source options of JSON can be set via:
+* the `.option`/`.options` methods of `DataFrameReader` or `DataFrameWriter`

Review comment:
   Thanks for the review, @gengliangwang .
   
   Do you mean combining lines 101 and 102?
   
   such as
   
   ```
   the `.option`/`.options` methods of `DataFrameReader` or `DataFrameWriter` or `DataStreamReader` or `DataStreamWriter`
   ```
   
   ???







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-13 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r632177319



##
File path: python/pyspark/sql/streaming.py
##
@@ -504,105 +504,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 path : str
 string represents path to the JSON dataset,
 or RDD of Strings storing JSON objects.
-schema : :class:`pyspark.sql.types.StructType` or str, optional

Review comment:
   ditto. I documented this in the "Data Source Options" table on the JSON Files page and removed it here.
   
   ![Screen Shot 2021-05-14 at 9 18 52 AM](https://user-images.githubusercontent.com/44108233/118202716-76220580-b495-11eb-9342-06da779c8098.png)







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-13 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r632177276



##
File path: python/pyspark/sql/readwriter.py
##
@@ -1196,39 +1097,13 @@ def json(self, path, mode=None, compression=None, dateFormat=None, timestampFormat=None,
 ----------
 path : str
 the path in any Hadoop supported file system
-mode : str, optional

Review comment:
   yeah, so I documented this in the "Data Source Options" table on the JSON Files page and removed it here.
   
   ![Screen Shot 2021-05-14 at 9 16 57 AM](https://user-images.githubusercontent.com/44108233/118202555-2ba08900-b495-11eb-9c65-c7ffec03cf03.png)







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-13 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r632177319



##
File path: python/pyspark/sql/streaming.py
##
@@ -504,105 +504,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 path : str
 string represents path to the JSON dataset,
 or RDD of Strings storing JSON objects.
-schema : :class:`pyspark.sql.types.StructType` or str, optional

Review comment:
   ditto. So I documented this in the JSON data source options table and removed it here.







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-13 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r632177276



##
File path: python/pyspark/sql/readwriter.py
##
@@ -1196,39 +1097,13 @@ def json(self, path, mode=None, compression=None, dateFormat=None, timestampFormat=None,
 ----------
 path : str
 the path in any Hadoop supported file system
-mode : str, optional

Review comment:
   yeah, so I documented this in the JSON data source options table and removed it here.

##
File path: python/pyspark/sql/streaming.py
##
@@ -504,105 +504,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 path : str
 string represents path to the JSON dataset,
 or RDD of Strings storing JSON objects.
-schema : :class:`pyspark.sql.types.StructType` or str, optional

Review comment:
   ditto







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-12 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r631553255



##
File path: python/pyspark/sql/streaming.py
##
@@ -504,105 +504,13 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 path : str
 string represents path to the JSON dataset,
 or RDD of Strings storing JSON objects.
-schema : :class:`pyspark.sql.types.StructType` or str, optional

Review comment:
   I added it to the Data Source Options table!
   
   ![](https://user-images.githubusercontent.com/44108233/118077601-62bc5f00-b3ef-11eb-9350-c62b370e167c.png)
   







[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-11 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r630662846



##
File path: python/pyspark/sql/readwriter.py
##
@@ -233,114 +233,13 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
 path : str, list or :class:`RDD`
     string represents path to the JSON dataset, or a list of paths,
     or RDD of Strings storing JSON objects.
-schema : :class:`pyspark.sql.types.StructType` or str, optional
-    an optional :class:`pyspark.sql.types.StructType` for the input schema or
-    a DDL-formatted string (For example ``col0 INT, col1 DOUBLE``).
-primitivesAsString : str or bool, optional
-    infers all primitive values as a string type. If None is set,
-    it uses the default value, ``false``.
-prefersDecimal : str or bool, optional
-    infers all floating-point values as a decimal type. If the values
-    do not fit in decimal, then it infers them as doubles. If None is
-    set, it uses the default value, ``false``.
-allowComments : str or bool, optional
-    ignores Java/C++ style comment in JSON records. If None is set,
-    it uses the default value, ``false``.
-allowUnquotedFieldNames : str or bool, optional
-    allows unquoted JSON field names. If None is set,
-    it uses the default value, ``false``.
-allowSingleQuotes : str or bool, optional
-    allows single quotes in addition to double quotes. If None is
-    set, it uses the default value, ``true``.
-allowNumericLeadingZero : str or bool, optional
-    allows leading zeros in numbers (e.g. 00012). If None is
-    set, it uses the default value, ``false``.
-allowBackslashEscapingAnyCharacter : str or bool, optional
-    allows accepting quoting of all character
-    using backslash quoting mechanism. If None is
-    set, it uses the default value, ``false``.
-mode : str, optional
-    allows a mode for dealing with corrupt records during parsing. If None is
-    set, it uses the default value, ``PERMISSIVE``.
-
-    * ``PERMISSIVE``: when it meets a corrupted record, puts the malformed string \
-      into a field configured by ``columnNameOfCorruptRecord``, and sets malformed \
-      fields to ``null``. To keep corrupt records, a user can set a string type \
-      field named ``columnNameOfCorruptRecord`` in a user-defined schema. If a \
-      schema does not have the field, it drops corrupt records during parsing. \
-      When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
-      field in an output schema.
-    * ``DROPMALFORMED``: ignores the whole corrupted records.
-    * ``FAILFAST``: throws an exception when it meets corrupted records.
-
-columnNameOfCorruptRecord : str, optional
-    allows renaming the new field having malformed string
-    created by ``PERMISSIVE`` mode. This overrides
-    ``spark.sql.columnNameOfCorruptRecord``. If None is set,
-    it uses the value specified in
-    ``spark.sql.columnNameOfCorruptRecord``.
-dateFormat : str, optional
-    sets the string that indicates a date format. Custom date formats
-    follow the formats at
-    `datetime pattern <https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html>`_.  # noqa
-    This applies to date type. If None is set, it uses the
-    default value, ``yyyy-MM-dd``.
-timestampFormat : str, optional
-    sets the string that indicates a timestamp format.
-    Custom date formats follow the formats at
-    `datetime pattern <https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html>`_.  # noqa
-    This applies to timestamp type. If None is set, it uses the
-    default value, ``yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]``.
-multiLine : str or bool, optional
-    parse one record, which may span multiple lines, per file. If None is
-    set, it uses the default value, ``false``.
-allowUnquotedControlChars : str or bool, optional
-    allows JSON Strings to contain unquoted control
-    characters (ASCII characters with value less than 32,
-    including tab and line feed characters) or not.
-encoding : str or bool, optional
-    allows to forcibly set one of standard basic or extended encoding for
-    the JSON files. For example UTF-16BE, UTF-32LE. If None is set,
-    the encoding of input JSON will be detected automatically
-    when the multiLine option is set to ``true``.
-lineSep : str, optional
-    defines the line separator that should be used for parsing. If None is
-    set, it covers all ``\\r``, ``\\r\\n`` and ``\\n``.
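A short sketch (made-up file and patterns) combining several of the options in the removed docstring above, such as ``multiLine``, ``encoding``, and the date/timestamp formats:

```
# Hypothetical reader exercising a few of the JSON options listed above.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = (spark.read
      .option("multiLine", "true")             # records may span multiple lines
      .option("encoding", "UTF-16BE")          # skip automatic encoding detection
      .option("dateFormat", "yyyy/MM/dd")      # custom date pattern
      .option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss")
      .json("/tmp/multiline.json"))            # illustrative path
```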

[GitHub] [spark] itholic commented on a change in pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-10 Thread GitBox


itholic commented on a change in pull request #32204:
URL: https://github.com/apache/spark/pull/32204#discussion_r629756911



##
File path: python/pyspark/sql/readwriter.py
##
@@ -209,14 +209,7 @@ def load(self, path=None, format=None, schema=None, **options):
     else:
         return self._df(self._jreader.load())
 
-def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
-         allowComments=None, allowUnquotedFieldNames=None, allowSingleQuotes=None,
-         allowNumericLeadingZero=None, allowBackslashEscapingAnyCharacter=None,
-         mode=None, columnNameOfCorruptRecord=None, dateFormat=None, timestampFormat=None,
-         multiLine=None, allowUnquotedControlChars=None, lineSep=None, samplingRatio=None,
-         dropFieldIfAllNull=None, encoding=None, locale=None, pathGlobFilter=None,
-         recursiveFileLookup=None, allowNonNumericNumbers=None,
-         modifiedBefore=None, modifiedAfter=None):
+def json(self, path):

Review comment:
   Just reverted the changes. Thanks, @HyukjinKwon!
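   The restored keyword parameters mean both of the call styles below keep working; a hedged sketch with an illustrative path:

```
# With the original signature kept, options can be passed as keyword
# arguments or through .option; the two calls below are equivalent.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df1 = spark.read.json("/tmp/in.json", multiLine=True, allowComments=True)
df2 = (spark.read
       .option("multiLine", "true")
       .option("allowComments", "true")
       .json("/tmp/in.json"))
```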




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org