[GitHub] [spark] HyukjinKwon commented on a change in pull request #29765: [SPARK-32888][DOCS] Add user document about header flag and RDD as path for reading CSV

GitBox Tue, 15 Sep 2020 17:05:49 -0700


HyukjinKwon commented on a change in pull request #29765:
URL: https://github.com/apache/spark/pull/29765#discussion_r489084356




##########
File path: python/pyspark/sql/readwriter.py
##########
@@ -429,7 +429,8 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
         :param comment: sets a single character used for skipping lines beginning with this
                         character. By default (None), it is disabled.
         :param header: uses the first line as names of columns. If None is set, it uses the
-                       default value, ``false``.
+                       default value, ``false``. Note that if the given path is a RDD of Strings,
+                       this header option will remove all lines same with the header if exists.

Review comment:
       Here too. I would use a `.. note::` directive under this parameter.
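
For illustration, a minimal sketch of what a `.. note::` directive under a `:param:` field could look like. The helper `drop_header_lines` is hypothetical and not part of PySpark. It only mimics, on a plain Python list, the behaviour the proposed note describes (every line equal to the header is removed), and the note wording is an assumption rather than the text that was eventually merged.

```python
def drop_header_lines(lines, header=False):
    """Hypothetical helper mimicking the documented behaviour on a list of CSV lines.

    :param lines: list of CSV lines (stands in for an RDD of Strings).
    :param header: uses the first line as names of columns. If None is
                   set, it uses the default value, ``false``.

                   .. note:: if the input is a collection of strings rather
                       than a file path, every line equal to the header
                       line is removed, not only the first one.
    """
    # None and False both mean "no header", matching the documented default.
    if not header or not lines:
        return None, list(lines)
    first = lines[0]
    # Drop every line identical to the header, as the proposed note describes.
    return first.split(","), [line for line in lines if line != first]


columns, rows = drop_header_lines(["a,b", "1,2", "a,b", "3,4"], header=True)
print(columns, rows)  # ['a', 'b'] ['1,2', '3,4']
```

Placing the directive inside the parameter's field body keeps the caveat attached to `header` in the rendered documentation instead of burying it in the running description.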

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##########
@@ -600,6 +600,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * If the enforceSchema is set to `false`, only the CSV header in the first line is checked
    * to conform specified or inferred schema.
    *
+   * Note that if `header` option is set to `true` when calling this API, all lines same with

Review comment:
       I would just use `@note` here.




