HyukjinKwon commented on a change in pull request #29765:
URL: https://github.com/apache/spark/pull/29765#discussion_r489084356
##########
File path: python/pyspark/sql/readwriter.py
##########
@@ -429,7 +429,8 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
:param comment: sets a single character used for skipping lines beginning with this
character. By default (None), it is disabled.
:param header: uses the first line as names of columns. If None is set, it uses the
- default value, ``false``.
+ default value, ``false``. Note that if the given path is a RDD of Strings,
+ this header option will remove all lines same with the header if exists.
Review comment:
Here too. I would use a `.. note::` directive under this parameter.
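For illustration, here is a minimal sketch of how the parameter description might read with a Sphinx `.. note::` directive. The function is a hypothetical, trimmed-down stand-in (not the real `DataFrameReader.csv` signature), and the note wording is only one possible phrasing of the sentence in the diff above.

```python
def csv(path, header=None):
    """Hypothetical stand-in used only to show the docstring layout.

    :param path: string, list of strings, or RDD of Strings storing CSV rows.
    :param header: uses the first line as names of columns. If None is set, it uses the
                   default value, ``false``.

                   .. note:: if the given path is an RDD of Strings, this header
                       option removes all lines identical to the header, if one exists.
    """
    # Assumption: the body is irrelevant here; only the docstring matters.
    raise NotImplementedError("documentation sketch only")
```

Moving the caveat into a directive keeps the one-line parameter summary short while letting Sphinx render the caveat as a highlighted note.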
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##########
@@ -600,6 +600,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
* If the enforceSchema is set to `false`, only the CSV header in the first line is checked
* to conform specified or inferred schema.
*
+ * Note that if `header` option is set to `true` when calling this API, all lines same with
Review comment:
I would just use `@note` here.
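To make the behaviour the note describes concrete, here is a plain-Python model of it. This is only a sketch, not Spark's implementation: `drop_header_lines` is a hypothetical helper, and it assumes the header is simply the first line of the input.

```python
def drop_header_lines(lines, header=True):
    """Model the documented behaviour: with header enabled, every line
    identical to the first line is dropped, not just the first one."""
    lines = list(lines)
    if not header or not lines:
        # Header handling disabled (or no input): keep every line as data.
        return None, lines
    first = lines[0]
    # Filter out all lines equal to the header, wherever they occur.
    return first, [line for line in lines if line != first]


rows = ["name,age", "alice,30", "name,age", "bob,25"]
columns, data = drop_header_lines(rows)
print(columns)  # name,age
print(data)     # ['alice,30', 'bob,25']
```

The repeated `name,age` line in the middle of the input disappears along with the first one, which is the surprise the proposed note is meant to flag.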