[GitHub] [spark] soxofaan opened a new pull request, #38399: [SPARK-40922][PYTHON] document multiple path support in `pyspark.pandas.read_csv`

GitBox Wed, 26 Oct 2022 06:59:24 -0700


soxofaan opened a new pull request, #38399:
URL: https://github.com/apache/spark/pull/38399


   ### What changes were proposed in this pull request?
   
   As discussed in https://issues.apache.org/jira/browse/SPARK-40922:
   
   > The path argument of `pyspark.pandas.read_csv(path, ...)` currently has type annotation `str` and is documented as
   >
   >       path : str
   >           The path string storing the CSV file to be read.
   >
   > The implementation, however, uses `pyspark.sql.DataFrameReader.csv(path, ...)`, which does support multiple paths:
   >
   >        path : str or list
   >            string, or list of strings, for input path(s),
   >            or RDD of Strings storing CSV rows.
   >
   
   This PR updates the type annotation and documentation of the `path` argument of `pyspark.pandas.read_csv` accordingly.
   
   ### Why are the changes needed?
   
   Loading multiple CSV files at once is a useful feature and should be documented.
   
   ### Does this PR introduce _any_ user-facing change?
   It documents an existing feature.
   
   ### How was this patch tested?
   No need for tests (so far): only type annotations and docblocks were changed.
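
For illustration, here is a minimal sketch of the behaviour being documented: passing a list of paths to `pyspark.pandas.read_csv`. The helper names `write_sample_csvs` and `read_csvs`, the file names, and the sample data are hypothetical and not part of this PR. The `Union[str, List[str]]` annotation is an assumption that mirrors the "str or list" wording quoted above, not necessarily the exact annotation in the patch.

```python
import csv
import tempfile
from pathlib import Path
from typing import List, Union


def write_sample_csvs(directory: Path) -> List[str]:
    """Write two small CSV files sharing one header; return their paths."""
    rows_per_file = {
        "part1.csv": [("a", 1), ("b", 2)],
        "part2.csv": [("c", 3)],
    }
    paths = []
    for name, rows in rows_per_file.items():
        target = directory / name
        with target.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["key", "value"])
            writer.writerows(rows)
        paths.append(str(target))
    return paths


def read_csvs(path: Union[str, List[str]]):
    """Read one CSV file, or several at once, into a pandas-on-Spark DataFrame."""
    # Imported lazily so the helper above works without a Spark installation.
    import pyspark.pandas as ps

    # A single string and a list of strings are both passed through unchanged.
    return ps.read_csv(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        csv_paths = write_sample_csvs(Path(tmp))
        combined = read_csvs(csv_paths)  # rows from both files in one DataFrame
        print(combined.head())
```

Running the script needs a working PySpark installation. The files share the same header so that they load into a single consistent schema.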


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

