GitHub user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/20705#discussion_r171729865
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -147,8 +147,8 @@ def load(self, path=None, format=None, schema=None, **options):
or a DDL-formatted string (For example ``col0 INT, col1 DOUBLE``).
:param options: all other string options
- >>> df = spark.read.load('python/test_support/sql/parquet_partitioned', opt1=True,
- ... opt2=1, opt3='str')
+ >>> df = spark.read.format("parquet").load('python/test_support/sql/parquet_partitioned',
+ ... opt1=True, opt2=1, opt3='str')
--- End diff --
Unlike the other changes, this one differs somewhat from the original
semantics: the original example relied on the default data source, while the
new one names `parquet` explicitly.
As an alternative, we can keep the original `spark.read.load` and add the
following:
```python
spark.conf.set("spark.sql.sources.default", "parquet")
```
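To make the alternative concrete, here is a minimal sketch of how the two pieces fit together. The helper name `load_with_default_source` and its `source` parameter are hypothetical and only for illustration; `spark` is assumed to be an active `SparkSession`. It only relies on `spark.conf.set` and `spark.read.load`, so the docstring example could keep calling plain `load`.

```python
def load_with_default_source(spark, path, source="parquet", **options):
    """Hypothetical helper: load `path` through plain `spark.read.load`.

    `spark` is assumed to be an active pyspark.sql.SparkSession.
    """
    # Pin the session-wide default data source, so that load() without an
    # explicit format() resolves to `source`.
    spark.conf.set("spark.sql.sources.default", source)
    # No .format(...) call here: the reader falls back to the default source.
    return spark.read.load(path, **options)
```

With this, the original call shape stays the same, e.g. `load_with_default_source(spark, 'python/test_support/sql/parquet_partitioned', opt1=True, opt2=1, opt3='str')`. Note that the setting is session-wide, so it would also affect later `load` calls in the same doctest session.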