peter-gergely-horvath commented on a change in pull request #23742: [SPARK-26835][DOCS] Document configuration properties of Spark SQL Generic Load/Save Functions
URL: https://github.com/apache/spark/pull/23742#discussion_r254992672
########## File path: docs/sql-data-sources-load-save-functions.md ##########

@@ -41,6 +41,11 @@
 name (i.e., `org.apache.spark.sql.parquet`), but for built-in sources you can also use their short
 names (`json`, `parquet`, `jdbc`, `orc`, `libsvm`, `csv`, `text`). DataFrames loaded from any data
 source type can be converted into other types using this syntax.

+For built-in sources, the available extra options are documented in the API documentation,

Review comment:
   @HyukjinKwon can you please summarise what "_There are more places like `org.apache.spark.sql.DataStreamReader`_" means: where are those properties documented? I think it would be important to capture at least the places one should be looking at, not just a vague reference such as "please see the API docs for the details".

   If you put yourself in the shoes of a new Spark developer who has no understanding of the internal classes, configuring data loading/saving properly becomes a hard task.

   Regarding `the method corresponding to the format`: I believe avro and kafka are not built-in by default, are they? So that statement is correct, isn't it?
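   To make this concrete, here is a minimal sketch of the kind of configuration being discussed (Scala; it creates a local `SparkSession` for illustration, uses the sample `people.csv` shipped under `examples/src/main/resources/` in the Spark distribution, and the option names shown are the ones documented on the format-specific methods of `org.apache.spark.sql.DataFrameReader` / `DataFrameWriter` for the built-in csv and parquet sources):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative local session; in spark-shell a `spark` session already exists.
val spark = SparkSession.builder()
  .appName("load-save-options-sketch")
  .master("local[*]")
  .getOrCreate()

// Reader-side extra options are passed via .option(...); for built-in sources
// they are documented on the format-specific methods of DataFrameReader
// (and on DataStreamReader for Structured Streaming).
val people = spark.read
  .format("csv")
  .option("sep", ";")              // the bundled people.csv is semicolon-separated
  .option("header", "true")        // treat the first line as a header
  .option("inferSchema", "true")   // infer column types from the data
  .load("examples/src/main/resources/people.csv")

// Writer-side options are documented on DataFrameWriter.
people.write
  .format("parquet")
  .option("compression", "snappy") // parquet writer option
  .save("people_parquet")

// avro and kafka are not bundled by default; avro, for example, needs the
// external spark-avro module on the classpath, e.g. a session started with
// --packages org.apache.spark:spark-avro_2.12:<version matching your Spark build>
```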
