[GitHub] [spark] HyukjinKwon commented on a change in pull request #32723: [SPARK-35583][DOCS] Move JDBC data source options from Python and Scala into a single page
HyukjinKwon commented on a change in pull request #32723:
URL: https://github.com/apache/spark/pull/32723#discussion_r643577724

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ##
@@ -301,23 +306,22 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash
    * your external database systems.
    *
-   * @param url JDBC database url of the form `jdbc:subprotocol:subname`.
+   * You can find the JDBC-specific options for reading table via JDBC in

Review comment:
Can we change: "JDBC-specific options for reading table" -> "JDBC-specific option and parameter documentation for reading tables"?

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ##
@@ -282,6 +282,10 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * Construct a `DataFrame` representing the database table accessible via JDBC URL
    * url named table and connection properties.
    *
+   * You can find the JDBC-specific options for reading table via JDBC in

Review comment:
reading a table or reading tables

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32723: [SPARK-35583][DOCS] Move JDBC data source options from Python and Scala into a single page
HyukjinKwon commented on a change in pull request #32723:
URL: https://github.com/apache/spark/pull/32723#discussion_r643577193

## File path: docs/sql-data-sources-jdbc.md ##
@@ -39,6 +39,8 @@ following command:
 ./bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9.4.1207.jar
 {% endhighlight %}

+## Data Source Option
+
 Tables from the remote database can be loaded as a DataFrame or Spark SQL temporary view using

Review comment:
`Tables from the remote database can be ... ction properties for logging into the data sources`

This description isn't about Data source option. Can you fix the description such as:

Spark supports the following case-insensitive options for JDBC. The Data source options of JDBC can be set via:

the .option/.options methods of
...

For connection properties, users can specify the JDBC connection properties in the data source options. `user` and `password` are normally provided as connection properties for logging into the data sources.
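As an illustration of the `.option`/`.options` route the comment above refers to, here is a minimal PySpark-style sketch. The helper names `build_jdbc_options` and `read_jdbc_table`, as well as the host, database, table, and credentials, are hypothetical and not part of the pull request; `url`, `dbtable`, `user`, `password`, and `fetchsize` are existing JDBC data source options. The second helper assumes a live `SparkSession` and a JDBC driver on the classpath.

```python
from typing import Any, Dict


def build_jdbc_options(url: str, table: str, user: str, password: str,
                       **extra: Any) -> Dict[str, str]:
    """Collect JDBC data source options into one dict of strings (hypothetical helper)."""
    options = {"url": url, "dbtable": table, "user": user, "password": password}
    # Extra options (for example fetchsize) are passed through as strings.
    options.update({key: str(value) for key, value in extra.items()})
    return options


def read_jdbc_table(spark, **kwargs):
    """Load a table as a DataFrame via the generic reader (needs a SparkSession)."""
    return spark.read.format("jdbc").options(**build_jdbc_options(**kwargs)).load()


# Hypothetical connection details, for illustration only.
opts = build_jdbc_options(
    url="jdbc:postgresql://dbserver:5432/shop",
    table="public.orders",
    user="reporting",
    password="secret",
    fetchsize=1000,
)
```

Here `user` and `password` travel in the same options dict as the other settings, which matches the suggestion that connection properties can be specified in the data source options.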
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32723: [SPARK-35583][DOCS] Move JDBC data source options from Python and Scala into a single page
HyukjinKwon commented on a change in pull request #32723:
URL: https://github.com/apache/spark/pull/32723#discussion_r642803769

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ##
@@ -754,6 +746,8 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
    * or "SERIALIZABLE", corresponding to standard transaction
    * isolation levels defined by JDBC's Connection object, with default
    * of "READ_UNCOMMITTED".
+   *
+   *

Review comment:
Let's remove these empty lines
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32723: [SPARK-35583][DOCS] Move JDBC data source options from Python and Scala into a single page
HyukjinKwon commented on a change in pull request #32723:
URL: https://github.com/apache/spark/pull/32723#discussion_r642803647

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ##
@@ -301,23 +306,22 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
    * Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash
    * your external database systems.
    *
-   * @param url JDBC database url of the form `jdbc:subprotocol:subname`.
+   * You can find the JDBC-specific options for reading table via JDBC in
+   * <a href="https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#data-source-option">
+   * Data Source Option</a> in the version you use.
+   *
    * @param table Name of the table in the external database.
-   * @param columnName the name of a column of numeric, date, or timestamp type
-   * that will be used for partitioning.
-   * @param lowerBound the minimum value of `columnName` used to decide partition stride.
-   * @param upperBound the maximum value of `columnName` used to decide partition stride.
-   * @param numPartitions the number of partitions. This, along with `lowerBound` (inclusive),
-   * `upperBound` (exclusive), form partition strides for generated WHERE
-   * clause expressions used to split the column `columnName` evenly. When
-   * the input is less than 1, the number is set to 1.
+   * @param columnName alias of `partitionColumn` option. Refer to `partitionColumn` in

Review comment:
```suggestion
   * @param columnName Alias of `partitionColumn` option. Refer to `partitionColumn` in
```
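For readers who have not seen the removed `lowerBound`/`upperBound`/`numPartitions` text in use, the following is a simplified sketch of how partition strides can be turned into WHERE clause expressions. It is not Spark's actual implementation: the function name `partition_predicates` is hypothetical, it assumes an integer partition column, and the stride arithmetic is deliberately simplified. It does mirror two points from the Scaladoc above: the bounds only decide the stride (rows outside them still land in the first or last partition), and an input below 1 is treated as 1.

```python
from typing import List, Optional


def partition_predicates(column: str, lower_bound: int, upper_bound: int,
                         num_partitions: int) -> List[Optional[str]]:
    """Return one WHERE-clause predicate per partition (None means no filter)."""
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    # "When the input is less than 1, the number is set to 1."
    num_partitions = max(num_partitions, 1)
    # Avoid a zero stride when there are more partitions than distinct values.
    num_partitions = min(num_partitions, upper_bound - lower_bound)
    if num_partitions == 1:
        return [None]

    stride = (upper_bound - lower_bound) // num_partitions
    predicates: List[Optional[str]] = []
    for i in range(num_partitions):
        low = lower_bound + i * stride
        high = low + stride
        if i == 0:
            # First partition also takes NULLs and values below lower_bound.
            predicates.append(f"{column} < {high} OR {column} IS NULL")
        elif i == num_partitions - 1:
            # Last partition also takes values at or above upper_bound.
            predicates.append(f"{column} >= {low}")
        else:
            predicates.append(f"{column} >= {low} AND {column} < {high}")
    return predicates


for predicate in partition_predicates("id", 0, 100, 4):
    print(predicate)
```

Each predicate corresponds to one query issued in parallel, which is why the surrounding Scaladoc warns against creating too many partitions on a large cluster.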
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32723: [SPARK-35583][DOCS] Move JDBC data source options from Python and Scala into a single page
HyukjinKwon commented on a change in pull request #32723:
URL: https://github.com/apache/spark/pull/32723#discussion_r642788618

## File path: python/pyspark/sql/readwriter.py ##
@@ -627,8 +627,6 @@ def jdbc(self, url, table, column=None, lowerBound=None, upperBound=None, numPar
         Parameters
         ----------
-        url : str
-            a JDBC URL of the form ``jdbc:subprotocol:subname``
         table : str
             the name of the table
         column : str, optional

Review comment:
I think we can remove `lowerBound`, `upperBound`, and `numPartitions`.

And, fix the description of `column` to something like:

Alias of `partitionColumn` option. Refer to `partitionColumn` in `Data Source Option <...>`_ in the version you use.