[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15292

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82541651

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,132 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map(
+      JDBCOptions.JDBC_URL -> url,
+      JDBCOptions.JDBC_TABLE_NAME -> table))
+  }
+
+  val asConnectionProperties: Properties = {
+    val properties = new Properties()
+    // We should avoid to pass the options into properties. See SPARK-17776.
+    parameters.filterKeys(!jdbcOptionNames.contains(_))
+      .foreach { case (k, v) => properties.setProperty(k, v) }
+    properties
+  }
+
   //
   // Required parameters
   //
-  require(parameters.isDefinedAt("url"), "Option 'url' is required.")
-  require(parameters.isDefinedAt("dbtable"), "Option 'dbtable' is required.")
+  require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
+  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
   // a JDBC URL
-  val url = parameters("url")
+  val url = parameters(JDBC_URL)
   // name of table
-  val table = parameters("dbtable")
+  val table = parameters(JDBC_TABLE_NAME)
+
+  //
+  // Optional parameters
+  //
+  val driverClass = {
+    val userSpecifiedDriverClass = parameters.get(JDBC_DRIVER_CLASS)
+    userSpecifiedDriverClass.foreach(DriverRegistry.register)
+
+    // Performing this part of the logic on the driver guards against the corner-case where the
+    // driver returned for a URL is different on the driver and executors due to classpath
+    // differences.
+    userSpecifiedDriverClass.getOrElse {
+      DriverManager.getDriver(url).getClass.getCanonicalName
+    }
+  }

   //
-  // Optional parameter list
+  // Optional parameters only for reading
   //
   // the column used to partition
-  val partitionColumn = parameters.getOrElse("partitionColumn", null)
+  val partitionColumn = parameters.getOrElse(JDBC_PARTITION_COLUMN, null)
   // the lower bound of partition column
-  val lowerBound = parameters.getOrElse("lowerBound", null)
+  val lowerBound = parameters.getOrElse(JDBC_LOWER_BOUND, null)
   // the upper bound of the partition column
-  val upperBound = parameters.getOrElse("upperBound", null)
+  val upperBound = parameters.getOrElse(JDBC_UPPER_BOUND, null)
   // the number of partitions
-  val numPartitions = parameters.getOrElse("numPartitions", null)
-
+  val numPartitions = parameters.getOrElse(JDBC_NUM_PARTITIONS, null)
   require(partitionColumn == null ||
     (lowerBound != null && upperBound != null && numPartitions != null),
-    "If 'partitionColumn' is specified then 'lowerBound', 'upperBound'," +
-      " and 'numPartitions' are required.")
+    s"If '$JDBC_PARTITION_COLUMN' is specified then '$JDBC_LOWER_BOUND', '$JDBC_UPPER_BOUND'," +
+      s" and '$JDBC_NUM_PARTITIONS' are required.")
+  val fetchSize = {
+    val size = parameters.getOrElse(JDBC_BATCH_FETCH_SIZE, "0").toInt
+    require(size >= 0,
+      s"Invalid value `${size.toString}` for parameter " +
+        s"`$JDBC_BATCH_FETCH_SIZE`. The minimum value is 0. When the value is 0, " +
+        "the JDBC driver ignores the value and does the estimates.")
+    size
+  }

   //
-  // The options for DataFrameWriter
+  // Optional parameters only for writing
   //
   // if to truncate the table from the JDBC database
-  val isTruncate = parameters.getOrElse("truncate", "false").toBoolean
+  val isTruncate = parameters.getOrElse(JDBC_TRUNCATE, "false").toBoolean
   // the create table option , which can be table_options or partition_options.
   // E.g., "CREATE TABLE t (name string)
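The `asConnectionProperties` logic quoted in the diff above can be sketched as a standalone snippet. The exclusion set and sample parameters below are illustrative only; the real list (`jdbcOptionNames`) lives in the `JDBCOptions` companion object:

```scala
import java.util.Properties

// Hypothetical subset of Spark-specific option names that must not leak
// into the JDBC driver's connection properties (see SPARK-17776).
val jdbcOptionNames = Set("url", "dbtable", "partitionColumn", "numPartitions")

val parameters = Map(
  "url" -> "jdbc:postgresql://localhost/test",
  "dbtable" -> "people",
  "user" -> "fred",
  "password" -> "secret")

// Copy only driver-facing keys into the connection Properties.
val properties = new Properties()
parameters.filter { case (k, _) => !jdbcOptionNames.contains(k) }
  .foreach { case (k, v) => properties.setProperty(k, v) }

assert(properties.getProperty("user") == "fred")
assert(properties.getProperty("password") == "secret")
assert(properties.getProperty("url") == null) // filtered out
```

The point of the filter is exactly what the `require` checks guard in the other direction: Spark-level options and driver-level connection properties live in one map, and only the latter may reach the driver.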
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82538242

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,132 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map(
+      JDBCOptions.JDBC_URL -> url,
+      JDBCOptions.JDBC_TABLE_NAME -> table))
+  }
+
+  val asConnectionProperties: Properties = {
+    val properties = new Properties()
+    // We should avoid to pass the options into properties. See SPARK-17776.
+    parameters.filterKeys(!jdbcOptionNames.contains(_))
+      .foreach { case (k, v) => properties.setProperty(k, v) }
+    properties
+  }
+
   //
   // Required parameters
   //
-  require(parameters.isDefinedAt("url"), "Option 'url' is required.")
-  require(parameters.isDefinedAt("dbtable"), "Option 'dbtable' is required.")
+  require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
+  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
   // a JDBC URL
-  val url = parameters("url")
+  val url = parameters(JDBC_URL)
   // name of table
-  val table = parameters("dbtable")
+  val table = parameters(JDBC_TABLE_NAME)
+
+  //
+  // Optional parameters
+  //
+  val driverClass = {
+    val userSpecifiedDriverClass = parameters.get(JDBC_DRIVER_CLASS)
+    userSpecifiedDriverClass.foreach(DriverRegistry.register)
+
+    // Performing this part of the logic on the driver guards against the corner-case where the
+    // driver returned for a URL is different on the driver and executors due to classpath
+    // differences.
+    userSpecifiedDriverClass.getOrElse {
+      DriverManager.getDriver(url).getClass.getCanonicalName
+    }
+  }

   //
-  // Optional parameter list
+  // Optional parameters only for reading
   //
   // the column used to partition
-  val partitionColumn = parameters.getOrElse("partitionColumn", null)
+  val partitionColumn = parameters.getOrElse(JDBC_PARTITION_COLUMN, null)
   // the lower bound of partition column
-  val lowerBound = parameters.getOrElse("lowerBound", null)
+  val lowerBound = parameters.getOrElse(JDBC_LOWER_BOUND, null)
   // the upper bound of the partition column
-  val upperBound = parameters.getOrElse("upperBound", null)
+  val upperBound = parameters.getOrElse(JDBC_UPPER_BOUND, null)
   // the number of partitions
-  val numPartitions = parameters.getOrElse("numPartitions", null)
-
+  val numPartitions = parameters.getOrElse(JDBC_NUM_PARTITIONS, null)
   require(partitionColumn == null ||
     (lowerBound != null && upperBound != null && numPartitions != null),
-    "If 'partitionColumn' is specified then 'lowerBound', 'upperBound'," +
-      " and 'numPartitions' are required.")
+    s"If '$JDBC_PARTITION_COLUMN' is specified then '$JDBC_LOWER_BOUND', '$JDBC_UPPER_BOUND'," +
+      s" and '$JDBC_NUM_PARTITIONS' are required.")
+  val fetchSize = {
+    val size = parameters.getOrElse(JDBC_BATCH_FETCH_SIZE, "0").toInt
+    require(size >= 0,
+      s"Invalid value `${size.toString}` for parameter " +
+        s"`$JDBC_BATCH_FETCH_SIZE`. The minimum value is 0. When the value is 0, " +
+        "the JDBC driver ignores the value and does the estimates.")
+    size
+  }

   //
-  // The options for DataFrameWriter
+  // Optional parameters only for writing
   //
   // if to truncate the table from the JDBC database
-  val isTruncate = parameters.getOrElse("truncate", "false").toBoolean
+  val isTruncate = parameters.getOrElse(JDBC_TRUNCATE, "false").toBoolean
   // the create table option , which can be table_options or partition_options.
   // E.g., "CREATE TABLE t (name string)
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82537727

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,132 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map(
+      JDBCOptions.JDBC_URL -> url,
+      JDBCOptions.JDBC_TABLE_NAME -> table))
+  }
+
+  val asConnectionProperties: Properties = {
+    val properties = new Properties()
+    // We should avoid to pass the options into properties. See SPARK-17776.
+    parameters.filterKeys(!jdbcOptionNames.contains(_))
+      .foreach { case (k, v) => properties.setProperty(k, v) }
+    properties
+  }
+
   //
   // Required parameters
   //
-  require(parameters.isDefinedAt("url"), "Option 'url' is required.")
-  require(parameters.isDefinedAt("dbtable"), "Option 'dbtable' is required.")
+  require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
+  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
   // a JDBC URL
-  val url = parameters("url")
+  val url = parameters(JDBC_URL)
   // name of table
-  val table = parameters("dbtable")
+  val table = parameters(JDBC_TABLE_NAME)
+
+  //
+  // Optional parameters
+  //
+  val driverClass = {
+    val userSpecifiedDriverClass = parameters.get(JDBC_DRIVER_CLASS)
+    userSpecifiedDriverClass.foreach(DriverRegistry.register)
+
+    // Performing this part of the logic on the driver guards against the corner-case where the
+    // driver returned for a URL is different on the driver and executors due to classpath
+    // differences.
+    userSpecifiedDriverClass.getOrElse {
+      DriverManager.getDriver(url).getClass.getCanonicalName
+    }
+  }

   //
-  // Optional parameter list
+  // Optional parameters only for reading
   //
   // the column used to partition
-  val partitionColumn = parameters.getOrElse("partitionColumn", null)
+  val partitionColumn = parameters.getOrElse(JDBC_PARTITION_COLUMN, null)
   // the lower bound of partition column
-  val lowerBound = parameters.getOrElse("lowerBound", null)
+  val lowerBound = parameters.getOrElse(JDBC_LOWER_BOUND, null)
   // the upper bound of the partition column
-  val upperBound = parameters.getOrElse("upperBound", null)
+  val upperBound = parameters.getOrElse(JDBC_UPPER_BOUND, null)
   // the number of partitions
-  val numPartitions = parameters.getOrElse("numPartitions", null)
-
+  val numPartitions = parameters.getOrElse(JDBC_NUM_PARTITIONS, null)
   require(partitionColumn == null ||
     (lowerBound != null && upperBound != null && numPartitions != null),
-    "If 'partitionColumn' is specified then 'lowerBound', 'upperBound'," +
-      " and 'numPartitions' are required.")
+    s"If '$JDBC_PARTITION_COLUMN' is specified then '$JDBC_LOWER_BOUND', '$JDBC_UPPER_BOUND'," +
+      s" and '$JDBC_NUM_PARTITIONS' are required.")
+  val fetchSize = {
+    val size = parameters.getOrElse(JDBC_BATCH_FETCH_SIZE, "0").toInt
+    require(size >= 0,
+      s"Invalid value `${size.toString}` for parameter " +
+        s"`$JDBC_BATCH_FETCH_SIZE`. The minimum value is 0. When the value is 0, " +
+        "the JDBC driver ignores the value and does the estimates.")
+    size
+  }

   //
-  // The options for DataFrameWriter
+  // Optional parameters only for writing
   //
   // if to truncate the table from the JDBC database
-  val isTruncate = parameters.getOrElse("truncate", "false").toBoolean
+  val isTruncate = parameters.getOrElse(JDBC_TRUNCATE, "false").toBoolean
   // the create table option , which can be table_options or partition_options.
   // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB
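The `driverClass` fallback quoted in the diff (a user-specified class wins, otherwise the class is looked up from the URL) can be sketched without a real JDBC driver on the classpath. The registry map and class names here are stand-ins for `java.sql.DriverManager.getDriver`:

```scala
// Stand-in registry; the real code asks java.sql.DriverManager for the
// driver registered for a given URL. Names are illustrative only.
val driverForUrlPrefix = Map("jdbc:postgresql" -> "org.postgresql.Driver")

def resolveDriverClass(url: String, userSpecified: Option[String]): String =
  userSpecified.getOrElse {
    // Key the lookup on the "jdbc:<subprotocol>" prefix of the URL.
    val prefix = url.split(':').take(2).mkString(":")
    driverForUrlPrefix(prefix)
  }

// A user-specified class always wins; otherwise fall back to the URL lookup.
assert(resolveDriverClass("jdbc:postgresql://localhost/test", None) ==
  "org.postgresql.Driver")
assert(resolveDriverClass("jdbc:postgresql://localhost/test",
  Some("com.example.MyDriver")) == "com.example.MyDriver")
```

As the diff comment notes, resolving the class name on the driver (rather than on each executor) avoids classpath-dependent disagreement about which driver serves a URL.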
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82524289

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -229,13 +229,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
       table: String,
       parts: Array[Partition],
       connectionProperties: Properties): DataFrame = {
-    val props = new Properties()
-    extraOptions.foreach { case (key, value) =>
-      props.put(key, value)
-    }
-    // connectionProperties should override settings in extraOptions

--- End diff --

I will put this back, just to be safe.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82515285

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -229,13 +229,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
       table: String,
       parts: Array[Partition],
       connectionProperties: Properties): DataFrame = {
-    val props = new Properties()
-    extraOptions.foreach { case (key, value) =>
-      props.put(key, value)
-    }
-    // connectionProperties should override settings in extraOptions

--- End diff --

should we still keep this comment?
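The override semantics discussed in this thread ("connectionProperties should override settings in extraOptions") come down to copy order into the `Properties` object: whatever is put last wins. A minimal standalone sketch, with hypothetical keys:

```scala
import java.util.Properties

// extraOptions are copied first, then connectionProperties on top,
// so connectionProperties win on any key collision.
val extraOptions = Map("fetchsize" -> "100", "user" -> "default_user")

val connectionProperties = new Properties()
connectionProperties.setProperty("user", "fred")

val props = new Properties()
extraOptions.foreach { case (key, value) => props.put(key, value) }
props.putAll(connectionProperties) // later puts override earlier ones

assert(props.getProperty("user") == "fred")      // overridden
assert(props.getProperty("fetchsize") == "100")  // kept
```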
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82339188

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,130 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map("url" -> url, "dbtable" -> table))
+  }
+
+  val asProperties: Properties = {

--- End diff --

I hear you. Let me fix it then.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82333056

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,130 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map("url" -> url, "dbtable" -> table))
+  }
+
+  val asProperties: Properties = {

--- End diff --

Sounds good, but it seems both are fine, since the actual class name is `Properties` while the intended meaning is connection properties. Maybe this one depends on personal taste.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82332285

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,130 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map("url" -> url, "dbtable" -> table))
+  }
+
+  val asProperties: Properties = {

--- End diff --

I think the function name needs an update. How about `asConnectionProperties`?
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82330973

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -17,47 +17,130 @@
 package org.apache.spark.sql.execution.datasources.jdbc

+import java.sql.{Connection, DriverManager}
+import java.util.Properties
+
 /**
  * Options for the JDBC data source.
  */
 class JDBCOptions(
     @transient private val parameters: Map[String, String])
   extends Serializable {

+  import JDBCOptions._
+
+  def this(url: String, table: String, parameters: Map[String, String]) = {
+    this(parameters ++ Map("url" -> url, "dbtable" -> table))

--- End diff --

Change them to `JDBC_URL` and `JDBC_TABLE_NAME`?
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82329856

--- Diff: docs/sql-programming-guide.md ---
@@ -1014,16 +1014,31 @@
 Tables from the remote database can be loaded as a DataFrame or Spark SQL Temporary table using
-the Data Sources API. The following options are supported:
+the Data Sources API. The following case-sensitive options are supported:

 Property Name  Meaning
 url
-  The JDBC URL to connect to.
+  The JDBC URL to connect to. It might contain user and password information. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret

--- End diff --

Sure, that sounds cleaner and more correct.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82329644

--- Diff: docs/sql-programming-guide.md ---
@@ -1014,16 +1014,31 @@
 Tables from the remote database can be loaded as a DataFrame or Spark SQL Temporary table using
-the Data Sources API. The following options are supported:
+the Data Sources API. The following case-sensitive options are supported:

 Property Name  Meaning
 url
-  The JDBC URL to connect to.
+  The JDBC URL to connect to. It might contain user and password information. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret
+
+user
+  The user to connect as.

--- End diff --

Sorry, after rethinking it, I think we do not need `user` and `password` here.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82329571

--- Diff: docs/sql-programming-guide.md ---
@@ -1014,16 +1014,31 @@
 Tables from the remote database can be loaded as a DataFrame or Spark SQL Temporary table using
-the Data Sources API. The following options are supported:
+the Data Sources API. The following case-sensitive options are supported:

 Property Name  Meaning
 url
-  The JDBC URL to connect to.
+  The JDBC URL to connect to. It might contain user and password information. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret

--- End diff --

How about this change? _The source-specific connection properties may be specified in the URL. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret_
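The URL shape being documented here carries connection properties in the query string. A quick standalone sketch of what such a URL contains (the parsing below is purely illustrative, not what a JDBC driver actually does internally):

```scala
// The documented URL shape: source-specific properties ride in the
// query string after '?', separated by '&'.
val url = "jdbc:postgresql://localhost/test?user=fred&password=secret"

// Split off the query string, then break it into key-value pairs.
val parts = url.split('?')
val query = if (parts.length > 1) parts(1) else ""
val params = query.split('&').map { pair =>
  val Array(k, v) = pair.split('=')
  k -> v
}.toMap

assert(params("user") == "fred")
assert(params("password") == "secret")
```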
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82329341

--- Diff: docs/sql-programming-guide.md ---
@@ -1014,16 +1014,31 @@
 Tables from the remote database can be loaded as a DataFrame or Spark SQL Temporary table using
-the Data Sources API. The following options are supported:
+the Data Sources API. The following case-sensitive options are supported:

 Property Name  Meaning
 url
-  The JDBC URL to connect to.
+  The JDBC URL to connect to. It might contain user and password information. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret

--- End diff --

Actually, this is not accurate. Let me think about it.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r82306624

--- Diff: docs/sql-programming-guide.md ---
@@ -1014,16 +1014,31 @@
 Tables from the remote database can be loaded as a DataFrame or Spark SQL Temporary table using
-the Data Sources API. The following options are supported:
+the Data Sources API. The following case-sensitive options are supported:

 Property Name  Meaning
 url
-  The JDBC URL to connect to.
+  The JDBC URL to connect to. It might contain user and password information. e.g., jdbc:postgresql://localhost/test?user=fred&password=secret
+
+user
+  The user to connect as.
+
+password
+  The password to use when connecting.

--- End diff --

`the password for logging into the database`
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r81868131

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -35,7 +50,12 @@ class JDBCOptions(
   val table = parameters("dbtable")

   //
-  // Optional parameter list
+  // Optional parameters
+  //
+  val driver = parameters.get(JDBCOptions.JDBC_DRIVER_CLASS)

--- End diff --

Oh, right. Yep, I will check and try to address the comments below too.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r81804213

--- Diff: docs/sql-programming-guide.md ---
@@ -1024,6 +1024,7 @@
 The JDBC URL to connect to.

--- End diff --

Could you also improve this? You know, `url` might contain `user` and `password` if users do not provide them in the option list. BTW, I think we also need to document `user` and `password` here.
[GitHub] spark pull request #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15292#discussion_r81802372

--- Diff: docs/sql-programming-guide.md ---
@@ -1048,28 +1049,42 @@
 partitionColumn must be a numeric column from the table in question. Notice that
 lowerBound and upperBound are just used to decide the partition stride, not for
 filtering the rows in table. So all rows in the table will be
-partitioned and returned.
+partitioned and returned. This option applies only to reading.

 fetchsize
-  The JDBC fetch size, which determines how many rows to fetch per round trip. This can help performance on JDBC drivers which default to low fetch size (e.g. Oracle with 10 rows).
+  The JDBC fetch size, which determines how many rows to fetch per round trip. This can help performance on JDBC drivers which default to low fetch size (e.g. Oracle with 10 rows). This option applies only to reading.
+
+batchsize
+  The JDBC batch size, which determines how many rows to insert per round trip. This can help performance on JDBC drivers. This option applies only to writing.

--- End diff --

Could you also add the default size here?
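The fetchsize semantics documented here (default 0, meaning the driver uses its own estimate; negatives rejected) mirror the `require` check in the `JDBCOptions` diff quoted earlier in the thread. A standalone sketch with a hypothetical `parseFetchSize` helper:

```scala
import scala.util.Try

// Mirrors the fetchSize handling from the JDBCOptions diff: the option
// defaults to 0 (let the JDBC driver estimate), and negative values fail.
def parseFetchSize(parameters: Map[String, String]): Int = {
  val size = parameters.getOrElse("fetchsize", "0").toInt
  require(size >= 0,
    s"Invalid value `$size` for parameter `fetchsize`. The minimum value is 0.")
  size
}

assert(parseFetchSize(Map("fetchsize" -> "100")) == 100)
assert(parseFetchSize(Map.empty) == 0)                          // default
assert(Try(parseFetchSize(Map("fetchsize" -> "-1"))).isFailure) // rejected
```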