gaborgsomogyi commented on a change in pull request #29024:
URL: https://github.com/apache/spark/pull/29024#discussion_r452729440
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala
##########
@@ -23,12 +23,15 @@ import java.util.{Locale, Properties}
import org.apache.commons.io.FilenameUtils
import org.apache.spark.SparkFiles
+import org.apache.spark.annotation.DeveloperApi
import org.apache.spark.internal.Logging
import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap
/**
+ * ::DeveloperApi::
* Options for the JDBC data source.
*/
+@DeveloperApi
Review comment:
I've done the initial steps but I see bad and worse tradeoffs. The API
is less dependent on `JDBCOptions` changes but the provider side implementation
becames more complicated with either duplicated code or duplicated
instantiation. Let me explain in examples:
Option1: Re-instantiate `JDBCOptions` where duplicated instantiation worries
me + the general concern down below.
```
override def canHandle(driver: Driver, options: Map[String, String]):
Boolean = {
val jdbcOptions = new JDBCOptions(options)
jdbcOptions.keytab == null || jdbcOptions.principal == null
}
```
Option2: Find out the needed parameters on by own where re-inventing the
wheel feeling worries me + the general concern down below.
```
override def canHandle(driver: Driver, options: Map[String, String]):
Boolean = {
val keytab = {
val keytabParam = options.getOrElse(JDBC_KEYTAB, null)
if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty)
{
val result = SparkFiles.get(keytabParam)
logDebug(s"Keytab path not found, assuming --files, file name used
on executor: $result")
result
} else {
logDebug("Keytab path found, assuming manual upload")
keytabParam
}
}
val principal = options.getOrElse(JDBC_PRINCIPAL, null)
keytab == null || principal == null
}
```
General concern: Both cases Spark can be good because in the first case new
`JDBCOptions` instantiation, in the second case copy paste moved into a base
class can fill the gap. However considering myself as a 3rd-party developer I
have not much options (since `JDBCOptions` is not exposed as developer API):
* Copy and paste the code from `JDBCOptions` which may change over time
* Implement the parameter parsing on my own which may contain bugs
Considering these findings I think it's better to keep `JDBCOptions`. WDTY?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]