gaborgsomogyi commented on a change in pull request #29024:
URL: https://github.com/apache/spark/pull/29024#discussion_r452729440



##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala
##########
@@ -23,12 +23,15 @@ import java.util.{Locale, Properties}
 import org.apache.commons.io.FilenameUtils
 
 import org.apache.spark.SparkFiles
+import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.catalyst.util.CaseInsensitiveMap
 
 /**
+ * ::DeveloperApi::
  * Options for the JDBC data source.
  */
+@DeveloperApi

Review comment:
       I've done the initial steps, but I see only bad and worse tradeoffs. The API is less dependent on `JDBCOptions` changes, but the provider-side implementation becomes more complicated, with either duplicated code or duplicated instantiation. Let me explain with examples:
   
   Option 1: Re-instantiate `JDBCOptions`, where the duplicated instantiation worries me (plus the general concern below).
   ```
     override def canHandle(driver: Driver, options: Map[String, String]): Boolean = {
       val jdbcOptions = new JDBCOptions(options)
       jdbcOptions.keytab == null || jdbcOptions.principal == null
     }
   ```
   
   Option 2: Work out the needed parameters on my own, where the reinventing-the-wheel feeling worries me (plus the general concern below).
   ```
     override def canHandle(driver: Driver, options: Map[String, String]): Boolean = {
       val keytab = {
         val keytabParam = options.getOrElse(JDBC_KEYTAB, null)
         if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
           val result = SparkFiles.get(keytabParam)
           logDebug(s"Keytab path not found, assuming --files, file name used on executor: $result")
           result
         } else {
           logDebug("Keytab path found, assuming manual upload")
           keytabParam
         }
       }
       val principal = options.getOrElse(JDBC_PRINCIPAL, null)
       keytab == null || principal == null
     }
   ```
   
   General concern: For Spark itself both cases can work, because the gap can be filled either by the new `JDBCOptions` instantiation (first case) or by moving the copy-pasted parsing into a base class (second case; see the sketch after this list). However, putting myself in the shoes of a 3rd-party developer, I don't have many options (since `JDBCOptions` is not exposed as a developer API):
   * Copy and paste the code from `JDBCOptions`, which may change over time
   * Implement the parameter parsing on my own, which may contain bugs
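   
   Just to illustrate the Spark-internal remedy mentioned above, a minimal sketch of the base-class variant (the `ConnectionProviderBase` name and the `resolveKeytab` helper are hypothetical, and the same imports as in Option 2 are assumed):
   ```
     // Hypothetical base class: the keytab resolution duplicated in Option 2
     // moves here so concrete providers inside Spark can reuse it.
     abstract class ConnectionProviderBase {
       protected def resolveKeytab(options: Map[String, String]): String = {
         val keytabParam = options.getOrElse(JDBC_KEYTAB, null)
         if (keytabParam != null && FilenameUtils.getPath(keytabParam).isEmpty) {
           // Bare file name: assume it was distributed with --files
           SparkFiles.get(keytabParam)
         } else {
           // Full path: assume manual upload
           keytabParam
         }
       }
     }
   ```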
   
   Considering these findings, I think it's better to keep `JDBCOptions`. WDYT?
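   
   For reference, this is roughly what I have in mind with keeping `JDBCOptions` (a sketch only; the `canHandle` signature taking `JDBCOptions` is my assumption of where this would land):
   ```
     override def canHandle(driver: Driver, options: JDBCOptions): Boolean = {
       // All parameter parsing (keytab resolution via --files, defaults, etc.)
       // stays inside JDBCOptions; the provider only consumes resolved values.
       options.keytab == null || options.principal == null
     }
   ```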
   



