jamisonbennett commented on a change in pull request #23398: [SPARK-26493][SQL] 
Allow multiple spark.sql.extensions
URL: https://github.com/apache/spark/pull/23398#discussion_r244414275
 
 

 ##########
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala
 ##########
 @@ -114,6 +124,25 @@ class SparkSessionExtensionSuite extends SparkFunSuite {
       stop(session)
     }
   }
+
+  test("use multiple custom class for extensions") {
+    val session = SparkSession.builder()
+      .master("local[1]")
+      .config("spark.sql.extensions", Seq(
+        classOf[MyExtensions].getCanonicalName,
+        classOf[MyExtensions2].getCanonicalName).mkString(","))
+      .getOrCreate()
+    try {
+      
assert(session.sessionState.planner.strategies.contains(MySparkStrategy(session)))
+      
assert(session.sessionState.analyzer.extendedResolutionRules.contains(MyRule(session)))
+      assert(session.sessionState.functionRegistry
+        .lookupFunction(MyExtensions.myFunction._1).isDefined)
+      assert(session.sessionState.functionRegistry
+        .lookupFunction(MyExtensions2.myFunction._1).isDefined)
 
 Review comment:
   Prior to this change, it was possible to programmatically register multiple 
extensions but it was not possible to do so through the spark.sql.extensions 
configuration. Although it wasn't documented/tested until this pull request. 
E.g. The following works without this pull request:
   ```
   SparkSession.builder()
     .master("..")
     .withExtensions(sparkSessionExtensions1)
     .withExtensions(sparkSessionExtensions2)
     .getOrCreate()
   ```
   
   So I think conflicting function names are already currently possible (but 
not documented). In the following cases:
   1. Conflicting function names are registered by calling `.withExtenions()` 
multiple times
   2. An extension accidentally registers a function that was already 
registered with the builtin functions
   3. An extension accidentally registers a function multiple times by calling 
`injectFunction(myFunction)`
   
   As for the order, it looks to me like the last function to be stored with 
conflicting names is the one which is retrieved:
   ```
   class SimpleFunctionRegistry extends FunctionRegistry {
   
     @GuardedBy("this")
     private val functionBuilders =
       new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, 
FunctionBuilder)]
     override def registerFunction(
         name: FunctionIdentifier,
         info: ExpressionInfo,
         builder: FunctionBuilder): Unit = synchronized {
       functionBuilders.put(normalizeFuncName(name), (info, builder))
     }
   ```
   
   I will update this PR to document what happens in order of operations and 
conflicts. If we need to explicitly block duplicates functions from being 
registered, I can temporarily drop this PR and see about making those changes 
first.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to