peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r321973362
 
 

 ##########
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanner.scala
 ##########
 @@ -47,7 +47,14 @@ class SparkPlanner(
       Window ::
       JoinSelection ::
       InMemoryScans ::
-      BasicOperators :: Nil)
+      BasicOperators(queryExecutionThreadLocal.get()) :: Nil)
+
+  private val queryExecutionThreadLocal = new ThreadLocal[QueryExecution]
 
 Review comment:
   My goal was to pass the current `QueryExecution` instance to 
`RecursiveRelationExec` so that it can report metrics updates.
   - My first attempt was to change the `def apply(plan: LogicalPlan): 
Seq[SparkPlan]` method to accept some other parameters besides the `plan`. I 
tried changing `GenericStrategy`'s `apply()` to something like `def apply(plan: 
LogicalPlan, param: PlanParam): Seq[PhysicalPlan]` where `PlanParam` would be 
`QueryExecution`. I felt that such a big change would not be acceptable, but 
here you can find this approach: 
https://github.com/peter-toth/spark/commit/3a3ac462bb350e1103d640e83394f65994c5175e
   - My second try was to create a new instance of `SparkPlanner` for each 
`QueryExecution`, so that I could pass the current `QueryExecution` to 
`SparkPlanner`, then to `BasicOperators`, and then to 
`RecursiveRelationExec`. (`IncrementalExecution` does something similar.) But 
then `SessionState.planner` would become almost useless. I also wasn't sure 
that having as many planners as `QueryExecution`s would be a good idea.
   - Then the current thread-safe approach with a `ThreadLocal` came to 
mind, because it requires minimal changes.
   
   Please let me know if you have any better idea. Suggestions are very welcome.
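
For reference, the `ThreadLocal` idea from the last bullet can be sketched roughly as follows. This is only an illustration, not the code in the PR: `QueryExecutionStub`, `BasicOperatorsStub`, `PlannerStub` and `withQueryExecution` are hypothetical stand-ins for the Spark classes, and how the PR actually sets and clears the value may differ.

```scala
// Hypothetical stand-ins; these are NOT Spark's real classes.
final case class QueryExecutionStub(id: Int)

// Stand-in for a strategy that needs the current query execution.
final case class BasicOperatorsStub(queryExecution: QueryExecutionStub)

class PlannerStub {
  // One slot per thread, so planning on different threads does not interfere.
  private val queryExecutionThreadLocal = new ThreadLocal[QueryExecutionStub]

  // A `def`, so the ThreadLocal is read on every call, not once at construction.
  def strategies: Seq[BasicOperatorsStub] =
    BasicOperatorsStub(queryExecutionThreadLocal.get()) :: Nil

  // Assumed helper: sets the current execution while `body` runs,
  // then restores whatever value was there before.
  def withQueryExecution[T](qe: QueryExecutionStub)(body: => T): T = {
    val previous = queryExecutionThreadLocal.get()
    queryExecutionThreadLocal.set(qe)
    try body
    finally {
      if (previous == null) queryExecutionThreadLocal.remove()
      else queryExecutionThreadLocal.set(previous)
    }
  }
}
```

A caller would wrap planning in `planner.withQueryExecution(qe) { planner.strategies }`. Outside such a block, `strategies` sees `null`, which is one of the trade-offs of this approach compared with passing the instance explicitly.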

