uros-b commented on code in PR #56662:
URL: https://github.com/apache/spark/pull/56662#discussion_r3479693468


##########
sql/pipelines/src/main/scala/org/apache/spark/sql/pipelines/graph/FlowAnalysis.scala:
##########
@@ -44,17 +45,23 @@ object FlowAnalysis {
       confs: Map[String, String],
       queryContext: QueryContext,
       queryOrigin: QueryOrigin) => {
+      // Flows are resolved in parallel on a shared session, so applying 
per-flow confs by mutating
+      // that session's conf would race across flows. Instead, give each flow 
a private SQLConf
+      // (a clone of the session's conf plus this flow's overrides) and 
install it for the analyzing
+      // thread via SQLConf.withExistingConf. Analysis still runs on the 
shared session, so its
+      // catalog and the resolved DataFrames are unaffected; only the confs 
the analyzer reads are
+      // isolated per flow.
+      val spark = SparkSession.active
       val ctx = FlowAnalysisContext(
         allInputs = allInputs,
         availableInputs = availableInputs,
         queryContext = queryContext,
-        spark = SparkSession.active
+        spark = spark,
+        flowConf = spark.sessionState.conf.clone()

Review Comment:
   Thanks, agreed on both parts!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to