sryza commented on code in PR #51502:
URL: https://github.com/apache/spark/pull/51502#discussion_r2220517277


##########
sql/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -43,14 +43,19 @@ message Plan {
   }
 }
 
-
-
 // User Context is used to refer to one particular user session that is executing
 // queries in the backend.
 message UserContext {
   string user_id = 1;
   string user_name = 2;
 
+  // (Optional) Should be non-null for RPCs that are sent during the execution of a Declarative
+  // Pipelines flow query function. Identifies the flow and the dataflow graph that it's a part of.
+  // Any plans that are analyzed within the RPC are analyzed "relative to" the dataflow graph.
+  // I.e., when determining the existence and schema of a data source that's defined in the graph,
+  // the definition from the graph is used instead of the definition in physical storage.
+  optional FlowAnalysisContext pipeline_flow_analysis_context = 3;

Review Comment:
   Having investigated further: I think we could avoid the need for this field if Spark Connect provided the ability to clone sessions. Each time we're about to execute a query function, we could clone the session, add some confs that specify the dataflow graph ID and flow name, and execute the query function in the cloned session.
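   A minimal, self-contained sketch of that alternative, for discussion only: Spark Connect does not currently expose a session-clone API, so the `Session` class, its `clone` method, and the `spark.pipelines.*` conf keys below are all hypothetical stand-ins, not real Spark APIs. The point is just that conf changes scoped to a clone never leak back into the parent session.

   ```python
   class Session:
       """Hypothetical stand-in for a Spark Connect session holding confs."""

       def __init__(self, conf=None):
           self.conf = dict(conf or {})

       def clone(self):
           # A clone starts from a copy of the parent's confs, so confs set
           # for one flow's query function never leak into the parent.
           return Session(self.conf)


   def run_flow_query(parent, graph_id, flow_name, query_fn):
       scoped = parent.clone()
       # Hypothetical conf keys identifying the dataflow graph and the flow;
       # analysis inside `query_fn` would resolve plans relative to them.
       scoped.conf["spark.pipelines.dataflowGraphId"] = graph_id
       scoped.conf["spark.pipelines.flowName"] = flow_name
       return query_fn(scoped)


   parent = Session({"spark.app.name": "pipeline"})
   result = run_flow_query(
       parent, "graph-1", "flow-a",
       lambda s: s.conf["spark.pipelines.flowName"],
   )
   ```

   With this shape, per-flow context lives in ordinary session confs on the clone rather than in a new `UserContext` proto field.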



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

