sryza commented on code in PR #52154:
URL: https://github.com/apache/spark/pull/52154#discussion_r2388283836
##########
sql/connect/common/src/main/protobuf/spark/connect/pipelines.proto:
##########
@@ -90,6 +92,24 @@ message PipelineCommand {
optional string format = 8;
}
+ // Metadata about why a query function failed to be executed successfully.
+ message QueryFunctionFailure {
+ // Identifier for a dataset within the graph that the query function needed to know the schema
+ // of but which had not yet been analyzed itself.
+ optional string missing_dependency = 1;
Review Comment:
After further thought and discussion, it seems we can leave this out for
now: when the server fails to analyze a plan, it already knows which flow that
plan was associated with, so it can do the bookkeeping on its side.
There may be some edge cases (e.g. at the beginning) where this costs one more
query function invocation than we would otherwise need, for query functions
that do analysis. But we can bias towards simplicity for now and optimize
later if we need to.
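To make the server-side bookkeeping concrete, here is a rough sketch of what I have in mind. It is not the actual implementation: `FlowBookkeeper`, `AnalysisError`, `try_analyze`, and `run_until_stable` are hypothetical names, and the "query functions" are plain Python callables standing in for the real thing. The point is only that a failure gets recorded against the flow the server already knows about, with no `missing_dependency` reported by the client.

```python
from dataclasses import dataclass, field


class AnalysisError(Exception):
    """Hypothetical stand-in for a plan that cannot be analyzed yet."""


@dataclass
class FlowBookkeeper:
    """Hypothetical server-side record of flows whose plans failed analysis."""

    resolved: dict = field(default_factory=dict)  # flow name -> schema
    pending: set = field(default_factory=set)     # flows that need a retry

    def try_analyze(self, flow, analyze):
        # The server knows which flow this plan belongs to, so it records
        # the failure against that flow itself.
        try:
            self.resolved[flow] = analyze(self.resolved)
            self.pending.discard(flow)
            return True
        except AnalysisError:
            self.pending.add(flow)
            return False

    def run_until_stable(self, flows):
        """Retry unresolved flows until a full pass makes no progress.

        Returns the total number of query function invocations.
        """
        attempts = 0
        todo = list(flows)
        while todo:
            progressed = False
            for name in list(todo):
                attempts += 1
                if self.try_analyze(name, flows[name]):
                    todo.remove(name)
                    progressed = True
            if not progressed:
                break  # remaining flows can never be resolved
        return attempts


def analyze_a(resolved):
    # Flow "a" has no upstream dependencies.
    return ["id"]


def analyze_b(resolved):
    # Flow "b" needs the schema of "a" before it can be analyzed.
    if "a" not in resolved:
        raise AnalysisError("schema of 'a' is not available yet")
    return resolved["a"] + ["total"]


bookkeeper = FlowBookkeeper()
# "b" is registered before "a", so its first invocation fails and is retried.
invocations = bookkeeper.run_until_stable({"b": analyze_b, "a": analyze_a})
```

In this toy run `b` is invoked twice and `a` once, so the server makes three invocations where two would have sufficed had the dependency been known up front. That is the "one more invocation" cost mentioned above, and the trade we'd be accepting for a simpler protocol.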
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]