HeartSaVioR opened a new pull request, #50015: URL: https://github.com/apache/spark/pull/50015
### What changes were proposed in this pull request?

This PR checks whether the logical plan contains a streaming source marker when eagerlyExecutedCommands are about to be executed. Here, a streaming source marker is a placeholder that is materialized during microbatch planning. If the plan still contains such a marker, the source has not been materialized, so it cannot be read.

This is easily triggered when a user constructs the plan for a command (e.g. `df.write.saveAsTable`) that includes `df.readStream`, either directly or through an indirect reference (a temp view over `df.readStream`).

This should be caught by `UnsupportedOperationChecker.checkBatch` (which is called from `QueryExecution.assertSupported`), but if the query is a command that is meant to be eagerly executed, an error is thrown before that code path is reached, and the error is cryptic (either a StackOverflowError, or an AnalysisException reported as an InternalError). We should provide a proper error message that tells users they have to fix their query.

### Why are the changes needed?

Without the fix, a StackOverflowError, or an AnalysisException reported as an InternalError, is thrown for a query that is invalid because of a user mistake.

### Does this PR introduce _any_ user-facing change?

Yes, we will provide a clearer error for this case (though the error class is still a TODO to be clarified).

### How was this patch tested?

New UT.

### Was this patch authored or co-authored using generative AI tooling?

No.
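For readers unfamiliar with the check described under "What changes were proposed", here is a minimal, Spark-free sketch of the idea: walk the plan of an eagerly executed command and fail early with a readable message if a streaming source marker is still present. This is a toy model and not the code in this PR; `PlanNode`, `StreamingSourceInBatchError`, `contains_streaming_marker`, and `execute_eagerly` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a logical plan node (not a Spark class)."""
    name: str
    is_streaming_marker: bool = False
    children: List["PlanNode"] = field(default_factory=list)


class StreamingSourceInBatchError(Exception):
    """Hypothetical error type; the PR leaves the real error class as a TODO."""


def contains_streaming_marker(plan: PlanNode) -> bool:
    # Iterative traversal, so a deep plan cannot exhaust the call stack.
    stack = [plan]
    while stack:
        node = stack.pop()
        if node.is_streaming_marker:
            return True
        stack.extend(node.children)
    return False


def execute_eagerly(command: PlanNode) -> str:
    # Validate before execution, so the user sees a clear message
    # instead of a failure from deeper in planning.
    if contains_streaming_marker(command):
        raise StreamingSourceInBatchError(
            f"Command '{command.name}' reads from a streaming source that is "
            "not materialized; streaming sources cannot be used in a batch command."
        )
    return f"executed {command.name}"


if __name__ == "__main__":
    # Indirect reference: the command reads a view defined over a streaming source.
    view = PlanNode("View", children=[PlanNode("StreamingRelation", is_streaming_marker=True)])
    try:
        execute_eagerly(PlanNode("SaveAsTable", children=[view]))
    except StreamingSourceInBatchError as err:
        print(err)
```

The point of the sketch is only the ordering: the marker check runs before the command is executed, which is what lets the error be reported in user terms.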
