HeartSaVioR commented on code in PR #42940: URL: https://github.com/apache/spark/pull/42940#discussion_r1329947013
##########
docs/ss-migration-guide.md:
##########
@@ -26,6 +26,10 @@ Note that this migration guide describes the items specific to Structured Stream
 Many items of SQL migration can be applied when migrating Structured Streaming to higher versions.
 Please refer [Migration Guide: SQL, Datasets and DataFrame](sql-migration-guide.html).
 
+## Upgrading from Structured Streaming 3.5 to 4.0
+
+- Since Spark 4.0, Spark falls back to single batch execution if any source in the query does not support `Trigger.AvailableNow`. This is to avoid any possible correctness, duplication, and dataloss issue due to incompatibility between source and wrapper implementation. (See [SPARK-45178](https://issues.apache.org/jira/browse/SPARK-45178) for more details.)

Review Comment:
   I intentionally avoided saying that it is Trigger.Once. We deprecated it for good reason, and I'd say it is still worth telling users that they have to use Trigger.AvailableNow. Unfortunately, we only fall back to Trigger.Once for technical reasons.

   Ideally, we still need to persuade third parties to implement Trigger.AvailableNow, but I also see that several data source projects have had no updates for a couple of years, which is unfortunate. Maybe we shouldn't have introduced the fallback logic and should instead have left such sources unsupported, so that third parties would recognize the necessity. My bad.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org