HeartSaVioR commented on code in PR #42940:
URL: https://github.com/apache/spark/pull/42940#discussion_r1329947013


##########
docs/ss-migration-guide.md:
##########
@@ -26,6 +26,10 @@ Note that this migration guide describes the items specific 
to Structured Stream
 Many items of SQL migration can be applied when migrating Structured Streaming 
to higher versions.
 Please refer [Migration Guide: SQL, Datasets and 
DataFrame](sql-migration-guide.html).
 
+## Upgrading from Structured Streaming 3.5 to 4.0
+
+- Since Spark 4.0, Spark falls back to single batch execution if any source in 
the query does not support `Trigger.AvailableNow`. This is to avoid any 
possible correctness, duplication, and dataloss issue due to incompatibility 
between source and wrapper implementation. (See 
[SPARK-45178](https://issues.apache.org/jira/browse/SPARK-45178) for more 
details.)

Review Comment:
   I intentionally avoided saying that the fallback is Trigger.Once. We 
deprecated it for good reason, and I'd say it is still worth telling users 
that they have to use Trigger.AvailableNow. We only fall back to Trigger.Once 
for technical reasons, unfortunately.
   
   Ideally, we should still persuade third parties to implement 
Trigger.AvailableNow, but I also see that several data source projects have 
had no updates for a couple of years, which is unfortunate. Maybe we shouldn't 
introduce the fallback logic and should instead not support such sources, so 
that third parties would see the need to implement it. My bad.
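
   To make the fallback rule in the migration note concrete, here is a minimal 
sketch of the decision as I read it: the query runs with Trigger.AvailableNow 
semantics only if every source supports it, and otherwise falls back to a 
single batch. This is a simplified model, not Spark's actual code. The 
`Source` class, the `supports_available_now` flag, and `plan_execution` are 
hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for a streaming source in the query."""
    name: str
    # Assumption: a simple flag modelling whether the source implements
    # Trigger.AvailableNow. Real sources do not expose it this way.
    supports_available_now: bool


def plan_execution(sources: List[Source]) -> str:
    """Return the execution mode under the fallback rule.

    The query uses AvailableNow semantics only when every source supports
    it; a single unsupported source makes the whole query fall back to
    single batch execution.
    """
    if all(s.supports_available_now for s in sources):
        return "available_now"
    return "single_batch"


# Both sources support AvailableNow, so no fallback is needed.
supported = [Source("a", True), Source("b", True)]
# One source has not implemented it, so the whole query falls back.
mixed = [Source("a", True), Source("legacy", False)]

print(plan_execution(supported))
print(plan_execution(mixed))
```

   The point of the sketch is that the decision is made per query, not per 
source: one source without support is enough to trigger the fallback.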



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

