xianyinxin commented on issue #25626: [SPARK-28892][SQL] Add UPDATE support for DataSource V2
URL: https://github.com/apache/spark/pull/25626#issuecomment-529989458

Thank you very much, @rdblue. Your comments are exactly right: the API here is just a kind of "push-down" API. We also considered a "row-based" API, which may involve a builder and would start a Spark job to calculate which rows should be deleted and issue the deletions to Spark tasks. Both the "push-down" API and the "row-based" API are needed, and the two serve different scenarios. For some data sources, like JDBC, a simple "push-down" API may work, but for cases like multi-table delete/update, a "row-based" API is needed. When both can finish the work, a "push-down" API is sometimes preferable to a "row-based" one. With JDBC, for example, pushing down the predicates (and the updated values) performs much better than deleting/updating the data row by row. Since the design of the "push-down" API is simpler than that of the "row-based" one, we propose this API first. Later we can consider adding a "row-based" API in `SupportsDelete`/`SupportUpdate`.
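
To make the distinction concrete, here is a minimal, self-contained sketch of the two styles. It is only an illustration under stated assumptions: `PushDownUpdate`, `RowBasedUpdate`, `InMemoryTable`, and `UpdateSketch` are hypothetical names invented for this example and are not the interfaces proposed in this PR or Spark's actual DataSource V2 API. The `calls` counter stands in for round trips between the engine and the source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical "push-down" style: the engine hands the condition and the new
// value to the source, and the source applies the update itself.
interface PushDownUpdate {
    int updateWhere(String column, Object newValue, Predicate<Map<String, Object>> condition);
}

// Hypothetical "row-based" style: the engine reads the rows, decides which
// ones match, and issues one update per matching row.
interface RowBasedUpdate {
    List<Map<String, Object>> scan();
    void updateRow(int rowIndex, String column, Object newValue);
}

// Toy in-memory source supporting both styles (an assumption for illustration).
class InMemoryTable implements PushDownUpdate, RowBasedUpdate {
    private final List<Map<String, Object>> rows = new ArrayList<>();
    int calls = 0; // number of calls the engine made to this source

    void insert(int id, String status) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", id);
        row.put("status", status);
        rows.add(row);
    }

    @Override
    public int updateWhere(String column, Object newValue, Predicate<Map<String, Object>> condition) {
        calls++;
        int changed = 0;
        for (Map<String, Object> row : rows) {
            if (condition.test(row)) {
                row.put(column, newValue);
                changed++;
            }
        }
        return changed;
    }

    @Override
    public List<Map<String, Object>> scan() {
        calls++;
        // Return copies so the caller sees a snapshot, not the live rows.
        List<Map<String, Object>> snapshot = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            snapshot.add(new HashMap<>(row));
        }
        return snapshot;
    }

    @Override
    public void updateRow(int rowIndex, String column, Object newValue) {
        calls++;
        rows.get(rowIndex).put(column, newValue);
    }
}

class UpdateSketch {
    // Three rows with ids 1..3, all "open".
    static InMemoryTable sample() {
        InMemoryTable table = new InMemoryTable();
        table.insert(1, "open");
        table.insert(2, "open");
        table.insert(3, "open");
        return table;
    }

    // Engine-side loop for the row-based style: scan, filter, update row by row.
    static int rowBasedUpdate(RowBasedUpdate table, String column, Object newValue,
                              Predicate<Map<String, Object>> condition) {
        List<Map<String, Object>> rows = table.scan();
        int changed = 0;
        for (int i = 0; i < rows.size(); i++) {
            if (condition.test(rows.get(i))) {
                table.updateRow(i, column, newValue);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Predicate<Map<String, Object>> idAboveOne = row -> (Integer) row.get("id") > 1;

        InMemoryTable pushed = sample();
        pushed.updateWhere("status", "closed", idAboveOne);
        System.out.println("push-down calls: " + pushed.calls);

        InMemoryTable rowByRow = sample();
        rowBasedUpdate(rowByRow, "status", "closed", idAboveOne);
        System.out.println("row-based calls: " + rowByRow.calls);
    }
}
```

In this toy setup both styles leave the table in the same state, but the push-down path makes a single call to the source while the row-based path makes one scan plus one call per matching row. The row-based loop, on the other hand, lets the engine compute the matching rows with arbitrary logic, which is roughly what a multi-table delete/update would need.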