xianyinxin commented on issue #25626: [SPARK-28892][SQL] Add UPDATE support for 
DataSource V2
URL: https://github.com/apache/spark/pull/25626#issuecomment-529989458
 
 
   Thank you very much, @rdblue. Your comments are exactly right: the API here 
is just a "push-down" API. We also considered a "row-based" API, which may 
involve a builder and would start a Spark job to calculate which rows should 
be deleted, then issue the deletions to Spark tasks.
   Both the "push-down" API and the "row-based" API are needed, and the two 
suit different scenarios. For some data sources, like JDBC, a simple 
"push-down" API may work, but for cases like multi-table delete/update, a 
"row-based" API is needed. When both can do the job, the "push-down" API is 
sometimes preferable to the "row-based" one. With JDBC, for example, pushing 
down the predicates (and the updated values) performs much better than 
deleting/updating the data row by row.
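
   To make the contrast concrete, here is a rough, self-contained sketch of 
the two styles. It is only an illustration: `UpdateApiSketch`, `EqualTo`, 
`pushDownUpdate` and `rowBasedUpdate` are hypothetical names and are not part 
of Spark or of this PR, and the in-memory list of maps merely stands in for a 
real table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in Spark.
public class UpdateApiSketch {

    /** Hypothetical stand-in for a single pushed-down equality predicate. */
    static final class EqualTo {
        final String column;
        final Object value;

        EqualTo(String column, Object value) {
            this.column = column;
            this.value = value;
        }
    }

    /** Renders a value as a SQL literal (strings quoted, quotes doubled). */
    static String literal(Object v) {
        if (v instanceof String) {
            return "'" + ((String) v).replace("'", "''") + "'";
        }
        return String.valueOf(v);
    }

    /**
     * "Push-down" style: the predicate and the new values are handed to the
     * source as one statement, so the source does the matching itself.
     */
    static String pushDownUpdate(String table, Map<String, Object> assignments, EqualTo filter) {
        StringBuilder set = new StringBuilder();
        for (Map.Entry<String, Object> e : assignments.entrySet()) {
            if (set.length() > 0) {
                set.append(", ");
            }
            set.append(e.getKey()).append(" = ").append(literal(e.getValue()));
        }
        return "UPDATE " + table + " SET " + set
            + " WHERE " + filter.column + " = " + literal(filter.value);
    }

    /**
     * "Row-based" style: the engine walks the rows, decides which ones match,
     * and rewrites each matching row individually. Returns the number updated.
     */
    static int rowBasedUpdate(List<Map<String, Object>> rows,
                              Map<String, Object> assignments, EqualTo filter) {
        int updated = 0;
        for (Map<String, Object> row : rows) {
            if (filter.value.equals(row.get(filter.column))) {
                row.putAll(assignments);
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> assignments = new LinkedHashMap<>();
        assignments.put("status", "done");
        EqualTo filter = new EqualTo("id", 1);

        // One statement for the source to execute.
        System.out.println(pushDownUpdate("t", assignments, filter));

        // The same update applied row by row on the engine side.
        List<Map<String, Object>> rows = new ArrayList<>();
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);
        row.put("status", "new");
        rows.add(row);
        System.out.println(rowBasedUpdate(rows, assignments, filter) + " row(s) updated");
    }
}
```

   In the first style the source receives a single statement and evaluates the 
predicate itself, which is why it tends to suit JDBC. In the second style the 
engine has to read every candidate row before it can change anything, but it 
is not limited to conditions the source can evaluate.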
   Since the design of the "push-down" API is simpler than that of the 
"row-based" API, we propose this API first. Later we can consider adding a 
"row-based" API to `SupportsDelete`/`SupportUpdate`.
