cloud-fan commented on issue #25626: [SPARK-28892][SQL] Add UPDATE support for 
DataSource V2
URL: https://github.com/apache/spark/pull/25626#issuecomment-529913436
 
 
   > we would add an implementation that reads the rows that might match the 
where query, finds all the rows that actually match, updates those rows, and 
saves the changed rows back to the data source.
   
   This applies to DELETE as well. Spark should be responsible for finding the 
matched rows and telling the data source which rows need to be deleted or updated.
   
   Think about `DELETE FROM t1 WHERE t1.col IN (SELECT col FROM t2)`. It's not 
a metadata-only operation as Spark needs to scan `t2` to find the matched rows.
   
   We need both APIs: one for simple DELETE/UPDATE, which is metadata-only, and 
one for general DELETE/UPDATE, which needs to notify the data source about 
deleted/updated rows.
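
To make the distinction concrete, here is a rough sketch of the two paths. It is a toy in-memory model in Python, not Spark's DataSource V2 API: `ToyTable`, `delete_partition`, `delete_rows`, and `delete_where_col_in` are hypothetical names invented for illustration, and the table is assumed to be partitioned by a `day` column.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


class ToyTable:
    """Hypothetical in-memory table partitioned by the `day` column."""

    def __init__(self, rows: Iterable[Row]) -> None:
        # Partition key -> {row_id -> row}. Row ids are assigned on load.
        self.partitions: Dict[object, Dict[int, Row]] = {}
        for row_id, row in enumerate(rows):
            self.partitions.setdefault(row["day"], {})[row_id] = dict(row)

    def scan(self) -> List[Tuple[int, Row]]:
        """Return (row_id, row) pairs for every stored row."""
        return [
            (rid, row)
            for part in self.partitions.values()
            for rid, row in part.items()
        ]

    def delete_partition(self, day: object) -> None:
        """Simple path: drop a whole partition without reading its rows."""
        self.partitions.pop(day, None)

    def delete_rows(self, row_ids: Iterable[int]) -> None:
        """General path: remove exactly the rows the engine identified."""
        doomed = set(row_ids)
        for part in self.partitions.values():
            for rid in [r for r in part if r in doomed]:
                del part[rid]


def delete_where_col_in(t1: ToyTable, t2: ToyTable, col: str) -> None:
    """Engine side of: DELETE FROM t1 WHERE t1.col IN (SELECT col FROM t2)."""
    # The engine has to scan t2 first; the source cannot do this from metadata.
    wanted = {row[col] for _, row in t2.scan()}
    # Then it finds the rows of t1 that actually match.
    matched = [rid for rid, row in t1.scan() if row[col] in wanted]
    # Finally it tells the source which rows to delete.
    t1.delete_rows(matched)


t1 = ToyTable([
    {"day": "mon", "col": 1},
    {"day": "mon", "col": 2},
    {"day": "tue", "col": 3},
])
t2 = ToyTable([{"day": "mon", "col": 2}, {"day": "wed", "col": 3}])

delete_where_col_in(t1, t2, "col")   # general path: needs a scan of t2 and t1
t1.delete_partition("mon")           # simple path: no rows are read
```

In this model the subquery delete cannot be expressed as `delete_partition`, because the set of affected rows is only known after reading `t2`. That is the case the general API would have to cover.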
   
   I don't have a strong preference on which version should be done first, but 
since this PR is already here, I'm OK with having the simpler version first.

