shardulm94 commented on PR #46707:
URL: https://github.com/apache/spark/pull/46707#issuecomment-2222182510
Thanks @szehon-ho! I think there may be an interesting discussion here. For
example, consider a `DELETE` statement with some options passed:
```
DELETE FROM tbl WITH (`split-size` = 5)
WHERE id > 10
```
What do these options apply to? The scan from the table, or the write to the
table?
1. We can pass the options to both the read and the write, and let the
datasource disambiguate. I don't think this is great because there will likely
be a case in the future where the same option would need to be passed with
different values to the read and the write.
2. We could create some syntax disambiguating reads vs. writes, e.g.
`WITH READ OPTIONS('a', 'b') WRITE OPTIONS ('c', 'd')`. I prefer this approach
because I think it is clearer for the user, but it goes against previous
discussions in this PR.
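To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not Spark code: `RowLevelOptions`, `shared`, `separate`, and the write-side value `128` are hypothetical names and values used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RowLevelOptions:
    """Hypothetical container for the options seen by each side of a DELETE."""
    read: Dict[str, str] = field(default_factory=dict)
    write: Dict[str, str] = field(default_factory=dict)


def shared(options: Dict[str, str]) -> RowLevelOptions:
    # Approach 1: a single option map is handed to both the scan and the write,
    # so both sides necessarily see the same value for every key.
    return RowLevelOptions(read=dict(options), write=dict(options))


def separate(read_options: Dict[str, str],
             write_options: Dict[str, str]) -> RowLevelOptions:
    # Approach 2: the statement carries two maps, so the same key can take
    # a different value on the read side and on the write side.
    return RowLevelOptions(read=dict(read_options), write=dict(write_options))


both = shared({"split-size": "5"})
split = separate({"split-size": "5"}, {"split-size": "128"})
```

With `shared`, there is no way to express a `split-size` that differs between the scan and the write; with `separate`, that case is representable without the datasource having to guess.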
Other ideas welcome. The same scenario applies to `UPDATE`, which also has a
single table identifier to read from and write to. `MERGE` and `INSERT` have
separate source and target identifiers, so you could specify options separately
per identifier.