Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21308
@tigerquoll, what we come up with needs to work across a variety of data
sources, including those like JDBC that can delete at a lower granularity than
partition.
For Hive tables, the partition columns are exposed directly, so users would
supply a predicate that matches partition columns. A Hive table source would
also be free to reject delete requests that would require rewriting data -- by
throwing the documented exception. This avoids the case you're describing:
the predicate must match entire partitions, and the source can reject
predicates on non-partition columns, or predicates that can't be cleanly
satisfied with a metadata-only operation.
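To make that concrete, here is a minimal sketch of the shape such a source could take. The class and method names (`PartitionedTable`, `canDeleteWhere`, `deleteWhere`) are illustrative assumptions, not the actual API proposed in this PR: the source accepts a delete only when the predicate references partition columns alone, and otherwise throws the documented exception rather than rewriting data.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch (names are illustrative, not Spark's API):
// a partitioned source that only supports metadata-only deletes.
public class PartitionedTable {
    private final Set<String> partitionColumns;

    public PartitionedTable(Set<String> partitionColumns) {
        this.partitionColumns = partitionColumns;
    }

    // True if the delete predicate touches only partition columns,
    // so it can be satisfied by dropping whole partitions.
    public boolean canDeleteWhere(List<String> referencedColumns) {
        return partitionColumns.containsAll(referencedColumns);
    }

    // Performs a metadata-only delete, or rejects the request by
    // throwing the documented exception when data rewrite would be needed.
    public void deleteWhere(List<String> referencedColumns) {
        if (!canDeleteWhere(referencedColumns)) {
            throw new IllegalArgumentException(
                "Cannot delete by " + referencedColumns
                    + ": predicate must match entire partitions");
        }
        // metadata operation: drop the matching partitions
    }
}
```

A JDBC-like source, by contrast, could simply accept any predicate in `deleteWhere`, since it can delete at row granularity.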