huaxingao opened a new pull request, #38434:
URL: https://github.com/apache/spark/pull/38434
### What changes were proposed in this pull request?
```
/**
 * A mix-in interface for {@link ScanBuilder}. Data sources can implement this interface to
 * push down all the join or aggregate keys to data sources. A return value of true indicates
 * that the data source will return input partitions (via planInputPartitions) following the
 * clustering keys. Otherwise, a false return value indicates that the data source doesn't make
 * such a guarantee, even though it may still report a partitioning that may or may not
 * be compatible with the given clustering keys, and it is then Spark's responsibility to group
 * the input partitions where that can be applied.
 *
 * @since 3.4.0
 */
@Evolving
public interface SupportsPushDownClusterKeys extends ScanBuilder {
```
### Why are the changes needed?
Pass the join key information down to v2 data sources so that the data
sources can decide how to combine the input splits according to the join keys.
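
As a rough illustration of the idea, the sketch below shows how a source might combine its input splits once cluster keys are pushed down. It does not use the Spark API: `PushDownClusterKeys`, its `pushClusterKeys` method, `Split`, and `ToyScanBuilder` are hypothetical stand-ins invented for this example, and the real method signature on `SupportsPushDownClusterKeys` may differ. The only behavior taken from the description above is the contract that a `true` return value means the planned partitions follow the clustering keys, while `false` means no such guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClusterKeyPushDownSketch {

    /** Hypothetical stand-in for the mix-in; the method name and signature are assumed. */
    public interface PushDownClusterKeys {
        boolean pushClusterKeys(List<String> clusterKeys);
    }

    /** Hypothetical input split: a path plus the partition values stored in it. */
    public static final class Split {
        public final String path;
        public final Map<String, String> partitionValues;

        public Split(String path, Map<String, String> partitionValues) {
            this.path = path;
            this.partitionValues = partitionValues;
        }
    }

    /** Toy scan builder (not a Spark class) that groups splits by the pushed keys. */
    public static final class ToyScanBuilder implements PushDownClusterKeys {
        private final List<String> partitionColumns;
        private final List<Split> splits;
        private List<String> pushedKeys = List.of();

        public ToyScanBuilder(List<String> partitionColumns, List<Split> splits) {
            this.partitionColumns = partitionColumns;
            this.splits = splits;
        }

        @Override
        public boolean pushClusterKeys(List<String> clusterKeys) {
            // Promise clustering only when every key is one of our partition columns.
            if (!clusterKeys.isEmpty() && partitionColumns.containsAll(clusterKeys)) {
                pushedKeys = List.copyOf(clusterKeys);
                return true;
            }
            return false;
        }

        /** Returns groups of splits; each group plays the role of one input partition. */
        public List<List<Split>> planInputPartitions() {
            Map<List<String>, List<Split>> groups = new LinkedHashMap<>();
            for (Split split : splits) {
                List<String> groupKey = new ArrayList<>();
                if (pushedKeys.isEmpty()) {
                    // No guarantee was made: keep one partition per split.
                    groupKey.add(split.path);
                } else {
                    // Splits with equal values for the pushed keys share a partition.
                    for (String key : pushedKeys) {
                        groupKey.add(split.partitionValues.get(key));
                    }
                }
                groups.computeIfAbsent(groupKey, k -> new ArrayList<>()).add(split);
            }
            return new ArrayList<>(groups.values());
        }
    }

    /** Sample data: three splits partitioned by a made-up "dept" column. */
    public static ToyScanBuilder sampleBuilder() {
        List<Split> splits = List.of(
            new Split("part-0", Map.of("dept", "eng")),
            new Split("part-1", Map.of("dept", "ops")),
            new Split("part-2", Map.of("dept", "eng")));
        return new ToyScanBuilder(List.of("dept"), splits);
    }

    public static void main(String[] args) {
        ToyScanBuilder builder = sampleBuilder();
        System.out.println("pushed: " + builder.pushClusterKeys(List.of("dept")));
        System.out.println("partitions: " + builder.planInputPartitions().size());
    }
}
```

In this toy version, pushing a key that is not a partition column returns `false` and leaves the splits ungrouped, which corresponds to the case where Spark would have to handle the grouping itself.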
### Does this PR introduce _any_ user-facing change?
Yes. This PR adds a new interface, `SupportsPushDownClusterKeys`.
### How was this patch tested?
New tests.