gene-bordegaray opened a new issue, #20195: URL: https://github.com/apache/datafusion/issues/20195
### Is your feature request related to a problem or challenge?

When preserve_file_partitions is enabled, we currently disable dynamic filtering to avoid incorrect assumptions about hash partitioning. This avoids a bug but also removes useful pruning. We want to retain the benefits of dynamic filtering without breaking file‑partitioning guarantees.

### Describe the solution you'd like

Enable dynamic filtering for file‑partitioned scans by pruning file groups based on partition values when the dynamic filter keys are a subset of the partition columns (this includes an exact match). If the keys do not align, apply the dynamic filter as a row‑level predicate but do not prune file groups. This should allow dynamic filtering to work safely without requiring a repartition or changes to join planning.

### Describe alternatives you've considered

Introduce a new partitioning implementation via a generic trait‑based scheme (hash/value/range/custom), where users (and DataFusion itself) can implement any type of partitioning scheme they want. This would provide the interface needed to determine which partitioning schemes are compatible, which scheme satisfies which requirement, and so on. The exact details are not fleshed out, but this would be a powerful addition and would clear up ambiguities in DataFusion's partitioning modes today.

### Additional context

cc: @adriangb @NGA-TRAN @fmonjalet @gabotechs

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
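
A minimal sketch of the decision rule described under "Describe the solution you'd like", written as standalone Rust rather than against DataFusion's real types. Everything named here (`FileGroup`, `DynamicFilter`, `FilterPlan`, `plan_dynamic_filter`, and the helper constructors) is hypothetical and exists only for illustration. It also assumes the dynamic filter can be modelled as a set of allowed string values per key column, which is a simplification.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for one file group of a file-partitioned scan.
#[derive(Debug, Clone)]
pub struct FileGroup {
    /// Partition column name -> the value shared by every file in the group.
    pub partition_values: BTreeMap<String, String>,
    pub files: Vec<String>,
}

/// Hypothetical dynamic filter: for each key column, the allowed values
/// (for example, the values observed on the build side of a join).
#[derive(Debug, Clone, Default)]
pub struct DynamicFilter {
    pub allowed: BTreeMap<String, BTreeSet<String>>,
}

impl DynamicFilter {
    /// Adds one key column together with its allowed values.
    pub fn with_key(mut self, column: &str, values: &[&str]) -> Self {
        let set = values.iter().map(|v| v.to_string()).collect();
        self.allowed.insert(column.to_string(), set);
        self
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum FilterPlan {
    /// Keys align with the partition columns: scan only these group indices.
    PruneGroups(Vec<usize>),
    /// Keys do not align: keep every group and filter rows only.
    RowLevelOnly,
}

/// Prunes file groups only when every filter key is a partition column.
pub fn plan_dynamic_filter(
    partition_cols: &[&str],
    groups: &[FileGroup],
    filter: &DynamicFilter,
) -> FilterPlan {
    let keys_align = !filter.allowed.is_empty()
        && filter.allowed.keys().all(|k| partition_cols.contains(&k.as_str()));
    if !keys_align {
        return FilterPlan::RowLevelOnly;
    }
    let kept = groups
        .iter()
        .enumerate()
        .filter(|(_, g)| {
            filter.allowed.iter().all(|(col, values)| {
                // A group with no recorded value for the column is kept,
                // so that pruning stays conservative.
                g.partition_values.get(col).map_or(true, |v| values.contains(v))
            })
        })
        .map(|(i, _)| i)
        .collect();
    FilterPlan::PruneGroups(kept)
}

/// Convenience constructor for the illustration.
pub fn file_group(values: &[(&str, &str)], files: &[&str]) -> FileGroup {
    FileGroup {
        partition_values: values
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
        files: files.iter().map(|f| f.to_string()).collect(),
    }
}
```

With groups partitioned by a `region` column, a filter keyed on `region` yields `PruneGroups` containing only the matching group indices, while a filter keyed on a non-partition column such as `user_id` yields `RowLevelOnly`. In the second case the filter would still be evaluated per row, and no file group is dropped.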
