gene-bordegaray opened a new issue, #20195:
URL: https://github.com/apache/datafusion/issues/20195

   ### Is your feature request related to a problem or challenge?
   
   When `preserve_file_partitions` is enabled, we currently disable dynamic 
filtering to avoid incorrect assumptions about hash partitioning. This avoids a 
bug but also gives up useful pruning. We want to keep the benefits of dynamic 
filtering without breaking file‑partitioning guarantees.
   
   
   ### Describe the solution you'd like
   
   Enable dynamic filtering for file‑partitioned scans by pruning file groups 
based on partition values when the dynamic filter keys are a subset of the 
partition columns (an exact match is the special case where the two sets are 
equal).
   
   If keys do not align, apply dynamic filtering as a row‑level predicate but 
do not prune file groups.
   
   This should allow dynamic filtering to work safely without requiring 
repartitioning or changes to join planning.
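   
   The two cases above can be sketched roughly as follows. This is a minimal, 
standalone illustration and not DataFusion code: `DynamicFilterPlan`, 
`plan_dynamic_filter`, `FileGroup`, and `prune_file_groups` are hypothetical 
names, and partition values are modeled as plain strings for simplicity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical outcome of matching dynamic filter keys against partition columns.
#[derive(Debug, PartialEq, Eq)]
pub enum DynamicFilterPlan {
    /// Every filter key is a partition column: whole file groups may be skipped.
    PruneFileGroups,
    /// Keys do not align: keep all file groups and filter rows only.
    RowLevelOnly,
}

/// Decide how a dynamic filter may be applied to a file-partitioned scan.
pub fn plan_dynamic_filter(filter_keys: &[&str], partition_columns: &[&str]) -> DynamicFilterPlan {
    let partition_set: BTreeSet<&str> = partition_columns.iter().copied().collect();
    // An empty key list carries no pruning information, so treat it as row-level.
    if !filter_keys.is_empty() && filter_keys.iter().all(|k| partition_set.contains(k)) {
        DynamicFilterPlan::PruneFileGroups
    } else {
        DynamicFilterPlan::RowLevelOnly
    }
}

/// Hypothetical file group: all files share the same partition values.
pub struct FileGroup {
    pub partition_values: BTreeMap<String, String>,
    pub files: Vec<String>,
}

/// Keep only the groups whose partition values appear in the allowed sets
/// (assumed here to come from the build side of a join).
pub fn prune_file_groups<'a>(
    groups: &'a [FileGroup],
    allowed: &BTreeMap<String, BTreeSet<String>>,
) -> Vec<&'a FileGroup> {
    groups
        .iter()
        .filter(|g| {
            allowed.iter().all(|(col, values)| match g.partition_values.get(col) {
                Some(v) => values.contains(v),
                // Unknown partition value: keep the group to stay conservative.
                None => true,
            })
        })
        .collect()
}
```

   In this sketch, a group is dropped only when one of its known partition 
values is absent from the allowed set; anything uncertain is kept, so pruning 
can never remove rows that the row‑level predicate would have passed.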
   
   
   
   ### Describe alternatives you've considered
   
   Introduce a new partitioning implementation via a generic trait‑based scheme 
(hash/value/range/custom), where users (and DataFusion itself) can implement 
any type of partitioning scheme they need.
   
   This would provide the interface needed to determine how partitioning 
schemes are compatible, what satisfies what, etc. The exact details are not 
fleshed out but this would be a powerful addition and clear up ambiguities in 
DataFusion's partitioning modes today. 
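   
   As a rough illustration only, such a trait might look like the sketch below. 
Nothing here exists in DataFusion: `PartitioningScheme`, `colocates`, 
`HashPartitioning`, and `ValuePartitioning` are hypothetical names, and the 
compatibility rule shown (the scheme's keys are a non‑empty subset of the 
required keys) is just one assumption about what "satisfies" could mean.

```rust
/// Hypothetical trait describing how data is split across partitions.
pub trait PartitioningScheme {
    /// Short label for diagnostics, e.g. "hash" or "value".
    fn name(&self) -> &str;
    /// Number of output partitions (or file groups).
    fn partition_count(&self) -> usize;
    /// Columns that determine which partition a row lands in.
    fn keys(&self) -> &[String];

    /// True if rows that are equal on `required_keys` are guaranteed to share
    /// a partition. Rows equal on the required keys are also equal on any
    /// subset of them, so a scheme keyed on a non-empty subset qualifies.
    fn colocates(&self, required_keys: &[String]) -> bool {
        !self.keys().is_empty() && self.keys().iter().all(|k| required_keys.contains(k))
    }
}

/// Rows are assigned to partitions by hashing the key columns.
pub struct HashPartitioning {
    pub keys: Vec<String>,
    pub partitions: usize,
}

/// Rows are grouped by the literal values of the key columns,
/// as with directory-style file partitioning.
pub struct ValuePartitioning {
    pub keys: Vec<String>,
    pub groups: usize,
}

impl PartitioningScheme for HashPartitioning {
    fn name(&self) -> &str { "hash" }
    fn partition_count(&self) -> usize { self.partitions }
    fn keys(&self) -> &[String] { &self.keys }
}

impl PartitioningScheme for ValuePartitioning {
    fn name(&self) -> &str { "value" }
    fn partition_count(&self) -> usize { self.groups }
    fn keys(&self) -> &[String] { &self.keys }
}
```

   A real design would also need to express whether two schemes map equal keys 
to the same partition index, which this sketch deliberately leaves out.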
   
   ### Additional context
   
   cc: @adriangb @NGA-TRAN @fmonjalet @gabotechs 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

