gene-bordegaray commented on PR #18919: URL: https://github.com/apache/datafusion/pull/18919#issuecomment-3598774564
I am making a note here of the follow-up work to this PR (listed in order of priority):

1. Propagate partitioning through operators: joins and window functions.
2. If we move forward with the key-partitioned approach, group key-partitioned file groups into the same partition based on size to improve cardinality.
3. In either case, introduce some rule / heuristic that recognizes when it is valuable to repartition / partition by a superset of a partition expression to avoid repartitions.

These issues depend on this PR, so I will hold off on creating them for now. Chime in if I missed anything or if some of this seems unnecessary.
