alamb commented on issue #22882: URL: https://github.com/apache/datafusion/issues/22882#issuecomment-4672159167
Here are some areas / projects I am monitoring and personally plan to actively help with

# Tier 1

## Performance

Performance is one of DataFusion's key value propositions, so I will likely always prioritize items in this list very highly. This is something I think @adriangb, @Dandandan and @neilconway also care deeply about.

Some specific projects:
* Adaptive Predicate Evaluation
* More adaptive scheduling (for skew) -- https://github.com/apache/datafusion/issues/21598 et al.
* Continued low-level optimizations. These are largely in arrow, for example
  * filter kernel with @ClSlaid in https://github.com/apache/arrow-rs/pull/9755
  * avoid allocations with @Rich-T-kid in https://github.com/apache/arrow-rs/pull/10044

## Range partitioning

With @gene-bordegaray, @NGA-TRAN and others. This will help DataFusion take advantage of how data is commonly arranged in storage (not just Hash partitioning).
* #22395

## Statistics improvements

Specifically, a better framework for statistics calculation, and evaluation of predicate cardinality estimation, with @xudong963 and @asolimando
* https://github.com/apache/datafusion/issues/8227

# Tier 2 (nice to have)

Add easier-to-use semi-structured data support: JSON and Variant support
- https://github.com/apache/datafusion/issues/21301

I personally think that making it easier to create a system for processing semi-structured data with DataFusion (e.g. JSON and Variant) would increase DataFusion's user base (and use cases) substantially, but I am not sure I will have time to drive them.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
