alamb commented on issue #22882:
URL: https://github.com/apache/datafusion/issues/22882#issuecomment-4672159167

   Here are some areas / projects I am monitoring and personally plan to 
actively help with
   
   # Tier 1
   
   ## Performance
   Performance is one of DataFusion's key value propositions, so I will likely 
always give the items in this list very high priority. I think 
@adriangb, @Dandandan, and @neilconway also care deeply about this.
   
   Some specific projects
   * Adaptive Predicate Evaluation 
   * More adaptive scheduling (for skew) -- 
https://github.com/apache/datafusion/issues/21598 et al.
   * Continued low-level optimizations. These are largely in arrow, for example:
     * filter kernel with @ClSlaid  in 
https://github.com/apache/arrow-rs/pull/9755
     * avoid allocations with @Rich-T-kid in 
https://github.com/apache/arrow-rs/pull/10044
   
   ## Range partitioning
   Working with @gene-bordegaray, @NGA-TRAN, and others. This will help 
DataFusion take advantage of how data is commonly arranged in storage (not 
just hash partitioning).
   * #22395 
    
   ## Statistics improvements
   Specifically, a better framework for statistics calculation and evaluation 
of predicate cardinality estimation, with @xudong963 and @asolimando
   * https://github.com/apache/datafusion/issues/8227
   
   
   # Tier 2 (nice to have)
   
   Add easier-to-use support for semi-structured data: JSON and Variant
   - https://github.com/apache/datafusion/issues/21301
   
   I personally think that making it easier to build a system for processing 
semi-structured data with DataFusion (e.g. JSON and Variant) would 
substantially increase DataFusion's user base and use cases, but I am not sure 
I will have time to drive this work.
   
   

