alamb commented on issue #20874:
URL: https://github.com/apache/datafusion/issues/20874#issuecomment-4673035612

   Here is the report that was submitted
   
   ```
   ## Description:
   The mission of Apache DataFusion is the creation and maintenance of software 
   related to an extensible query engine
   
   ## Project Status:
   Current project status: New + Ongoing (high activity)
   Issues for the board: None
   
   ## Membership Data:
   
   Apache DataFusion was founded 2024-04-16 (2 years ago)
   There are currently 58 committers and 22 PMC members in this project.
   The Committer-to-PMC ratio is roughly 8:3.
   
   Community changes, past quarter:
   - No new PMC members. Last addition was Adrian Garcia Badaracco on 
2026-02-01.
   - Bhargava Vadlamani was added as committer on 2026-04-28
   - Kumar Ujjawal was added as committer on 2026-04-28
   
   
   ## Project Activity:
   Note that almost all communication for DataFusion and its subprojects happens
   on GitHub, so our dev mailing list traffic is fairly light.
   
   ### DataFusion core
   https://github.com/apache/datafusion
   54.0.0 was released on 2026-06-09.
   53.1.0 was released on 2026-04-16.
   53.0.0 was released on 2026-03-23.
   52.5.0 was released on 2026-04-11.
   52.4.0 was released on 2026-03-22.
   52.3.0 was released on 2026-03-12.
   
   
   Our releases now consist of contributions from over 120 distinct 
contributors (was 100), and we average around [9.2 commits per day] to the main 
repo (up from [7.8 commits per day])
   
   [9.2 commits per day]: git rev-list --count apache/main --since='2026-03-10 
00:00:00' --until='2026-06-08 23:59:59'
   [7.8 commits per day]: git rev-list --count apache/main --since='2026-02-09 
00:00:00' --until='2026-03-09 23:59:59'
   
   The community continues to write blogs highlighting our work, see 
https://datafusion.apache.org/blog/
   
   We continue to hold small-scale, in-person meetups in various locations, 
which have been successful in bringing together contributors. We had events in 
Portland, Seattle, NYC, and Stockholm, and are trying to hold more in Asia, 
for example in China. See a list here: 
   
https://datafusion.apache.org/user-guide/concepts-readings-events.html#community-events
 
   
   The overall number of PRs in need of review has been growing, likely due to 
increasing use of AI coding tools and the overall growth of the community.
   
   As the project matures, the time between major releases is increasing, 
likely due to increased testing and attention to quality. 
   
   
   ### Sub project: DataFusion Python
   
   https://github.com/apache/datafusion-python 
   DATAFUSION-PYTHON-53.0.0 was released on 2026-04-12.
   DATAFUSION-PYTHON-52.3.0 was released on 2026-03-16.
   
   In version 53.0.0 we introduced new AI workflows into the project. The 
primary outcome of this is to provide a method to ensure we have consistent 
coverage between the exposed datafusion-python APIs and the upstream functions 
in the core repository. This workflow exposed 55 function gaps between the two 
repositories that were then corrected.
   
   Additionally, the datafusion-python project went through a massive overhaul 
of its API documentation, adding usage docstrings aimed directly at improving 
the ability of LLM agents to write effective datafusion-python code.
   
   We have additionally released an agent skill that improves the ability of 
LLMs to write idiomatic datafusion-python code. This has been tested against 
the TPC-H queries where agents can now faithfully reproduce queries to pass 
these tests using only the text description of the query.
   
   Since the release of 53.0.0, we have added two new LLM skills to complement 
the above work. First, we added a skill that ensures all of the newly exposed 
functions are “pythonic” in nature rather than just exposing the Rust interface 
directly. Second, we have a skill that verifies that the user-facing skill for 
writing idiomatic code is kept up to date with the API surface area of the 
project.
   
   We have published a blog based on the experience of writing these agent 
skills. You can read it here:
   
   
   https://datafusion.apache.org/blog/2026/05/28/writing-agent-skills/
   
   ### New sub project: DataFusion Java
   
   We have added Java Bindings as a subproject. You can read about it here:
   
   https://datafusion.apache.org/blog/output/2026/05/26/datafusion-java-0.1.0/
   
   
   ### Sub project: DataFusion Comet
   
   COMET-0.16.0 was released on 2026-01-29.
   
   https://github.com/apache/datafusion-comet
   
   You can read about recent developments in Comet on the blog:
   https://datafusion.apache.org/blog/2026/05/07/datafusion-comet-0.16.0
   
   ### Sub project: DataFusion Ballista
   
   https://github.com/apache/datafusion-ballista 
   BALLISTA-53.0.0 was released on 2026-05-24.
   BALLISTA-52.0.0 was released on 2026-03-07.
   BALLISTA-51.0.0 was released on 2026-01-19.
   
   The community has published a new post outlining changes to Ballista in the 
last 12 months: 
https://datafusion.apache.org/blog/output/2026/05/24/datafusion-ballista-53.0.0/
   
   There has been an increase in the number of contributions to Ballista, and 
in PR reviews, which is very positive. Efforts focused on improving usability 
and the observability of running jobs, with the hope of improving Ballista's 
robustness and performance for SF1000+ workloads. I hope this trend of 
increased contributions persists.
   
   
   ### Sub project: sqlparser-rs
   
   SQLPARSER-0.62.0 was released on 2026-05-27.
   
   https://github.com/apache/datafusion-sqlparser-rs
   
   Ifeanyi Ubah (iffyio) continues to review most PRs in this repo. 
   
   
   ## Community Health:
   
   While we, as always, struggle with code review capacity, 
   we have many active committers, and the community in general helps each 
   other out with reviews. We continue to actively grow our committer 
   and PMC ranks.
   
   We continue to merge multiple PRs a day from multiple committers and
   have contributions from a wide variety of individuals with a wide 
   variety of employers, organizations, and backgrounds.
   
   ```
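
The commits-per-day figures in the report are footnoted with `git rev-list --count` commands over a date window. A minimal sketch of how such a count could be turned into a daily average is shown below; the function name, the inclusive-window convention, and the example count are assumptions for illustration, not the method or data actually used for the report.

```python
from datetime import date


def commits_per_day(commit_count: int, since: date, until: date) -> float:
    """Average commits per day over an inclusive [since, until] date window."""
    days = (until - since).days + 1  # +1 because both endpoints are included
    if days <= 0:
        raise ValueError("until must not be earlier than since")
    return commit_count / days


# Window taken from the report's first footnote: 2026-03-10 through 2026-06-08.
since, until = date(2026, 3, 10), date(2026, 6, 8)

# HYPOTHETICAL count for illustration only; the real value is whatever
# `git rev-list --count` prints for that window.
example_count = 910

print(round(commits_per_day(example_count, since, until), 1))
```

With an inclusive window, the dates above span 91 days, so the result depends only on the count that `git rev-list --count` reports for the same range.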


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]