GitHub user SamSynnada closed a discussion: Enhancing DataFusion's Community 
Engagement and Visibility

### Who are we?

I'm Sami, co-founder of Synnada, and I'm working alongside my colleague Kuter 
to support the DataFusion community. We are ready to dedicate some time/energy 
to increasing awareness of DataFusion and helping the project expand its 
audience.

We believe our team can create high-quality semi-technical content that 
makes DataFusion more accessible to a broader audience. We can repurpose 
existing technical information into more digestible formats, conduct user 
interviews, and manage social media to engage the community effectively. 

### **Objectives**

1. Making sure that DF is the go-to choice for data system builders, recognized 
as the "LLVM of data systems" that provides a robust, flexible, and efficient 
foundation.
2. Significantly lowering the barrier to entry for data system builders, 
improving ease of use and reducing time-to-first-prototype, thus accelerating 
adoption and innovation in the data systems space.
3. Transforming DataFusion's brand perception into a mark of quality and 
reliability, so that "Built on DataFusion" becomes synonymous with robust, 
high-performance data systems.
4. Expanding DataFusion's reach and recognition beyond system builders to the 
broader audience of data-intensive application developers, positioning it as an 
essential tool in their toolkit.

## **Proposed actions**

We propose the following short-term actions for community management. We can 
take the lead on these. 

- Collecting & presenting DF-related content on 
[apache.datafusion.org](http://apache.datafusion.org/)
    - Content on DataFusion is all over the place. We need a centralized 
repository for all relevant content, at least a simple web page linking to 
other sites. Some content should be linked (e.g. content on your personal site, 
our website, etc.), some content should be migrated (e.g. release notes on the 
Arrow website), while some content could be re-posted.
    - [As of October 21, 
2024](https://docs.google.com/spreadsheets/d/1c2QXGhpcYjbXY6hyWlF00IqV347i_ZinyOgmnTe_Dl8),
 we identified 71 pieces of core content, of which 23 are listed on the 
DataFusion website.
- Repurposing and distributing existing core content
    - Once we identify core content, we can repurpose it for other channels.
        - DF Paper → Turn it into a series of blog posts explaining the **inner 
workings of DF**.
        - Meetup presentations → Turn into show & tell / use case content.
        - The Apache Arrow DataFusion Architecture series by Andrew Lamb → Turn 
slides into a series of blog posts.
- Initiating **Show-and-Tell** sessions to grow core content
    - **What?** We may start with **Show-and-Tell** blog posts. These could be 
published on [apache.datafusion.org](http://apache.datafusion.org/) (and the 
co-author's website, if applicable). Authors can present the blog-post content 
at meetups (digital or physical), and that content can be distributed on 
YouTube. Our main objective will be to keep a comprehensive and accurate list 
of active users of DataFusion and showcase how they are using DF in their 
projects.
    - **How?** We can create an interview template, start interviewing people, 
turn each transcript into a blog post, publish it together with the author (on 
DataFusion’s website and the author’s preferred medium), promote and distribute 
it, and reuse the content in meetup presentations. This could be done in 
reverse too: meetup presentations can be turned into show-and-tells.
    - **Draft Question Set**
        1. **Could you please introduce yourself and your organization?**
        2. **How did you first discover Apache DataFusion?**
            
            *What motivated you to give it a chance over other alternatives? 
Why did you choose DataFusion, and what factors influenced your decision?*
            
        3. **Can you describe your learning process with DataFusion?**
            
            *Include any resources or strategies that were particularly 
helpful. Did you face any challenges during the learning or implementation 
phase? If so, how did you overcome them?*
            
        4. **What challenges or problems were you facing before using 
DataFusion?**
            
            *What tools or solutions were you using at that time? What 
limitations did you encounter with those solutions?*
            
        5. **Please explain your specific use case for Apache DataFusion.**
            
            *Detail how you utilize it in your project or workflow. How did 
DataFusion solve your problem or improve your workflow? What benefits or 
improvements have you observed since implementing it?*
            
        6. **Do you have any performance metrics or results that demonstrate 
the impact of using DataFusion?**
            
            *If available, could you share any performance metrics, 
screenshots, graphs, or diagrams that illustrate your use case or results? Did 
you discover any unexpected benefits or features in DataFusion that were 
particularly helpful? Can you comment on the return on investment (ROI) since 
implementing DataFusion, in terms of time saved, cost reduction, or other 
efficiencies?*
            
        7. **What key insights or lessons have you learned from using 
DataFusion?**
            
            *What advice would you give to others considering using it? What 
are the key takeaways from your experience with DataFusion that you believe 
would be valuable for the community? How satisfied are you with DataFusion 
overall, and would you recommend it to others? Why or why not?*
            
        8. **Are there any features or improvements you would like to see in 
future versions of DataFusion, and what are your future plans with it?**
        
        **Additional Thoughts (Optional)**
        
        1. Would you like to share any additional thoughts or experiences 
regarding DataFusion?
        2. Please provide any relevant links (e.g., project repositories, blog 
posts) or contact information if you'd like to be contacted for further 
discussion.
        3. How was your experience interacting with the Apache DataFusion 
community or support channels?
        4. Have you contributed back to the DataFusion project (e.g., bug 
reports, feature requests, code contributions)? If so, could you describe your 
contributions?
- Other content ideas that can be deemed as quick wins.
    - Write a regular “What’s New” blog/newsletter covering updates and changes 
to DataFusion.
    - Coordinate community calls and make sure they are recorded and shared 
with the rest of the community.
- Active Twitter/X management
    - Our objective for social media management should be to increase 
DataFusion's visibility, engage the community, and foster growth by 
consistently sharing valuable content and updates. We are volunteering to 
manage the Twitter account for the project, adhering to the following general 
guidelines:
        - Who to follow?
            - ASF Official
            - Other relevant Apache projects (Arrow, DF sub-projects)
            - PMC Members: Consider following either all PMC members or just 
those who actively engage on Twitter.
            - Key Project Users: Identify and follow notable users of DF.
        - What to share?
            - **Tone of voice.** We can adopt the tone of voice from other 
Apache projects that have demonstrated successful community management, such as 
Cassandra, Superset, Airflow.
            - **Regular Updates / Release Notes**: Use a consistent format for 
each post, making it easier for the audience to recognize and engage with the 
content.
            - **Event Announcements**:
                - Announce events at least one month in advance to give 
sufficient notice.
                - Post weekly reminders leading up to the event, each with a 
clear call to action (CTA) to boost engagement.
                - During the event, post live updates to maintain momentum and 
interaction with the community.

### Calls to action for the community

- **Quarterly roadmap**
    - **What?** Create a comprehensive roadmap blog detailing upcoming features 
and improvements.
- **Benchmarks & comparisons**
    - **What?** Create a more comprehensive benchmark methodology.
- Comment on this document: do you have any other suggestions? Would anyone 
like to contribute?

GitHub link: https://github.com/apache/datafusion/discussions/13049
