Hi everyone, I hope you are doing well.

I have recently started contributing to Apache Wayang and had a few PRs merged (related to repository updates and cleanup after graduation). I am very interested in applying for GSoC 2026 with Wayang.

I am particularly interested in the "DataFrame API" project idea, as it would significantly improve usability and make Wayang more accessible to users familiar with tabular data abstractions.

I wanted to discuss a potential approach:

- Designing a DataFrame abstraction (schema, rows, columns)
- Supporting operations like select, filter, join, groupBy, aggregation
- Translating these operations into Wayang execution plans
- Ensuring compatibility with the optimizer

Before drafting my full proposal, I would love to get feedback from the community:

1. Are there any existing discussions or partial implementations around this?
2. Are there preferred design directions or constraints I should be aware of?
3. Any suggestions on how to scope this project better?

I am also happy to start contributing smaller PRs in this direction.

Looking forward to your guidance and feedback!

Best regards,
Sujay Barui
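
P.S. To make the approach above a bit more concrete, here is a minimal Python sketch of a lazy DataFrame that records each operation as a node in a small logical plan and only evaluates it on collect(). It is purely illustrative and does not use any Wayang API: the names DataFrame, PlanNode, group_sum, and execute are hypothetical, and the toy interpreter only stands in for the step that would translate the plan into a Wayang execution plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanNode:
    """One step of a hypothetical logical plan (not a Wayang class)."""
    op: str                              # "scan", "select", "filter", "group_sum"
    args: tuple                          # operator-specific arguments
    child: Optional["PlanNode"] = None   # upstream step, None for a scan


class DataFrame:
    """Lazy tabular wrapper: every call extends the plan, nothing runs yet."""

    def __init__(self, schema, plan):
        self.schema = tuple(schema)
        self.plan = plan

    @classmethod
    def from_rows(cls, schema, rows):
        # Rows are plain dicts keyed by column name.
        return cls(schema, PlanNode("scan", (list(rows),)))

    def _check(self, *cols):
        # Validate column names against the schema before extending the plan.
        missing = [c for c in cols if c not in self.schema]
        if missing:
            raise KeyError(f"unknown columns: {missing}")

    def select(self, *cols):
        self._check(*cols)
        return DataFrame(cols, PlanNode("select", cols, self.plan))

    def filter(self, predicate):
        return DataFrame(self.schema, PlanNode("filter", (predicate,), self.plan))

    def group_sum(self, key, value):
        # groupBy on `key` followed by a sum over `value`.
        self._check(key, value)
        return DataFrame((key, value), PlanNode("group_sum", (key, value), self.plan))

    def collect(self):
        return execute(self.plan)


def execute(node):
    """Toy interpreter standing in for translation to a real execution plan."""
    if node.op == "scan":
        return [dict(r) for r in node.args[0]]
    rows = execute(node.child)
    if node.op == "select":
        return [{c: r[c] for c in node.args} for r in rows]
    if node.op == "filter":
        return [r for r in rows if node.args[0](r)]
    if node.op == "group_sum":
        key, value = node.args
        totals = {}
        for r in rows:
            totals[r[key]] = totals.get(r[key], 0) + r[value]
        return [{key: k, value: v} for k, v in totals.items()]
    raise ValueError(f"unknown operator: {node.op}")


orders = DataFrame.from_rows(
    ("city", "amount"),
    [
        {"city": "Berlin", "amount": 10},
        {"city": "Paris", "amount": 5},
        {"city": "Berlin", "amount": 7},
    ],
)
result = orders.filter(lambda r: r["amount"] > 5).group_sum("city", "amount").collect()
```

Keeping the plan as data rather than executing eagerly is what would leave room for an optimizer to inspect and rewrite it before execution. Whether that matches how Wayang should expose a DataFrame API is exactly what I would like feedback on.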
