Hi,

My name is Jun (Jenny) Wang, and I am a Computer Science graduate student at the NYU Courant Institute. I am writing to introduce myself as a prospective Google Summer of Code (GSoC) 2026 contributor for Apache Wayang.

I am particularly interested in the following two project ideas:

- Make Wayang more datalake-friendly
- Support for a Dataframes API

Both directions align well with my background in Spark, data lakes, stream processing, LLM research, and distributed systems. I would love to discuss these projects further with the community. Should I start preparing a proposal draft for either (or both) of these ideas? Any guidance on the expected scope or preferred direction would be greatly appreciated.

To get familiar with the codebase, I have already made a contribution:

- Issue #649 - Revise implementation to support multiple field projection by names
  https://github.com/apache/wayang/issues/649
- PR: https://github.com/apache/wayang/pull/710

I am eager to learn more about the project and contribute meaningfully over the summer. Please let me know if there is anything else I should do to get started!

Thank you for your time.

Best regards,
Jun Wang
