Hi Zoi and the Wayang community,

My name is Harsha Kamalur and I am applying for GSoC 2026. I am interested in the "Make Wayang more datalake-friendly" project.

I have spent the past week studying the codebase and have already submitted two PRs:

- PR #728: Fix test resources and List type handling in wayang-tensorflow
  https://github.com/apache/wayang/pull/728
- PR #729: Implement cardinality tracking in DataStreamChannel (fixes #678) by replacing the always-zero size field with an AtomicLong counter, mirroring the RddChannel pattern from the Spark platform. All 25 Flink tests pass.
  https://github.com/apache/wayang/pull/729

From reading the mailing list discussions, I understand that Trino and DuckDB are promising targets given their compatibility with the existing JDBC template, and that a Java Parquet sink is currently missing. I am planning to scope my proposal around these specific gaps.

I am working on my proposal and plan to submit it before March 30. I would greatly appreciate any feedback on my PRs or guidance on the proposal scope.

Best regards,
Harsha Kamalur
GitHub: https://github.com/Harshakamalur
