Hi Zoi and the Wayang community,

My name is Harsha Kamalur and I am applying for GSoC 2026. I am
interested in the "Make Wayang more datalake-friendly" project.

I have spent the past week studying the codebase and have already
submitted two PRs:

- PR #728: Fix test resources and List type handling in wayang-tensorflow
  https://github.com/apache/wayang/pull/728

- PR #729: Implement cardinality tracking in DataStreamChannel (fixes
  #678) by replacing the always-zero size field with an AtomicLong
  counter, mirroring the RddChannel pattern from the Spark platform.
  All 25 Flink tests pass.
  https://github.com/apache/wayang/pull/729

From reading the mailing list discussions, I understand that Trino and
DuckDB are promising targets given their compatibility with the
existing JDBC template, and that a Java Parquet sink is currently
missing. I am planning to scope my proposal around these specific
gaps.

I am working on my proposal and plan to submit it before March 30. I
would greatly appreciate any feedback on my PRs or guidance on the
proposal scope.

Best regards,
Harsha Kamalur
GitHub: https://github.com/Harshakamalur
