wombatu-kun opened a new pull request, #19018: URL: https://github.com/apache/hudi/pull/19018
### Describe the issue this Pull Request addresses

`HoodieSinkTask.put` allocates a new `TopicPartition(topic, partition)` for every incoming record solely to look the record's participant up in the `transactionParticipants` map, then discards it. On a high-throughput sink this is one short-lived allocation per record. A JMH micro-benchmark confirms the allocation is real and is not eliminated by escape analysis.

### Summary and Changelog

Maintain a secondary `topic -> partition -> participant` index alongside the existing `transactionParticipants` map, populated and cleared at the same lifecycle points (`bootstrap`, `close`, `cleanup`). Route records through this index in `put()` using a topic string lookup plus a partition `int` lookup (small ints are cached by the JVM), which removes the per-record `TopicPartition` allocation. The primary `TopicPartition`-keyed map is unchanged and still used by the assignment loop, `preCommit`, and partition close.

### Impact

Performance only; no public API or behavior change.

JMH micro-benchmark of routing one record to its participant (AverageTime mode, gc profiler):

| Metric (per record) | Baseline (new TopicPartition) | After (nested map) |
|---------------------|------------------------------:|-------------------:|
| Time                |                   11.76 ns/op |  10.80 ns/op (-8%) |
| Allocations         |                       24 B/op |            ~0 B/op |

This is a small per-record win; at high record rates it removes roughly 24 B of garbage per record on the `put()` path. Benchmark code is not included in this PR.

### Risk Level

low

Behavior-preserving routing refactor: the secondary index mirrors `transactionParticipants` and is maintained at the same points, and lookups are equivalent to the previous `TopicPartition`-keyed lookup, read on the single task thread. The full `hudi-kafka-connect` unit suite passes. `HoodieSinkTask.put` is not directly unit-tested, so the change is intentionally a minimal mirror of the existing lookup.
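
For readers who want a concrete picture of the `topic -> partition -> participant` index described in the summary, here is a minimal sketch. It is not the code in this PR: the class `ParticipantIndex`, its method names, and the generic participant type `P` are hypothetical stand-ins, and it assumes single-threaded access, as the description states for the task thread.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical nested index: topic -> partition -> participant.
 * Illustrative only; names and shape are assumptions, not the Hudi implementation.
 * Not thread-safe: assumes it is only touched from a single task thread.
 */
final class ParticipantIndex<P> {
    private final Map<String, Map<Integer, P>> byTopic = new HashMap<>();

    /** Add or replace the participant for (topic, partition). */
    void register(String topic, int partition, P participant) {
        byTopic.computeIfAbsent(topic, t -> new HashMap<>()).put(partition, participant);
    }

    /**
     * Look up the participant without building a composite key object.
     * The int is autoboxed via Integer.valueOf, which returns cached
     * instances for values in [-128, 127] by default; larger partition
     * numbers may still box unless the JIT removes the allocation.
     */
    P lookup(String topic, int partition) {
        Map<Integer, P> byPartition = byTopic.get(topic);
        return byPartition == null ? null : byPartition.get(partition);
    }

    /** Remove one (topic, partition) entry, dropping the inner map once empty. */
    void unregister(String topic, int partition) {
        Map<Integer, P> byPartition = byTopic.get(topic);
        if (byPartition != null) {
            byPartition.remove(partition);
            if (byPartition.isEmpty()) {
                byTopic.remove(topic);
            }
        }
    }

    /** Drop every entry, mirroring a full cleanup of the primary map. */
    void clear() {
        byTopic.clear();
    }
}
```

In a sketch like this, `register`, `unregister`, and `clear` would be called at the same points where the primary `TopicPartition`-keyed map is updated, so the two structures cannot drift apart; `lookup` is the only call on the per-record path.
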
### Documentation Update

none

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable
