wombatu-kun opened a new pull request, #19017: URL: https://github.com/apache/hudi/pull/19017
### Describe the issue this Pull Request addresses `BufferedConnectWriter.flushRecords` materializes the buffered records into a new `LinkedList` before passing them to `writeClient.upsertPreppedRecords` / `bulkInsertPreppedRecords`. A `LinkedList` allocates a separate node object (each holding two link references) for every record and then offers poor locality when the write client iterates it once. ### Summary and Changelog Replace the two `new LinkedList<>(bufferedRecords.values())` call sites with `new ArrayList<>(bufferedRecords.values())`. The `ArrayList` is pre-sized from the source collection, stores the elements in a single contiguous array, and is iterated once downstream. Behavior is unchanged: both are a `List<HoodieRecord>` fully materialized from the spillable map and handed to the same write-client methods. ### Impact Performance only; no public API or behavior change. This runs once per commit, with cost proportional to the number of buffered records. JMH micro-benchmark of building the list from a 10,000-record map and iterating it once (AverageTime mode, gc profiler): | Metric (10,000 records) | Baseline (LinkedList) | After (ArrayList) | |-------------------------|----------------------:|------------------:| | Time per flush | 122.3 us | 91.7 us (-25%) | | Allocations | 280049 B | 80033 B (-71%) | The list allocation drops from about 28 B/record (linked nodes) to about 8 B/record (a single backing array). Benchmark code is not included in this PR. ### Risk Level low Drop-in `List` replacement; the downstream write clients accept any `List` and iterate it once. Covered by the existing `hudi-kafka-connect` unit tests; `TestBufferedConnectWriter` asserts the records passed to `bulkInsertPreppedRecords` and passes. ### Documentation Update none ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Enough context is provided in the sections above - [ ] Adequate tests were added if applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
