wombatu-kun opened a new pull request, #16657: URL: https://github.com/apache/iceberg/pull/16657
`RecordConverter.convertListValue` built its result with `list.stream().map(...).collect(Collectors.toList())`, and `convertMapValue` collected into an unsized `Maps.newHashMap()`; both also recomputed the element/key/value field ids (`type.fields().get(...).fieldId()`) once per element inside the lambda.

This converts both to a pre-sized loop: `convertListValue` fills a `Lists.newArrayListWithCapacity(list.size())` with a plain `for` loop, and `convertMapValue` collects into `Maps.newHashMapWithExpectedSize(map.size())`, with the field ids and element/key/value types hoisted out of the per-element body. For lists this removes the stream pipeline allocation; for maps the pre-sizing avoids rehashing as the map grows. Behavior is unchanged (same elements and order, same collection types).

A throwaway A/B microbench over the whole conversion method (200k iterations x 9 trials, median; the private per-element `convertValue` is identical in both versions and is replaced by the same identity stub on both sides, so the delta is exactly the structural change; real `ListType`/`MapType` are used so the `fields().get(0).fieldId()` cost is faithful) showed:

| collection | size | before | after | faster |
|---|---|---|---|---|
| list | 10 | 202.6 ns | 44.7 ns | 78% |
| list | 100 | 1098 ns | 382 ns | 65% |
| list | 1000 | 10746 ns | 3900 ns | 64% |
| map | 10 | 190.3 ns | 164.5 ns | 14% |
| map | 100 | 2568 ns | 1581 ns | 38% |
| map | 1000 | 26198 ns | 15697 ns | 40% |

That is roughly 6-7 ns saved per list element and ~10 ns per map entry, paid per list/map field per record. The numbers are wall-clock from a microbench (with a stubbed per-element conversion that inflates the percentages; the absolute per-element saving is what carries over), not JMH.

Existing `TestRecordConverter` covers list and map conversion.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
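For readers without the diff at hand, the shape of the change described in the PR body can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `RecordConverter` code: the class name, the method names, the `lookupFieldId` helper (standing in for `type.fields().get(...).fieldId()`), and the identity `convertValue` stub are all hypothetical, and plain JDK collections are used in place of Guava's `Lists.newArrayListWithCapacity` and `Maps.newHashMapWithExpectedSize`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names come from the Iceberg code base.
class PreSizedConversionSketch {

  // Identity stub for the per-element conversion (assumption, as in the microbench).
  static Object convertValue(Object value, int fieldId) {
    return value;
  }

  // Stand-in for the field-id lookup on the list/map type (assumption).
  static int lookupFieldId(List<Integer> fieldIds, int pos) {
    return fieldIds.get(pos);
  }

  // "Before" shape: stream pipeline, field id looked up once per element.
  static List<Object> convertListBefore(List<?> list, List<Integer> fieldIds) {
    return list.stream()
        .map(e -> convertValue(e, lookupFieldId(fieldIds, 0)))
        .collect(Collectors.toList());
  }

  // "After" shape: field id hoisted, result pre-sized, plain for loop.
  static List<Object> convertListAfter(List<?> list, List<Integer> fieldIds) {
    int elementFieldId = lookupFieldId(fieldIds, 0);
    List<Object> result = new ArrayList<>(list.size());
    for (Object element : list) {
      result.add(convertValue(element, elementFieldId));
    }
    return result;
  }

  // "After" shape for maps: both field ids hoisted, map sized for the expected
  // entry count so it should not need to rehash while being filled.
  static Map<Object, Object> convertMapAfter(Map<?, ?> map, List<Integer> fieldIds) {
    int keyFieldId = lookupFieldId(fieldIds, 0);
    int valueFieldId = lookupFieldId(fieldIds, 1);
    // JDK-only approximation of an "expected size" constructor (default load factor 0.75).
    Map<Object, Object> result = new HashMap<>((int) (map.size() / 0.75f) + 1);
    for (Map.Entry<?, ?> entry : map.entrySet()) {
      result.put(
          convertValue(entry.getKey(), keyFieldId),
          convertValue(entry.getValue(), valueFieldId));
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> fieldIds = List.of(7, 8);
    List<String> input = List.of("a", "b", "c");
    // Both shapes should yield equal lists in the same order.
    System.out.println(
        convertListBefore(input, fieldIds).equals(convertListAfter(input, fieldIds)));
    System.out.println(convertMapAfter(Map.of("k", 1), fieldIds));
  }
}
```

The sketch only shows the structural difference (hoisted lookups, pre-sized result, no stream); the timings in the table come from the PR author's own microbench, not from this code.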
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
For additional commands, e-mail: [email protected]
