[PR] Kafka Connect: Precompute UUID-as-bytes flag in RecordConverter [iceberg]

via GitHub Mon, 01 Jun 2026 20:53:19 -0700


wombatu-kun opened a new pull request, #16654:
URL: https://github.com/apache/iceberg/pull/16654


   `RecordConverter.convertUUID` re-evaluated 
`FileFormat.PARQUET.name().toLowerCase(Locale.ROOT).equals(config.writeProps().get(DEFAULT_FILE_FORMAT))`
 for every UUID-typed value. The write file format is fixed for the converter's 
lifetime (`writeProps` is set once on the config), so this boolean is constant. 
Even so, `name()` followed by `toLowerCase` allocated a fresh `"parquet"` String 
on every call, on top of a map lookup and an `equals` comparison.
   
   This PR resolves the flag once in the constructor (`writeUuidAsBytes`), 
reducing the format check in `convertUUID` to a field read. Behavior is 
unchanged: the same 16-byte representation is returned for Parquet and the same 
UUID otherwise.
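
For readers without the diff at hand, the pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual Iceberg code: `SketchFileFormat`, `UuidConverterSketch`, and the `toBytes` helper are hypothetical stand-ins, and the `"write.format.default"` key is assumed to be what `DEFAULT_FILE_FORMAT` refers to.

```java
import java.nio.ByteBuffer;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the real file format enum; illustration only.
enum SketchFileFormat { PARQUET, ORC, AVRO }

// Hypothetical stand-in for the converter; not the Iceberg RecordConverter.
class UuidConverterSketch {
  // Assumed property key behind the DEFAULT_FILE_FORMAT constant.
  static final String DEFAULT_FILE_FORMAT = "write.format.default";

  // Resolved once, because the write properties do not change afterwards.
  private final boolean writeUuidAsBytes;

  UuidConverterSketch(Map<String, String> writeProps) {
    // Same expression as before, but evaluated a single time per converter.
    this.writeUuidAsBytes =
        SketchFileFormat.PARQUET.name().toLowerCase(Locale.ROOT)
            .equals(writeProps.get(DEFAULT_FILE_FORMAT));
  }

  Object convertUUID(Object value) {
    UUID uuid;
    if (value instanceof UUID) {
      uuid = (UUID) value;
    } else if (value instanceof String) {
      uuid = UUID.fromString((String) value);
    } else {
      throw new IllegalArgumentException("Cannot convert to UUID: " + value);
    }
    // The format check is now a plain field read.
    return writeUuidAsBytes ? toBytes(uuid) : uuid;
  }

  // Packs the UUID into 16 bytes, most significant bits first.
  private static byte[] toBytes(UUID uuid) {
    ByteBuffer buffer = ByteBuffer.allocate(16); // big-endian by default
    buffer.putLong(uuid.getMostSignificantBits());
    buffer.putLong(uuid.getLeastSignificantBits());
    return buffer.array();
  }
}
```

In this sketch a converter built with `Map.of("write.format.default", "parquet")` returns a `byte[]` of length 16, while one built with any other format, or with an empty map, returns the `UUID` unchanged.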
   
   A throwaway A/B microbenchmark over the whole `convertUUID` method (2M 
iterations x 9 trials, median reported; the baseline mirrors the current inline 
expression, the optimized variant uses the precomputed boolean) showed the 
per-value cost dropping:
   
   | input | format | before | after | time saved |
   |---|---|---|---|---|
   | String | parquet | 53.6 ns | 32.5 ns | 39% |
   | String | orc | 46.1 ns | 26.1 ns | 43% |
   | UUID | parquet | 32.8 ns | 5.9 ns | 82% |
   | UUID | orc | 22.3 ns | 2.4 ns | 89% |
   
   That is roughly 20-27 ns saved per UUID value, or about 40% of the method's 
cost on String inputs (the common Kafka Connect case). The numbers are 
indicative wall-clock timings from an ad hoc microbenchmark, not JMH results.
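
The harness itself is not part of this PR. A minimal sketch of a median-of-trials timing loop along the lines described (N iterations per trial, several trials, median reported) might look like the following. The class and method names are hypothetical, and it has the usual caveats of hand-rolled timing (no warmup control, no fork isolation), which is why the figures above are only indicative.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical ad hoc timing helper; not the harness used for the table above.
class MedianTimingSketch {
  // Storing results in a static field makes it harder for the JIT to discard
  // the measured work; JMH's Blackhole is the robust way to do this.
  static Object sink;

  // Runs `op` `iterations` times per trial and returns the median ns per call.
  static double medianNanosPerOp(Supplier<Object> op, int iterations, int trials) {
    double[] perOp = new double[trials];
    for (int t = 0; t < trials; t++) {
      long start = System.nanoTime();
      for (int i = 0; i < iterations; i++) {
        sink = op.get();
      }
      perOp[t] = (System.nanoTime() - start) / (double) iterations;
    }
    return median(perOp);
  }

  // Median of a copy of the input, so the caller's array is left untouched.
  static double median(double[] values) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    return sorted.length % 2 == 1
        ? sorted[mid]
        : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }
}
```

With the figures quoted above, a call would be `MedianTimingSketch.medianNanosPerOp(() -> converter.convertUUID(input), 2_000_000, 9)`, run once per variant and input type, where `converter` and `input` are whatever is being compared.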
   
   The existing `TestRecordConverter` already covers the conversion (including 
`testUUIDConversionWithParquet`); its mock config now defaults `writeProps()` 
to an empty map to mirror production, where it is never null.
   



