[PR] Flink: Avoid redundant java.time allocations in ORC timestamp writers [iceberg]

via GitHub Mon, 08 Jun 2026 21:36:45 -0700


wombatu-kun opened a new pull request, #16733:
URL: https://github.com/apache/iceberg/pull/16733


   Follow-up to #16731, which removed the same redundant `java.time` 
allocations from the ORC timestamp *readers* (generic and Flink). This PR does 
the writer side for Flink.
   
   `FlinkOrcWriters` defines four ORC timestamp writer classes - 
`TimestampWriter`, `TimestampTzWriter`, `TimestampNanoWriter` and 
`TimestampNanoTzWriter` - each of which converted a `TimestampData` to a 
`java.time` object per value to derive the ORC column-vector fields. The 
no-zone writers (`TimestampWriter`, `TimestampNanoWriter`) used 
`data.toInstant().atOffset(ZoneOffset.UTC)`, which allocates, per value, a 
transient `Instant`, an `OffsetDateTime`, and a `ZoneRules` (plus a small 
backing array) - because `instant.atOffset(ZoneOffset.UTC)` routes through 
`OffsetDateTime.ofInstant(instant, zone)`, which calls `zone.getRules()`, and 
`ZoneOffset.getRules()` returns a freshly allocated `ZoneRules` on every call.
   
   `TimestampData` already exposes the values ORC needs without any allocation: 
`getMillisecond()` (epoch millis) and `getNanoOfMillisecond()`. This rewrites 
all four writers to read those directly and fill the `TimestampColumnVector` 
with plain arithmetic. ORC's `TimestampColumnVector` stores `time` as epoch 
millis and `nanos` as nanoseconds-within-the-second, so:
   
       long millis = data.getMillisecond();
       cv.time[rowId] = millis;
       int nanosOfSecond = (int) Math.floorMod(millis, 1_000L) * 1_000_000 + 
data.getNanoOfMillisecond();
       cv.nanos[rowId] = nanosOfSecond / 1_000 * 1_000;   // micros writers; 
the nano writers keep nanosOfSecond
   
   This is behavior-preserving. The old code computed `cv.time` as 
`instant.toEpochMilli()`, which equals `data.getMillisecond()`, and `cv.nanos` 
from the instant's nanos-of-second, which equals `Math.floorMod(millis, 1000) * 
1_000_000 + data.getNanoOfMillisecond()`; `Math.floorMod` yields the correct 
non-negative milli-of-second for pre-1970 (negative) timestamps. The now-unused 
`java.time` imports and two stale 
`@SuppressWarnings("JavaInstantGetSecondsGetNano")` are removed. The change is 
applied identically across the three Flink version trees (v1.20, v2.0, v2.1) - 
that is, all four writer classes in each of the three copies of 
`FlinkOrcWriters` (3 files total).
   
   Benchmark (JMH, JDK 17, `SingleShotTime`, `gc` profiler; allocation is 
`gc.alloc.rate.norm`, which is deterministic), driving the value writers over a 
reused `TimestampColumnVector`, 10.24M writes/op:
   
   | Writer | time (before -> after) | allocation (before -> after) |
   | --- | --- | --- |
   | timestamp (no zone) | 0.376 -> 0.026 s/op | 144 -> ~0 B/value |
   | timestamptz | 0.073 -> 0.030 s/op | already ~0 B/value |
   
   The no-zone writers carried the real allocation (the `ZoneRules` from 
`atOffset`); the tz writers were already allocation-free (the JIT 
scalar-replaces their `Instant`) but still gain from dropping the 
`Instant`/`toEpochMilli` work. The generic data-module ORC writers were checked 
too, but they use `toInstant().toEpochMilli()` without `atOffset`, which the 
JIT already scalar-replaces even on the end-to-end write path (no measurable 
allocation), so they are intentionally left unchanged.
   
   Testing: covered by the existing `TestFlinkOrcReaderWriter` round-trip test 
(writes then reads back, including pre-1970 timestamps and multiple timezones); 
passes for all three Flink versions.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] Flink: Avoid redundant java.time allocations in ORC timestamp writers [iceberg]

Reply via email to