wombatu-kun opened a new pull request, #16731:
URL: https://github.com/apache/iceberg/pull/16731

   The generic and Flink ORC timestamp value readers build throwaway 
`java.time` objects on every value. Reading a `timestamp`/`timestamptz` column 
went through `Instant.ofEpochSecond(...).atOffset(ZoneOffset.UTC)[...]`, which 
allocates, per value: a transient `Instant`; an `OffsetDateTime` that is 
immediately discarded (for the readers that ultimately return a 
`LocalDateTime`/`TimestampData`); and, less obviously, a `ZoneRules` object 
plus a small backing array, because `instant.atOffset(ZoneOffset.UTC)` routes 
through `OffsetDateTime.ofInstant(instant, zone)`, which calls 
`zone.getRules()`, and `ZoneOffset.getRules()` returns a freshly allocated 
`ZoneRules` on every call.
   
   This replaces those expressions with direct constructions that pass the 
`ZoneOffset` straight to the factory methods and never call `getRules()`. For 
example, in `GenericOrcReaders`:
   
       // TimestampReader (-> LocalDateTime)
       -  Instant.ofEpochSecond(Math.floorDiv(tcv.time[row], 1_000), 
tcv.nanos[row])
       -      .atOffset(ZoneOffset.UTC)
       -      .toLocalDateTime();
       +  LocalDateTime.ofEpochSecond(Math.floorDiv(tcv.time[row], 1_000), 
tcv.nanos[row], ZoneOffset.UTC);
   
   The `timestamptz` readers follow the same shape 
(`OffsetDateTime.of(LocalDateTime.ofEpochSecond(...), UTC)` in the generic 
reader; `TimestampData.fromInstant(Instant.ofEpochSecond(...))` in Flink). The 
Flink `TimestampTzReader` additionally did 
`Instant.ofEpochSecond(...).atOffset(UTC).toInstant()`, an identity round-trip 
back to an `Instant`, which is now removed.
   
   All rewrites are behavior-preserving. 
`instant.atOffset(UTC).toLocalDateTime()` is exactly what 
`LocalDateTime.ofEpochSecond(seconds, nanoOfSecond, UTC)` produces, and the 
readers always pass `tcv.nanos[row]`, which is in `[0, 1_000_000_000)` (it 
carries `java.sql.Timestamp.getNanos()` semantics; the whole second and the 
sign live in `tcv.time` and are split out with `Math.floorDiv`), so the 
`nanoOfSecond` range contract of `LocalDateTime.ofEpochSecond` is always 
satisfied.
   
   Affected readers:
   
   - iceberg-orc `GenericOrcReaders` (`TimestampReader`, `TimestampTzReader`) - 
the generic `Record` ORC path, used by `IcebergGenerics` and by delete-file 
reads via `BaseDeleteLoader`.
   - iceberg-flink `FlinkOrcReaders` for v1.20, v2.0 and v2.1 
(`TimestampReader`, `TimestampTzReader`) - the Flink ORC scan path.
   
   Benchmarks (JMH, JDK 17, `SingleShotTime`, `gc` profiler; allocation is 
`gc.alloc.rate.norm`, which is deterministic):
   
   Generic `Record` read, 2.5M rows x 8 timestamp columns (20M values/op):
   
   | Metric | Before | After |
   | --- | --- | --- |
   | time | 2.431 s/op | 1.983 s/op (-18%) |
   | allocation | 7.434 GB/op | 5.034 GB/op (-32%, -120 B/value) |
   
   Flink `TimestampData` readers, 10.24M values/op:
   
   | Reader | time (before -> after) | allocation (before -> after) |
   | --- | --- | --- |
   | timestamp | 0.358 -> 0.291 s/op | 144 -> 24 B/value (-83%) |
   | timestamptz | 0.380 -> 0.070 s/op | 168 -> 24 B/value (-86%) |
   
   After the change both Flink readers bottom out at the 24-byte 
`TimestampData` result, because simplifying the expression also lets the JIT 
scalar-replace the residual `LocalDateTime`/`LocalDate`/`LocalTime` (in the 
original, the `atOffset` detour and the escaping `ZoneRules` defeated escape 
analysis).
   
   Testing: covered by existing round-trip tests - `TestGenericData` 
(parameterized over `timestamp`, `timestamptz`, `timestamp_ns`, 
`timestamptz_ns`, including negative epochs and a cross-timezone write/read) 
and `TestFlinkOrcReaderWriter`; both pass with the change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to