wombatu-kun opened a new pull request, #16731:
URL: https://github.com/apache/iceberg/pull/16731
The generic and Flink ORC timestamp value readers build throwaway
`java.time` objects on every value. Reading a `timestamp`/`timestamptz` column
went through `Instant.ofEpochSecond(...).atOffset(ZoneOffset.UTC)[...]`, which
allocates, per value: a transient `Instant`; an `OffsetDateTime` that is
immediately discarded (for the readers that ultimately return a
`LocalDateTime`/`TimestampData`); and, less obviously, a `ZoneRules` object
plus a small backing array, because `instant.atOffset(ZoneOffset.UTC)` routes
through `OffsetDateTime.ofInstant(instant, zone)`, which calls
`zone.getRules()`, and `ZoneOffset.getRules()` returns a freshly allocated
`ZoneRules` on every call.
This replaces those expressions with direct constructions that pass the
`ZoneOffset` straight to the factory methods and never call `getRules()`. For
example, in `GenericOrcReaders`:
// TimestampReader (-> LocalDateTime)
- Instant.ofEpochSecond(Math.floorDiv(tcv.time[row], 1_000),
tcv.nanos[row])
- .atOffset(ZoneOffset.UTC)
- .toLocalDateTime();
+ LocalDateTime.ofEpochSecond(Math.floorDiv(tcv.time[row], 1_000),
tcv.nanos[row], ZoneOffset.UTC);
The `timestamptz` readers follow the same shape
(`OffsetDateTime.of(LocalDateTime.ofEpochSecond(...), UTC)` in the generic
reader; `TimestampData.fromInstant(Instant.ofEpochSecond(...))` in Flink). The
Flink `TimestampTzReader` additionally did
`Instant.ofEpochSecond(...).atOffset(UTC).toInstant()`, an identity round-trip
back to an `Instant`, which is now removed.
All rewrites are behavior-preserving.
`instant.atOffset(UTC).toLocalDateTime()` is exactly what
`LocalDateTime.ofEpochSecond(seconds, nanoOfSecond, UTC)` produces, and the
readers always pass `tcv.nanos[row]`, which is in `[0, 1_000_000_000)` (it
carries `java.sql.Timestamp.getNanos()` semantics; the whole second and the
sign live in `tcv.time` and are split out with `Math.floorDiv`), so the
`nanoOfSecond` range contract of `LocalDateTime.ofEpochSecond` is always
satisfied.
Affected readers:
- iceberg-orc `GenericOrcReaders` (`TimestampReader`, `TimestampTzReader`) -
the generic `Record` ORC path, used by `IcebergGenerics` and by delete-file
reads via `BaseDeleteLoader`.
- iceberg-flink `FlinkOrcReaders` for v1.20, v2.0 and v2.1
(`TimestampReader`, `TimestampTzReader`) - the Flink ORC scan path.
Benchmarks (JMH, JDK 17, `SingleShotTime`, `gc` profiler; allocation is
`gc.alloc.rate.norm`, which is deterministic):
Generic `Record` read, 2.5M rows x 8 timestamp columns (20M values/op):
| Metric | Before | After |
| --- | --- | --- |
| time | 2.431 s/op | 1.983 s/op (-18%) |
| allocation | 7.434 GB/op | 5.034 GB/op (-32%, -120 B/value) |
Flink `TimestampData` readers, 10.24M values/op:
| Reader | time (before -> after) | allocation (before -> after) |
| --- | --- | --- |
| timestamp | 0.358 -> 0.291 s/op | 144 -> 24 B/value (-83%) |
| timestamptz | 0.380 -> 0.070 s/op | 168 -> 24 B/value (-86%) |
After the change both Flink readers bottom out at the 24-byte
`TimestampData` result, because simplifying the expression also lets the JIT
scalar-replace the residual `LocalDateTime`/`LocalDate`/`LocalTime` (in the
original, the `atOffset` detour and the escaping `ZoneRules` defeated escape
analysis).
Testing: covered by existing round-trip tests - `TestGenericData`
(parameterized over `timestamp`, `timestamptz`, `timestamp_ns`,
`timestamptz_ns`, including negative epochs and a cross-timezone write/read)
and `TestFlinkOrcReaderWriter`; both pass with the change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]