Licht-T opened a new pull request, #56848: URL: https://github.com/apache/spark/pull/56848
### What changes were proposed in this pull request?
`date_trunc` (`TruncTimestamp`) resolves the session zone offset for each row via `ZoneRules.getOffset(Instant)`, which is a binary search over the zone's transition array. For non-fixed-offset zones it does so twice per row: once for the input instant and once for the candidate truncated instant used by the DST-equality guard from SPARK-56663 / SPARK-56769.

This PR adds a per-task `ZoneOffsetCache` that memoizes the resolved offset over the half-open epoch-second interval `[lo, hi)` on which it is provably constant. The interval is derived from the surrounding zone transitions (`nextTransition` / `previousTransition`, anchored on an interior point to avoid an off-by-one when an instant sits exactly on a transition). A lookup inside the cached interval reduces to two comparisons instead of a binary search.

### Why are the changes needed?
The session time zone is constant for a query, and a zone's offset is piecewise-constant between DST/historical transitions, so consecutive rows almost always fall in the same constant-offset window (analytic data is typically temporally clustered: time series, date-partitioned tables, post-sort). Repeating the transition-array binary search on every row is redundant work on the hot path.

`DateTimeBenchmark` Truncation, whole-stage codegen on, session zone `America/Los_Angeles`, OpenJDK 17 on a 12th Gen Intel i7-1260P, ns/row (lower is better):

| level | without cache (ns/row) | with cache (ns/row) | speedup |
|-------|--------------:|-----------:|--------:|
| date_trunc YEAR | 98.2 | 56.8 | 1.73x |
| date_trunc QUARTER | 109.3 | 71.7 | 1.52x |
| date_trunc MONTH | 90.8 | 53.7 | 1.69x |
| date_trunc WEEK | 77.8 | 40.6 | 1.92x |
| date_trunc DAY | 64.8 | 33.0 | 1.96x |
| date_trunc SECOND (control) | 28.7 | 27.7 | ~1.0x |

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing `DateTimeUtilsSuite` and `DateExpressionsSuite` pass.
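
For readers who want to try the interval-caching idea outside Spark, here is a minimal Java sketch of it. This is not the PR's `ZoneOffsetCache`: the class name `OffsetCacheSketch`, the method `offsetSecondsAt`, and the choice of probing `previousTransition` one second past the instant are all assumptions made for illustration. It relies only on `java.time.zone.ZoneRules`, whose `nextTransition` and `previousTransition` are strictly after and strictly before the given instant.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

// Hypothetical illustration only; not the class added by the PR.
final class OffsetCacheSketch {
    private final ZoneRules rules;
    private long lo = 0L;          // inclusive lower bound, epoch seconds
    private long hi = 0L;          // exclusive upper bound; [0, 0) starts empty
    private int offsetSeconds = 0; // offset valid for every second in [lo, hi)

    OffsetCacheSketch(ZoneId zone) {
        this.rules = zone.getRules();
    }

    // Fast path: two comparisons. Slow path: recompute the interval.
    int offsetSecondsAt(long epochSecond) {
        if (epochSecond >= lo && epochSecond < hi) {
            return offsetSeconds;
        }
        refill(epochSecond);
        return offsetSeconds;
    }

    private void refill(long epochSecond) {
        Instant instant = Instant.ofEpochSecond(epochSecond);
        offsetSeconds = rules.getOffset(instant).getTotalSeconds();

        // nextTransition is strictly after `instant`, so it is a valid
        // exclusive upper bound even when `instant` sits on a transition.
        ZoneOffsetTransition next = rules.nextTransition(instant);

        // previousTransition is strictly before its argument. Probing one
        // second later makes a transition exactly at `instant` count as the
        // lower bound, which avoids the off-by-one on transition instants.
        ZoneOffsetTransition prev =
            rules.previousTransition(instant.plusSeconds(1));

        // null means no transition on that side (e.g. fixed-offset zones).
        lo = (prev == null) ? Long.MIN_VALUE : prev.toEpochSecond();
        hi = (next == null) ? Long.MAX_VALUE : next.toEpochSecond();
    }
}
```

Calling `new OffsetCacheSketch(ZoneId.of("America/Los_Angeles")).offsetSecondsAt(s)` for temporally clustered values of `s` takes the slow path only when `s` crosses a transition. The sketch is single-threaded by design, mirroring the per-task scope described above, and it does not handle epoch seconds near the limits of `Instant`.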
### Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored with Claude Code.
