LuciferYang commented on PR #55855: URL: https://github.com/apache/spark/pull/55855#issuecomment-4457362978
Closing this. The cross-JDK benchmark run shows no measurable speedup (JDK 17 flat, JDK 21 ~6% faster, JDK 25 ~11% slower, all within noise):

| JDK | Baseline | After this PR | Delta |
|---|---|---|---|
| 17 | 33.9 ns/row | 33.9 ns/row | 0% |
| 21 | 27.5 ns/row | 25.9 ns/row | ~+6% |
| 25 | 24.7 ns/row | 27.3 ns/row | ~-11% |

Root cause: the per-row bottleneck is `DateTimeUtils.daysToMicros(days, zoneId)` itself, which constructs a `LocalDate`, then a `ZonedDateTime`, then an `Instant` for every value. That conversion dominates the ~25-30 ns/row baseline cost.

The bulk-read pattern that delivered 3-14× for the sibling PRs (SPARK-56791 / SPARK-56801 / SPARK-56802 / SPARK-56803) saves the per-row virtual dispatch on `readInteger()`, but that is only a few ns and disappears into the conversion overhead here.

Will follow up with a focused PR that fast-paths `daysToMicros` when the zone is `ZoneOffset.UTC` (mathematically `days * MICROS_PER_DAY`, no allocation needed). That's where the real win lives.
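For illustration only, here is a rough Java sketch of what such a fast path could look like. It is not the actual Spark code (`DateTimeUtils` is written in Scala), and the class name `DaysToMicrosSketch` and the helper `daysToMicrosGeneral` are hypothetical names made up for this example. The general path mirrors the `LocalDate` → `ZonedDateTime` → `Instant` chain described above; the fast path assumes the zone is exactly `ZoneOffset.UTC`.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical sketch, not the Spark implementation.
final class DaysToMicrosSketch {
    static final long MICROS_PER_DAY = 86_400_000_000L; // 24 * 60 * 60 * 1_000_000
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long NANOS_PER_MICRO = 1_000L;

    // General path: builds a LocalDate, a ZonedDateTime, and an Instant per value.
    static long daysToMicrosGeneral(int days, ZoneId zoneId) {
        Instant instant = LocalDate.ofEpochDay(days).atStartOfDay(zoneId).toInstant();
        long micros = Math.multiplyExact(instant.getEpochSecond(), MICROS_PER_SECOND);
        return Math.addExact(micros, instant.getNano() / NANOS_PER_MICRO);
    }

    // Fast path: UTC has no offset or DST transitions, so midnight of epoch
    // day `days` is exactly days * MICROS_PER_DAY microseconds since the epoch.
    static long daysToMicros(int days, ZoneId zoneId) {
        // Only matches the ZoneOffset.UTC constant; ZoneId.of("UTC") is a
        // region-based zone and falls through to the general path.
        if (ZoneOffset.UTC.equals(zoneId)) {
            return Math.multiplyExact((long) days, MICROS_PER_DAY);
        }
        return daysToMicrosGeneral(days, zoneId);
    }
}
```

In this sketch both paths should agree for UTC, so the fast path changes only the cost, not the result. Whether the equality check is the right way to detect UTC in Spark is an open assumption.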
