LuciferYang opened a new pull request, #55893:
URL: https://github.com/apache/spark/pull/55893

   ### What changes were proposed in this pull request?
   
   Add a `ZoneOffset.UTC` fast path to `DateTimeUtils.daysToMicros(days, 
zoneId)`:
   
   ```scala
   if (zoneId eq ZoneOffset.UTC) {
     Math.multiplyExact(days.toLong, MICROS_PER_DAY)
   } else {
     // existing LocalDate -> ZonedDateTime -> Instant path
   }
   ```
   
   For UTC the answer is simply `days * MICROS_PER_DAY`, so the slow path's 
three heap allocations (`LocalDate`, `ZonedDateTime`, `Instant`) are wasted.
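   
   A minimal sketch of why the two routes agree for UTC, assuming 
`MICROS_PER_DAY` is the number of microseconds in a day. The object and method 
names below are hypothetical and are not Spark's actual code:
   
   ```scala
   import java.time.{Instant, LocalDate, ZoneOffset, ZonedDateTime}

   object UtcChainSketch {
     // Assumed constant: microseconds in one day.
     val MICROS_PER_DAY: Long = 24L * 60 * 60 * 1000 * 1000

     // General route: three intermediate objects are created per call.
     def viaJavaTime(days: Int): Long = {
       val date: LocalDate = LocalDate.ofEpochDay(days.toLong)
       val zoned: ZonedDateTime = date.atStartOfDay(ZoneOffset.UTC)
       val instant: Instant = zoned.toInstant
       // Midnight UTC has no sub-second part, so the nanos field is ignored here.
       Math.multiplyExact(instant.getEpochSecond, 1000000L)
     }

     // UTC shortcut: a single checked multiplication, no allocation.
     def viaMultiply(days: Int): Long =
       Math.multiplyExact(days.toLong, MICROS_PER_DAY)
   }
   ```
   
   With these definitions, `UtcChainSketch.viaJavaTime(d)` and 
`UtcChainSketch.viaMultiply(d)` should return the same value for any `d` that 
does not overflow.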
   
   ### Why are the changes needed?
   
   `daysToMicros(days, ZoneOffset.UTC)` is on the per-row hot path of the 
vectorized parquet reader (`DateToTimestampNTZUpdater` and the rebase 
variants), the row-based parquet converter (`ParquetRowConverter`), the Avro 
reader (`AvroDeserializer`), and DATE -> TIMESTAMP `Cast` (interpreted + 
codegen). All of them pass the `ZoneOffset.UTC` singleton, so the 
reference-equality fast path triggers everywhere it matters.
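   
   The reference-equality check is deliberately narrow. As a rough 
illustration (the object name is hypothetical), a region-style zone such as 
`ZoneId.of("UTC")` has UTC rules but is a different object from the 
`ZoneOffset.UTC` singleton, so it would not hit the fast path and would still 
be handled by the existing path:
   
   ```scala
   import java.time.{ZoneId, ZoneOffset}

   object UtcIdentitySketch {
     // The singleton that the callers listed above are said to pass.
     val singleton: ZoneId = ZoneOffset.UTC
     // A region-style id with UTC rules, but a distinct object.
     val region: ZoneId = ZoneId.of("UTC")

     // `eq` compares references, so only the singleton matches.
     val singletonHitsFastPath: Boolean = singleton eq ZoneOffset.UTC
     val regionHitsFastPath: Boolean = region eq ZoneOffset.UTC

     // Normalizing the region id yields an offset equal to ZoneOffset.UTC.
     val regionNormalizesToUtc: Boolean = region.normalized() == ZoneOffset.UTC
   }
   ```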
   
   `ParquetVectorUpdaterBenchmark` results will be regenerated via GitHub 
Actions.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No -- pure optimization. Behavior is preserved for every input in Spark's 
valid `DateType` range. Both paths use `Math.multiplyExact` internally and 
overflow at the same boundary, `|days| > Long.MaxValue / MICROS_PER_DAY` 
(about 107M), with the same `ArithmeticException`, far outside any reachable 
input.
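   
   The boundary can be worked out directly. The sketch below assumes 
`MICROS_PER_DAY` is 86,400,000,000 and uses a hypothetical `maxSafeDays` name 
for the largest day count that still fits in a `Long` of microseconds:
   
   ```scala
   object OverflowBoundarySketch {
     // Assumed constant: microseconds in one day.
     val MICROS_PER_DAY: Long = 86400L * 1000000L

     // Largest |days| whose microsecond count still fits in a Long.
     val maxSafeDays: Long = Long.MaxValue / MICROS_PER_DAY

     // True if the checked multiplication throws ArithmeticException.
     def overflows(days: Long): Boolean =
       try { Math.multiplyExact(days, MICROS_PER_DAY); false }
       catch { case _: ArithmeticException => true }
   }
   ```
   
   Here `maxSafeDays` evaluates to 106751991, and `overflows` is false at 
`+/-maxSafeDays` but true one day beyond in either direction.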
   
   ### How was this patch tested?
   
   New `DateTimeUtilsSuite` contract test pins down the UTC fast path:
   
   - Asserts it agrees with the existing path for a zero-offset zone 
(`Etc/GMT`) over a representative set of `days` (zero, positive, negative, 
`+/-maxSafeDays`).
   - Asserts it equals `days * MICROS_PER_DAY` directly, so divergence in some 
future JDK is caught.
   - Asserts `ArithmeticException` on overflow.
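   
   The three checks above could look roughly like the following 
plain-assertion sketch. It is not the actual `DateTimeUtilsSuite` code: 
`fastPath`, `javaTimePath`, and the sample values are stand-ins chosen for 
illustration.
   
   ```scala
   import java.time.{LocalDate, ZoneId}

   object UtcFastPathContractSketch {
     // Assumed constant: microseconds in one day.
     val MICROS_PER_DAY: Long = 86400L * 1000000L
     val maxSafeDays: Int = (Long.MaxValue / MICROS_PER_DAY).toInt

     // Stand-in for the UTC fast path.
     def fastPath(days: Int): Long =
       Math.multiplyExact(days.toLong, MICROS_PER_DAY)

     // Stand-in for the existing LocalDate -> ZonedDateTime -> Instant path.
     def javaTimePath(days: Int, zoneId: ZoneId): Long = {
       val instant = LocalDate.ofEpochDay(days.toLong).atStartOfDay(zoneId).toInstant
       val micros = Math.multiplyExact(instant.getEpochSecond, 1000000L)
       Math.addExact(micros, instant.getNano / 1000L)
     }

     val samples: Seq[Int] = Seq(0, 1, -1, 18000, -18000, maxSafeDays, -maxSafeDays)

     // Check 1: agreement with a zero-offset region zone.
     val agreesWithZone: Boolean =
       samples.forall(d => fastPath(d) == javaTimePath(d, ZoneId.of("Etc/GMT")))

     // Check 2: equality with the plain product.
     val equalsProduct: Boolean =
       samples.forall(d => fastPath(d) == d.toLong * MICROS_PER_DAY)

     // Check 3: overflow surfaces as ArithmeticException.
     val throwsOnOverflow: Boolean =
       try { fastPath(maxSafeDays + 1); false }
       catch { case _: ArithmeticException => true }
   }
   ```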
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

