MaxGekk opened a new pull request, #56602: URL: https://github.com/apache/spark/pull/56602
### What changes were proposed in this pull request?

This PR adds a new built-in function `unix_nanos(expr)` that returns the number of nanoseconds since `1970-01-01 00:00:00 UTC` for a nanosecond-precision timestamp.

Concretely:

- Adds a `UnixNanos` expression in `datetimeExpressions.scala` that accepts only the nanosecond-precision timestamp types `TIMESTAMP_LTZ(p)` / `TIMESTAMP_NTZ(p)` (`p in [7, 9]`, i.e. `AnyTimestampNanoType`) and returns a lossless `DECIMAL(21, 0)`.
- Computes `epochMicros * 1000 + nanosWithinMicro` via `BigInteger` in both the interpreted (`eval`) and codegen (`doGenCode`) paths. A `BIGINT` return type was rejected because `epochMicros * 1000` overflows 64 bits across the full `[0001..9999]` calendar range; `DECIMAL(21, 0)` is wide enough for every value (`~2.5e20` max) and stays lossless.
- Registers `unix_nanos` in `FunctionRegistry` and adds the Scala `functions.unix_nanos`.
- Adds catalyst unit tests (interpreted + codegen), Scala/SQL end-to-end tests, and SQL golden-file coverage for `TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)`.

The microsecond `TimestampType` input and the PySpark / Spark Connect / R surfaces are out of scope here and tracked as follow-ups; `unix_nanos` is recorded in the PySpark function-parity allowlist in the meantime.

### Why are the changes needed?

Part of the [SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822) umbrella (timestamps with nanosecond precision). Spark has `unix_seconds` / `unix_millis` / `unix_micros` but no nanosecond counterpart, which is the natural inverse of nanosecond timestamp construction.

### Does this PR introduce _any_ user-facing change?

Yes. A new `unix_nanos(timeExp)` function is available in SQL and the Scala API. It accepts `TIMESTAMP_LTZ(p)` / `TIMESTAMP_NTZ(p)` and returns `DECIMAL(21, 0)`. This is a change only within the unreleased nanosecond-timestamp preview.

Example:

```sql
SELECT unix_nanos(TIMESTAMP_NTZ '2008-12-25 15:30:00.123456789');
-- 1230219000123456789
```

### How was this patch tested?

- `build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.expressions.DateExpressionsSuite'`
- `build/sbt 'sql/testOnly org.apache.spark.sql.TimestampNanosFunctionsAnsiOnSuite org.apache.spark.sql.TimestampNanosFunctionsAnsiOffSuite'`
- `build/sbt 'sql/testOnly org.apache.spark.sql.expressions.ExpressionInfoSuite org.apache.spark.sql.ExpressionsSchemaSuite'`
- `SPARK_GENERATE_GOLDEN_FILES=1 build/sbt 'sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z "nanos"'`
- `./dev/scalastyle`

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor
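
As an illustration of the `epochMicros * 1000 + nanosWithinMicro` arithmetic and the `BIGINT` overflow argument described above, here is a minimal standalone sketch. It is not the code from this PR: the class `UnixNanosSketch` and the helpers `toUnixNanos` and `fitsInLong` are hypothetical names, and the sketch assumes the timestamp is already split into epoch microseconds plus a 0..999 nanosecond remainder.

```java
import java.math.BigInteger;

// Hypothetical sketch, not Spark's UnixNanos expression.
final class UnixNanosSketch {

    // Combines epoch microseconds with the nanoseconds inside that microsecond
    // (assumed to be in 0..999) without any 64-bit overflow.
    public static BigInteger toUnixNanos(long epochMicros, int nanosWithinMicro) {
        return BigInteger.valueOf(epochMicros)
            .multiply(BigInteger.valueOf(1000L))
            .add(BigInteger.valueOf(nanosWithinMicro));
    }

    // Reports whether epochMicros * 1000 would still fit in a signed 64-bit long.
    public static boolean fitsInLong(long epochMicros) {
        try {
            Math.multiplyExact(epochMicros, 1000L);
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 2008-12-25 15:30:00.123456 UTC as epoch microseconds, plus 789 ns.
        long micros = 1230219000123456L;
        System.out.println(toUnixNanos(micros, 789)); // 1230219000123456789

        // 9999-12-31 23:59:59.999999 UTC as epoch microseconds.
        long maxMicros = 253402300799999999L;
        System.out.println(fitsInLong(maxMicros));        // false
        System.out.println(toUnixNanos(maxMicros, 999));  // 253402300799999999999
    }
}
```

The last value has 21 digits, which matches the choice of `DECIMAL(21, 0)` over `BIGINT` for the return type.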
