[ 
https://issues.apache.org/jira/browse/SPARK-57527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-57527:
-----------------------------
    Summary: Add the `unix_nanos` function returning nanoseconds since the 
epoch for timestamps  (was: Add the unix_nanos function returning nanoseconds 
since the epoch for timestamps)

> Add the `unix_nanos` function returning nanoseconds since the epoch for 
> timestamps
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-57527
>                 URL: https://issues.apache.org/jira/browse/SPARK-57527
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Priority: Major
>
> h3. Background
> Part of the SPARK-56822 umbrella. Spark has {{unix_seconds}},
> {{unix_millis}} and {{unix_micros}}, which return the number of seconds,
> milliseconds and microseconds, respectively, since the Unix epoch for a
> timestamp. There is no nanosecond counterpart, which would be the natural
> inverse of {{timestamp_nanos}}.
> h3. Proposal
> Add a new built-in function {{unix_nanos(expr)}} that returns the number of
> nanoseconds since {{1970-01-01 00:00:00 UTC}} for a timestamp value.
> * Input: nanosecond-precision timestamps ({{AnyTimestampNanoType}}) and the
>   microsecond-precision {{TimestampType}} (microsecond-only inputs contribute
>   {{0}} for the sub-microsecond part).
> * Value: {{epochMicros * 1000 + nanosWithinMicro}}, with both parts read from
>   {{TimestampNanosVal}}.
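
The value formula in the proposal can be sketched in plain Python. This is only an illustration of the arithmetic, not Spark code: the function names `unix_nanos` and `split_nanos` below are hypothetical helpers, and the check that `nanos_within_micro` lies in [0, 999] is an assumption inferred from the field name rather than something the issue states.

```python
from typing import Tuple

NANOS_PER_MICRO = 1_000


def unix_nanos(epoch_micros: int, nanos_within_micro: int = 0) -> int:
    """Combine microseconds since the epoch and the sub-microsecond part.

    Microsecond-only inputs use the default of 0 for the sub-microsecond part.
    """
    # Assumption: the sub-microsecond part is always in [0, 999].
    if not 0 <= nanos_within_micro < NANOS_PER_MICRO:
        raise ValueError("nanos_within_micro must be in [0, 999]")
    return epoch_micros * NANOS_PER_MICRO + nanos_within_micro


def split_nanos(epoch_nanos: int) -> Tuple[int, int]:
    """Inverse direction: split nanoseconds into (epoch_micros, nanos_within_micro)."""
    # divmod floors toward negative infinity, so the remainder stays in
    # [0, 999] even for instants before 1970.
    return divmod(epoch_nanos, NANOS_PER_MICRO)


# 2008-12-25 15:30:00.123456789 UTC
value = unix_nanos(1_230_219_000_123_456, 789)  # 1230219000123456789
```

Python integers are unbounded, so the sketch never overflows; the fixed-width case is what the return-type question below is about.
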
> h3. Open question - return type
> For the full timestamp calendar range ([0001..9999]), {{epochMicros * 1000}}
> overflows a 64-bit {{BIGINT}}. Options:
> # Return {{DECIMAL(p,0)}} wide enough for the full range (recommended;
>   lossless).
> # Return {{BIGINT}} and throw on overflow in ANSI mode / return null
>   otherwise (matches {{unix_micros}} shape but limits the usable range to
>   ~[1677..2262]).
> Recommendation: return {{DECIMAL}} to stay lossless across the supported
> range.
> {code:sql}
> SELECT unix_nanos(TIMESTAMP_LTZ '2008-12-25 15:30:00.123456789');
> -- 1230219000123456789 (assuming the session time zone is UTC)
> {code}
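
The overflow concern can be checked with a small Python sketch. It assumes the calendar range runs from 0001-01-01 00:00:00 to 9999-12-31 23:59:59.999999999 UTC in the proleptic Gregorian calendar (which is what Python's `datetime` uses); the helper names are hypothetical and none of this is Spark code.

```python
from datetime import datetime, timedelta

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1
EPOCH = datetime(1970, 1, 1)  # naive datetime, treated as UTC


def nanos_since_epoch(ts: datetime, extra_nanos: int = 0) -> int:
    """Nanoseconds since the epoch; extra_nanos is the sub-microsecond part."""
    delta = ts - EPOCH
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return micros * 1_000 + extra_nanos


def fits_in_bigint(nanos: int) -> bool:
    """True if the value can be stored in a signed 64-bit integer."""
    return INT64_MIN <= nanos <= INT64_MAX


# Extremes of the assumed calendar range.
lowest = nanos_since_epoch(datetime(1, 1, 1))
highest = nanos_since_epoch(datetime(9999, 12, 31, 23, 59, 59, 999_999), 999)

# Decimal digits needed to hold either extreme without loss.
digits_needed = max(len(str(abs(lowest))), len(str(abs(highest))))  # 21

# Instants (rounded to microseconds) where a 64-bit nanosecond count runs out.
bigint_first = EPOCH + timedelta(microseconds=INT64_MIN // 1_000)  # year 1677
bigint_last = EPOCH + timedelta(microseconds=INT64_MAX // 1_000)   # year 2262
```

Under these assumptions both extremes fall outside the 64-bit range and 21 decimal digits would be enough, which is well within Spark's maximum {{DECIMAL}} precision of 38. The actual precision {{p}} is left to the return-type decision above.
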
> h3. Implementation notes
> * Mirror {{UnixMicros}} in {{datetimeExpressions.scala}}; add
>   {{AnyTimestampNanoType}} to the accepted input types.
> * Read the value from the {{TimestampNanosVal}} fields ({{epochMicros}} and
>   {{nanosWithinMicro}}).
> * Implement both the interpreted and codegen paths; register the function in
>   {{FunctionRegistry}}.
> h3. Acceptance criteria
> * {{unix_nanos}} available across SQL/Scala/Python/R.
> * Round-trips with {{timestamp_nanos}} for in-range values.
> * Return-type / overflow decision documented and tested.
> * Unit tests + golden files added.
> Parent: SPARK-56822


