Max Gekk created SPARK-57527:
--------------------------------

             Summary: Add the unix_nanos function returning nanoseconds since the epoch for timestamps
                 Key: SPARK-57527
                 URL: https://issues.apache.org/jira/browse/SPARK-57527
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk


h3. Background
Part of the SPARK-56822 umbrella. Spark has {{unix_seconds}}, {{unix_millis}} and
{{unix_micros}}, which return the number of seconds, milliseconds or microseconds
since the Unix epoch for a timestamp. There is no nanosecond counterpart, which
would be the natural inverse of {{timestamp_nanos}}.

h3. Proposal
Add a new built-in function {{unix_nanos(expr)}} that returns the number of nanoseconds
since {{1970-01-01 00:00:00 UTC}} for a timestamp value.

* Input: nanosecond-precision timestamps ({{AnyTimestampNanoType}}) and the microsecond
  {{TimestampType}} (micros-only inputs contribute {{0}} for the sub-micro part).
* Value: {{epochMicros * 1000 + nanosWithinMicro}}, read from {{TimestampNanosVal}}.
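
The value rule above can be sketched in plain Python. This is an illustrative model only, not Spark code: the helper name {{unix_nanos_value}} is hypothetical, and it assumes {{nanosWithinMicro}} is always in [0, 999] with {{epochMicros}} floored toward negative infinity for pre-epoch instants.

```python
def unix_nanos_value(epoch_micros: int, nanos_within_micro: int = 0) -> int:
    """Model of epochMicros * 1000 + nanosWithinMicro (hypothetical helper)."""
    # Assumption: the sub-microsecond part is a non-negative offset in [0, 999].
    if not 0 <= nanos_within_micro <= 999:
        raise ValueError("nanos_within_micro must be in [0, 999]")
    # Python ints are arbitrary precision, so nothing overflows in this model;
    # the return-type question for Spark is a separate concern.
    return epoch_micros * 1000 + nanos_within_micro


# Nanosecond-precision input: 2008-12-25 15:30:00.123456789 UTC.
print(unix_nanos_value(1230219000123456, 789))  # 1230219000123456789
# Microsecond TimestampType input: the sub-micro part contributes 0.
print(unix_nanos_value(1230219000123456))       # 1230219000123456000
# One nanosecond before the epoch under the floored representation.
print(unix_nanos_value(-1, 999))                # -1
```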

h3. Open question - return type
For the full timestamp calendar range ([0001..9999]), {{epochMicros * 1000}} overflows a
64-bit {{BIGINT}}. Options:
# Return {{DECIMAL(p,0)}} wide enough for the full range (recommended; lossless).
# Return {{BIGINT}} and throw on overflow in ANSI mode / return null otherwise (matches
  {{unix_micros}} shape but limits the usable range to ~[1677..2262]).

Recommendation: return {{DECIMAL}} to stay lossless across the supported range.
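
To make the trade-off concrete, the following Python sketch computes the nanosecond values at both ends of the calendar range, the number of digits a {{DECIMAL(p,0)}} would need, and the years a signed 64-bit value can still cover. It assumes the range runs from 0001-01-01 to 9999-12-31 in UTC; the exact bounds Spark enforces may differ slightly, so treat the numbers as a rough check rather than a specification.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MICRO = timedelta(microseconds=1)
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def epoch_micros(ts: datetime) -> int:
    """Exact microseconds since the Unix epoch (integer arithmetic only)."""
    return (ts - EPOCH) // ONE_MICRO


# Assumption: the calendar range is 0001-01-01 .. 9999-12-31 in UTC.
lo_nanos = epoch_micros(datetime.min.replace(tzinfo=timezone.utc)) * 1000
hi_nanos = epoch_micros(datetime.max.replace(tzinfo=timezone.utc)) * 1000 + 999

# Both ends fall outside the signed 64-bit range.
print(lo_nanos < INT64_MIN, hi_nanos > INT64_MAX)  # True True

# Digits needed for option 1, DECIMAL(p, 0).
precision = max(len(str(abs(lo_nanos))), len(str(abs(hi_nanos))))
print(precision)  # 21

# First and last whole microseconds that option 2 (BIGINT) can represent.
first = EPOCH + timedelta(microseconds=-(2**63 // 1000))
last = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
print(first.year, last.year)  # 1677 2262
```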

{code:sql}
SELECT unix_nanos(TIMESTAMP_LTZ '2008-12-25 15:30:00.123456789');
-- 1230219000123456789 (assuming the session time zone is UTC)
{code}
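
The expected value in the example depends on how the zone-less literal is resolved. The Python sketch below reproduces it for a UTC reading and shows how the same wall-clock text shifts under a different offset. The helper {{expected_unix_nanos}} is hypothetical, and the fixed UTC-08:00 offset is only an illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expected_unix_nanos(ts: datetime, sub_micro_nanos: int) -> int:
    """Nanoseconds since the epoch for an aware datetime plus 0..999 extra ns."""
    micros = (ts - EPOCH) // timedelta(microseconds=1)
    return micros * 1000 + sub_micro_nanos


# '2008-12-25 15:30:00.123456789' read as UTC.
utc_ts = datetime(2008, 12, 25, 15, 30, 0, 123456, tzinfo=timezone.utc)
print(expected_unix_nanos(utc_ts, 789))  # 1230219000123456789

# Same wall-clock text read at a fixed UTC-08:00 offset (illustrative only).
minus8 = timezone(timedelta(hours=-8))
local_ts = datetime(2008, 12, 25, 15, 30, 0, 123456, tzinfo=minus8)
print(expected_unix_nanos(local_ts, 789))  # 1230247800123456789
```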

h3. Implementation notes
* Mirror {{UnixMicros}} in {{datetimeExpressions.scala}}; add {{AnyTimestampNanoType}} to the
  accepted input types.
* Read the value from {{TimestampNanosVal}} ({{epochMicros}} and {{nanosWithinMicro}}).
* Interpreted + codegen paths; register in {{FunctionRegistry}}.

h3. Acceptance criteria
* {{unix_nanos}} available across SQL/Scala/Python/R.
* Round-trips with {{timestamp_nanos}} for in-range values.
* Return-type / overflow decision documented and tested.
* Unit tests + golden files added.
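
The round-trip criterion can be expressed as a small property check. The Python sketch below is a model only: {{split_nanos}} and {{join_nanos}} are hypothetical stand-ins for the two directions, and it assumes the nanosecond value is split with floor division so the sub-micro part stays in [0, 999] for pre-epoch instants as well.

```python
def split_nanos(unix_nanos: int) -> tuple[int, int]:
    """Split epoch nanoseconds into (epoch_micros, nanos_within_micro)."""
    # divmod floors, so the remainder is always in [0, 999], also before 1970.
    return divmod(unix_nanos, 1000)


def join_nanos(epoch_micros: int, nanos_within_micro: int) -> int:
    """Inverse direction: epochMicros * 1000 + nanosWithinMicro."""
    return epoch_micros * 1000 + nanos_within_micro


for n in (1230219000123456789, 0, -1, -1001):
    micros, sub = split_nanos(n)
    assert 0 <= sub <= 999
    assert join_nanos(micros, sub) == n

print(split_nanos(-1))  # (-1, 999)
```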

Parent: SPARK-56822


