Re: [PR] [SPARK-57579][PYTHON] Add PySpark support for unix_nanos function [spark]

via GitHub Sat, 20 Jun 2026 22:01:05 -0700


MaxGekk commented on code in PR #56626:
URL: https://github.com/apache/spark/pull/56626#discussion_r3447930380



##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -11749,6 +11749,54 @@ def unix_micros(col: "ColumnOrName") -> Column:
     return _invoke_function_over_columns("unix_micros", col)
 
 
+@_try_remote_functions
+def unix_nanos(col: "ColumnOrName") -> Column:

Review Comment:
   Minor: the `unix_*` family in this file is in alphabetical order (`unix_date`, 
`unix_micros`, `unix_millis`, `unix_seconds`), but `unix_nanos` is inserted 
between `unix_micros` and `unix_millis`. Placing it after `unix_millis` keeps 
that order and matches `__init__.py`, where the export is already correctly 
placed after `unix_millis`. The same applies to the Connect wrapper in 
`connect/functions/builtin.py`.
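
   For reference, a minimal sketch of where the new name sorts among the 
existing ones. The list holds only the four function names mentioned above, and 
the variable names are made up for illustration:

   ```python
   import bisect

   # The existing unix_* functions named in this comment, in file order.
   existing = ["unix_date", "unix_micros", "unix_millis", "unix_seconds"]

   # Index at which "unix_nanos" keeps the list sorted.
   position = bisect.bisect_left(existing, "unix_nanos")

   ordered = existing[:position] + ["unix_nanos"] + existing[position:]
   print(ordered)
   ```

   The new name lands right after `unix_millis` and before `unix_seconds`.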



##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -11749,6 +11749,54 @@ def unix_micros(col: "ColumnOrName") -> Column:
     return _invoke_function_over_columns("unix_micros", col)
 
 
+@_try_remote_functions
+def unix_nanos(col: "ColumnOrName") -> Column:
+    """Returns the number of nanoseconds since 1970-01-01 00:00:00 UTC as ``DECIMAL(21, 0)``.
+    Only supported for ``TIMESTAMP_LTZ(p)`` and ``TIMESTAMP_NTZ(p)`` with precision ``p``
+    in ``[7, 9]``.
+
+    .. versionadded:: 4.3.0
+
+    Parameters
+    ----------
+    col : :class:`~pyspark.sql.Column` or column name
+        input column of nanosecond-precision timestamp values to convert.
+
+    Returns
+    -------
+    :class:`~pyspark.sql.Column`
+        the number of nanoseconds since 1970-01-01 00:00:00 UTC as ``DECIMAL(21, 0)``.
+
+    See Also
+    --------
+    :meth:`pyspark.sql.functions.unix_date`
+    :meth:`pyspark.sql.functions.unix_seconds`
+    :meth:`pyspark.sql.functions.unix_millis`
+    :meth:`pyspark.sql.functions.unix_micros`
+
+    Examples
+    --------
+    >>> import pyspark.sql.functions as sf

Review Comment:
   Both doctests below use nanosecond-precision timestamp types (`TIMESTAMP_NTZ 
'…123456789'` and `cast('timestamp_ntz(9)')`), which only exist when 
`spark.sql.timestampNanosTypes.enabled=true`. That flag defaults to `false` in 
production (its default is `Utils.isTesting`, so it is on only under tests). 
The doctests therefore pass in CI, but a user running the rendered example in a 
default session hits an error: with the flag off, the 9-digit literal isn't a 
nanos type, and `UnixNanos` (`inputTypes = AnyTimestampNanoType`) rejects a 
micros timestamp at analysis.
   
   The Scala `UnixNanos` example handles this by prefixing `SET 
spark.sql.timestampNanosTypes.enabled=true`. Suggest doing the equivalent here 
so the example is reproducible:
   ```suggestion
       >>> import pyspark.sql.functions as sf
       >>> spark.conf.set("spark.sql.timestampNanosTypes.enabled", True)
   ```
   (and `>>> spark.conf.unset("spark.sql.timestampNanosTypes.enabled")` at the 
end of the Examples block).
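
   If the set/unset pair is needed in more than one place, it could also be 
wrapped in a small context manager. This is only a sketch, not something the PR 
contains: `nanos_types_enabled` and `DictConf` are hypothetical names, and 
`DictConf` is an in-memory stand-in so the snippet runs without a Spark 
session. With a real session, `spark.conf` would be passed instead.

   ```python
   from contextlib import contextmanager

   NANOS_FLAG = "spark.sql.timestampNanosTypes.enabled"


   @contextmanager
   def nanos_types_enabled(conf):
       """Turn the flag on for the duration of a block, then unset it.

       `conf` is any object exposing set(key, value) and unset(key),
       for example `spark.conf` in a PySpark session.
       """
       conf.set(NANOS_FLAG, "true")
       try:
           yield conf
       finally:
           # Runs even if the body raises, so the flag never stays set.
           conf.unset(NANOS_FLAG)


   class DictConf:
       """Hypothetical stand-in for spark.conf, used only to run this sketch."""

       def __init__(self):
           self.values = {}

       def set(self, key, value):
           self.values[key] = value

       def unset(self, key):
           self.values.pop(key, None)


   conf = DictConf()
   with nanos_types_enabled(conf):
       inside = dict(conf.values)  # snapshot while the flag is set
   after = dict(conf.values)       # snapshot after the flag is unset
   ```

   Note that `unset` returns the flag to its default rather than to whatever 
value it had before, which is the same behavior as the plain `set`/`unset` pair 
suggested above.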


