(superset) 01/01: fix: coerce out-of-bounds nanosecond timestamps to NaT instead of raising

elizabeth Thu, 14 May 2026 14:47:45 -0700

This is an automated email from the ASF dual-hosted git repository.

eschutho pushed a commit to branch fix/out-of-bounds-nanosecond-timestamp
in repository https://gitbox.apache.org/repos/asf/superset.git


commit 44bbc980d30db3f1380580a17e6bddd39609d4a2
Author: Elizabeth Thompson <[email protected]>
AuthorDate: Thu May 14 21:46:35 2026 +0000

    fix: coerce out-of-bounds nanosecond timestamps to NaT instead of raising
    
    Dates beyond ~2262-04-11 overflow pandas' int64 nanosecond representation.
    When a query returns a timezone-aware datetime column containing such a value
    (e.g. a sentinel 'far future' date like 3118-01-01), pd.to_datetime() raises
    OutOfBoundsDatetime. In result_set.py the broad except logs at ERROR level,
    causing the error to surface repeatedly in observability tooling for any
    auto-refreshing chart that queries the affected table.
    
    Fix: pass errors="coerce" so out-of-range timestamps become NaT (null) rather
    than raising. The same treatment is applied to the dataset importer path as a
    belt-and-suspenders fix; NaT is preferable to an unhandled exception during
    import, though a PR comment should note the data-quality tradeoff.
    
    Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
---
 superset/commands/dataset/importers/v1/utils.py | 2 +-
 superset/result_set.py                          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/superset/commands/dataset/importers/v1/utils.py b/superset/commands/dataset/importers/v1/utils.py
index f96717c0bbe..4aef8a7aec2 100644
--- a/superset/commands/dataset/importers/v1/utils.py
+++ b/superset/commands/dataset/importers/v1/utils.py
@@ -215,7 +215,7 @@ def load_data(data_uri: str, dataset: SqlaTable, database: Database) -> None:
     # convert temporal columns
     for column_name, sqla_type in dtype.items():
         if isinstance(sqla_type, (Date, DateTime)):
-            df[column_name] = pd.to_datetime(df[column_name])
+            df[column_name] = pd.to_datetime(df[column_name], errors="coerce")
 
     # reuse session when loading data if possible, to make import atomic
     if database.sqlalchemy_uri == app.config.get("SQLALCHEMY_DATABASE_URI"):
diff --git a/superset/result_set.py b/superset/result_set.py
index 13446d4c33e..efda04b40e8 100644
--- a/superset/result_set.py
+++ b/superset/result_set.py
@@ -207,7 +207,7 @@ class SupersetResultSet:
                             if sample.tzinfo:
                                 tz = sample.tzinfo
                                 series = pd.Series(array[column])
-                                series = pd.to_datetime(series, utc=True)
+                                series = pd.to_datetime(series, utc=True, errors="coerce")
                                 pa_data[i] = pa.Array.from_pandas(
                                     series,
                                     type=pa.timestamp("ns", tz=tz),
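
The ~2262-04-11 limit in the commit message follows directly from storing nanoseconds since the Unix epoch in a signed 64-bit integer. The sketch below uses only the standard library to show where that bound comes from and what the raise-versus-coerce choice looks like. The helper `to_epoch_ns` and its `errors` argument are hypothetical names chosen to mirror the option used in the patch; this is not how pandas implements the conversion, and `None` merely stands in for NaT.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest instant representable as int64 nanoseconds since the epoch,
# truncated to microseconds (the finest unit datetime supports).
MAX_NS_DATETIME = EPOCH + timedelta(microseconds=INT64_MAX // 1000)


def to_epoch_ns(value: datetime, errors: str = "raise") -> Optional[int]:
    """Hypothetical helper: convert an aware datetime to epoch nanoseconds.

    With errors="coerce", out-of-range values become None (standing in
    for NaT) instead of raising.
    """
    delta = value - EPOCH
    ns = (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1_000
    # The most negative int64 is excluded because it is reserved for NaT.
    if -INT64_MAX <= ns <= INT64_MAX:
        return ns
    if errors == "coerce":
        return None
    raise OverflowError(f"{value.isoformat()} does not fit in int64 nanoseconds")


sentinel = datetime(3118, 1, 1, tzinfo=timezone.utc)
print(MAX_NS_DATETIME.date())                    # 2262-04-11
print(to_epoch_ns(sentinel, errors="coerce"))    # None
```

The tradeoff noted in the commit message is visible here: the coerced result carries no trace of the original 3118-01-01 value, so downstream consumers see a null rather than a far-future date.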
