Re: [PR] fix(charts): handle PostgreSQL INTERVAL type in bar and pie charts [superset]

via GitHub Thu, 12 Feb 2026 16:07:39 -0800


sadpandajoe commented on code in PR #34513:
URL: https://github.com/apache/superset/pull/34513#discussion_r2801632065



##########
tests/unit_tests/db_engine_specs/test_postgres.py:
##########
@@ -280,3 +280,34 @@ def quote_table(table: Table, dialect: Dialect) -> str:
  LIMIT :param_1
     """.strip()
     )
+
+
+def test_interval_type_mutator() -> None:
+    """
+    DB Eng Specs (postgres): Test INTERVAL type mutator
+
+    INTERVAL values are converted to milliseconds so users can apply
+    the built-in "DURATION" number format for human-readable display.
+    """
+    mutator = spec.column_type_mutators[INTERVAL]
+
+    # Test timedelta conversion (most common case from psycopg2)
+    # Result is in milliseconds for compatibility with DURATION formatter
+    td = timedelta(days=1, hours=2, minutes=30, seconds=45)
+    assert mutator(td) == 95445000.0  # Total ms: (1*86400 + 2*3600 + 30*60 + 
45) * 1000
+
+    # Test numeric values (assumed to be seconds) are converted to milliseconds
+    assert mutator(12345) == 12345000.0
+    assert mutator(123.45) == 123450.0
+
+    # Test None returns 0 (for aggregations)
+    assert mutator(None) == 0
+
+    # Test string values pass through unchanged
+    # (PostgreSQL may return string representations in some cases)
+    assert mutator("1 day 02:30:45") == "1 day 02:30:45"
+    assert mutator("P1DT2H30M45S") == "P1DT2H30M45S"  # ISO 8601 duration
+
+    # Test other types pass through unchanged
+    assert mutator([1, 2, 3]) == [1, 2, 3]
+    assert mutator({"days": 1}) == {"days": 1}

Review Comment:
   Good coverage of the lambda itself. A few additions that would help catch 
integration issues and edge cases:
   
   **1. Integration test via `fetch_data`** (follows the pattern in 
`test_mysql.py:253`)
   
   This verifies the full wiring: `cursor.description` → type lookup → mutator 
application.
   
   ```python
   @pytest.mark.parametrize(
       "data,description,expected_result",
       [
           (
               [(timedelta(hours=1, minutes=30),)],
               [("duration", "interval")],
               [(5400000.0,)],  # 1.5 hours in ms
           ),
           (
               [(None,)],
               [("duration", "interval")],
               [(None,)],  # NULLs preserved
           ),
           (
               [(timedelta(hours=1), "label")],
               [("duration", "interval"), ("name", "varchar(50)")],
               [(3600000.0, "label")],  # Only INTERVAL column mutated
           ),
       ],
   )
   def test_interval_column_type_mutator(
       data: list[tuple[Any, ...]],
       description: list[tuple[str, str]],
       expected_result: list[tuple[Any, ...]],
   ) -> None:
       from unittest.mock import Mock
   
       cursor = Mock()
       cursor.description = description
       cursor.fetchall.return_value = data
       assert spec.fetch_data(cursor) == expected_result
   ```
   
   **2. Column spec test** — verify INTERVAL is classified as NUMERIC:
   
   ```python
   # Add to the existing test_get_column_spec parametrize list:
   ("INTERVAL", INTERVAL, None, GenericDataType.NUMERIC, False),
   ```
   
   **3. Edge cases for the mutator:**
   
   ```python
   # Zero duration
   assert mutator(timedelta(0)) == 0.0
   
   # Negative interval
   assert mutator(timedelta(days=-1)) == -86400000.0
   
   # Bool should not be treated as numeric
   assert mutator(True) is None
   assert mutator(False) is None
   ```



##########
superset/db_engine_specs/postgres.py:
##########
@@ -489,8 +490,30 @@ class PostgresEngineSpec(BasicParametersMixin, 
PostgresBaseEngineSpec):
             ENUM(),
             GenericDataType.STRING,
         ),
+        (
+            re.compile(r"^interval", re.IGNORECASE),
+            INTERVAL(),
+            GenericDataType.NUMERIC,
+        ),
     )
 
+    # PostgreSQL INTERVAL values need normalization for chart rendering.
+    # psycopg2 returns timedelta objects which we convert to milliseconds for
+    # numeric operations in bar/pie charts. Using milliseconds allows users to
+    # apply the built-in "DURATION" number format for human-readable display
+    # (e.g., "1d 2h 30m 45s"). String representations pass through unchanged.
+    column_type_mutators: dict[types.TypeEngine, Callable[[Any], Any]] = {
+        INTERVAL: lambda v: (
+            v.total_seconds() * 1000
+            if hasattr(v, "total_seconds")
+            else float(v) * 1000
+            if isinstance(v, (int, float))
+            else 0
+            if v is None
+            else v  # Pass through string representations and other types
+        ),
+    }

Review Comment:
   A few things in this lambda worth addressing:
   
   **1. `None` → `0` changes NULL semantics**
   NULL INTERVAL means "unknown/missing" — converting to `0` makes it 
indistinguishable from an actual zero-duration interval. It also means AVG 
aggregations on the frontend would include false zeros instead of excluding 
missing data. Returning `None` preserves the correct semantics.
   
   **2. `bool` leaks through the numeric branch**
   In Python, `bool` is a subclass of `int`, so `isinstance(True, (int, 
float))` is `True`. This means `True` → `1000.0` and `False` → `0.0`, which 
probably isn't intended for an INTERVAL column.
   
   **3. String/other pass-through creates mixed-type NUMERIC columns**
   Since the column type mapping above declares INTERVAL as 
`GenericDataType.NUMERIC`, passing through strings/lists/dicts unchanged could 
produce mixed-type columns that break chart rendering on the frontend. Might be 
safer to return `None` for anything that can't be converted to a number.
   
   Here's a version that addresses all three:
   
   ```python
   def _normalize_interval(v: Any) -> Any:
       """Convert PostgreSQL INTERVAL values to milliseconds for the DURATION 
formatter."""
       if v is None:
           return None
       if hasattr(v, "total_seconds"):
           return v.total_seconds() * 1000
       if isinstance(v, (int, float)) and not isinstance(v, bool):
           return float(v) * 1000
       return None  # Can't convert to numeric — treat as missing
   
   column_type_mutators: dict[types.TypeEngine, Callable[[Any], Any]] = {
       INTERVAL: _normalize_interval,
   }
   ```
   
   This also makes it easier to test individual branches and add edge cases 
later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] fix(charts): handle PostgreSQL INTERVAL type in bar and pie charts [superset]

Reply via email to