codeant-ai-for-open-source[bot] commented on code in PR #37014:
URL: https://github.com/apache/superset/pull/37014#discussion_r3408751528


##########
superset/models/helpers.py:
##########
@@ -2762,19 +2780,47 @@ def get_time_filter(  # pylint: 
disable=too-many-arguments
             )
         )
 
+        # Apply timezone conversion for time filter boundaries
+        # This converts user's local time boundaries to UTC for querying 
UTC-stored data
+        adjusted_start, adjusted_end = start_dttm, end_dttm
+        dataset_timezone = self.get_dataset_timezone()
+
+        if dataset_timezone and (start_dttm or end_dttm):
+            try:
+                tz = ZoneInfo(dataset_timezone)
+
+                # The datetimes from the UI are naive (no timezone info)
+                # We interpret them as being in the dataset's configured 
timezone
+                # and convert them to UTC for comparison with UTC-stored data
+                if start_dttm:
+                    local_start = start_dttm.replace(tzinfo=tz)
+                    adjusted_start = 
local_start.astimezone(timezone.utc).replace(
+                        tzinfo=None
+                    )
+                if end_dttm:
+                    local_end = end_dttm.replace(tzinfo=tz)
+                    adjusted_end = local_end.astimezone(timezone.utc).replace(
+                        tzinfo=None
+                    )

Review Comment:
   **Suggestion:** Using `.replace(tzinfo=tz)` forcibly rewrites timezone info 
and is incorrect when `start_dttm`/`end_dttm` are already timezone-aware (for 
example, ISO8601 inputs with offsets). In that case the original offset is 
discarded and boundaries are shifted incorrectly. Only attach `tz` to naive 
datetimes; for aware datetimes convert with `.astimezone(...)` directly. [logic 
error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Charts with ISO8601 time ranges can use shifted bounds.
   - ⚠️ Time-series metrics may misalign with requested date intervals.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure a SQL dataset with a valid IANA timezone in its extra JSON, for 
example
   `{"timezone": "Europe/Berlin"}`, so that `extra_dict` (in
   `superset/connectors/sqla/models.py:1565-17`) exposes `extra["timezone"]` and
   `get_dataset_timezone()` in `superset/models/helpers.py:19-31` returns 
`"Europe/Berlin"`.
   
   2. Build a chart on this dataset and, in the Explore UI, set a time range 
using explicit
   ISO8601 datetimes with timezone offsets, e.g. `"2026-01-10T00:00:00+00:00 :
   2026-01-11T00:00:00+00:00"`. On the backend `Viz.query_obj()` in 
`superset/viz.py:380-60`
   calls `get_since_until` in `superset/utils/date_parser.py:388-220`, which 
uses
   `parse_human_datetime`/`dateutil.parse` so `from_dttm` and `to_dttm` become 
timezone-aware
   `datetime` instances (with `tzinfo` from the ISO8601 offset).
   
   3. When the dataset builds SQL, `BaseDatasourceMixin` logic in
   `superset/models/helpers.py:3300-74` invokes 
`self.get_time_filter(time_col=dttm_col,
   start_dttm=from_dttm, end_dttm=to_dttm, template_processor=...)`; inside 
`get_time_filter`
   at `superset/models/helpers.py:2788-105`, `dataset_timezone` is truthy and 
the code
   executes `local_start = start_dttm.replace(tzinfo=tz)` / `local_end =
   end_dttm.replace(tzinfo=tz)` even though `start_dttm`/`end_dttm` already 
have a `tzinfo`.
   
   4. For such timezone-aware inputs, `.replace(tzinfo=tz)` discards the 
original offset
   information and reinterprets the wall-clock time in the dataset's timezone 
before
   converting to UTC (via `.astimezone(timezone.utc)` and `dttm_sql_literal` in
   `superset/models/helpers.py:33-61), shifting the actual UTC bounds and 
causing the
   generated WHERE clause to use a different time window than the 
user-specified ISO8601
   range.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=ec9f34b0f51045f0b4680081d889ada1&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=ec9f34b0f51045f0b4680081d889ada1&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/models/helpers.py
   **Line:** 2795:2804
   **Comment:**
        *Logic Error: Using `.replace(tzinfo=tz)` forcibly rewrites timezone 
info and is incorrect when `start_dttm`/`end_dttm` are already timezone-aware 
(for example, ISO8601 inputs with offsets). In that case the original offset is 
discarded and boundaries are shifted incorrectly. Only attach `tz` to naive 
datetimes; for aware datetimes convert with `.astimezone(...)` directly.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=7f78434d900506ee3ffe1ebad1cbd8f3d7456668badd905dc24e02a5b1d952e9&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=7f78434d900506ee3ffe1ebad1cbd8f3d7456668badd905dc24e02a5b1d952e9&reaction=dislike'>👎</a>



##########
superset/utils/core.py:
##########
@@ -2037,8 +2041,28 @@ def normalize_dttm_col(
 
         _process_datetime_column(df, _col)
 
-        if _col.offset:
+        if _col.timezone:
+            try:
+                tz = ZoneInfo(_col.timezone)
+                # Data is stored in UTC, convert to the dataset's configured 
timezone
+                # First make the datetime UTC-aware, then convert to target 
timezone
+                series = df[_col.col_label]
+                if not series.empty and series.notna().any():
+                    # Convert UTC to target timezone
+                    df[_col.col_label] = (
+                        series.dt.tz_localize("UTC")
+                        .dt.tz_convert(tz)

Review Comment:
   **Suggestion:** The timezone conversion assumes every parsed datetime series 
is timezone-naive and always calls `tz_localize("UTC")`. If the column is 
already timezone-aware (for example values coming from backends that return 
tz-aware datetimes), pandas raises `TypeError: Already tz-aware, use tz_convert 
to convert`, breaking chart rendering. Branch on timezone-awareness first 
(`tz_convert` for aware series, `tz_localize` for naive series). [type error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Chart data requests fail for tz-aware timestamp columns.
   - ❌ Dashboards using `get_df` with dataset timezone crash.
   - ⚠️ Any datasource using `normalize_dttm_col` breaks on tz-aware data.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure a dataset with a timezone in `extra_dict["timezone"]` (e.g.
   `"Europe/Berlin"`), so that `get_dataset_timezone()` in
   `superset/models/helpers.py:1158-15` returns a non-null IANA timezone.
   
   2. Query that dataset through a chart (e.g. `/api/v1/chart/<pk>/data` 
handled by
   `ChartDataRestApi.get_data` in `superset/charts/data/api.py:78-117`), which 
builds a
   `QueryObject` and executes it via the datasource mixin, eventually producing 
a pandas
   DataFrame `df` with a datetime column (typically `DTTM_ALIAS`) that is 
already tz-aware
   (e.g. `datetime64[ns, UTC]`).
   
   3. The chart pipeline calls `normalize_dttm_col` with this column, either 
from
   `Viz.get_df` in `superset/viz.py:289-40` or from the datasource path in
   `superset/models/helpers.py:1616-31`, passing a `DateColumn` where 
`timezone` is set from
   `get_dataset_timezone()`.
   
   4. Inside `normalize_dttm_col` in `superset/utils/core.py:111-158`, after
   `_process_datetime_column` returns, the timezone branch executes `series =
   df[_col.col_label]` and then `df[_col.col_label] =
   (series.dt.tz_localize("UTC").dt.tz_convert(tz).dt.tz_localize(None))` 
(lines ~2051-2054
   in the diff); because `series` is already tz-aware, 
`series.dt.tz_localize("UTC")` raises
   `TypeError: Already tz-aware, use tz_convert to convert`, causing the chart 
data request
   to error instead of returning results.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=a9c870fffe3c4973851153c209f5422e&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=a9c870fffe3c4973851153c209f5422e&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/core.py
   **Line:** 2051:2054
   **Comment:**
        *Type Error: The timezone conversion assumes every parsed datetime 
series is timezone-naive and always calls `tz_localize("UTC")`. If the column 
is already timezone-aware (for example values coming from backends that return 
tz-aware datetimes), pandas raises `TypeError: Already tz-aware, use tz_convert 
to convert`, breaking chart rendering. Branch on timezone-awareness first 
(`tz_convert` for aware series, `tz_localize` for naive series).
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=cd89d9cfc1db366eb40d7eaccf892355dab854cde701e9490943617d54578eeb&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=cd89d9cfc1db366eb40d7eaccf892355dab854cde701e9490943617d54578eeb&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to