codeant-ai-for-open-source[bot] commented on code in PR #37014:
URL: https://github.com/apache/superset/pull/37014#discussion_r3432227648
##########
superset/models/helpers.py:
##########
@@ -2794,19 +2812,47 @@ def get_time_filter( # pylint:
disable=too-many-arguments
)
)
+ # Apply timezone conversion for time filter boundaries
+ # This converts user's local time boundaries to UTC for querying
UTC-stored data
+ adjusted_start, adjusted_end = start_dttm, end_dttm
+ dataset_timezone = self.get_dataset_timezone()
+
+ if dataset_timezone and (start_dttm or end_dttm):
+ try:
+ tz = ZoneInfo(dataset_timezone)
+
+ # The datetimes from the UI are naive (no timezone info)
+ # We interpret them as being in the dataset's configured
timezone
+ # and convert them to UTC for comparison with UTC-stored data
+ if start_dttm:
+ local_start = start_dttm.replace(tzinfo=tz)
+ adjusted_start =
local_start.astimezone(timezone.utc).replace(
+ tzinfo=None
+ )
+ if end_dttm:
+ local_end = end_dttm.replace(tzinfo=tz)
+ adjusted_end = local_end.astimezone(timezone.utc).replace(
Review Comment:
**Suggestion:** After conversion to UTC, the code strips timezone info
(`tzinfo=None`). Later, epoch-formatted columns (`epoch_s`/`epoch_ms`) are
rendered via `dttm.timestamp()`, and naive datetimes are interpreted in
server-local timezone, producing wrong epoch literals on non-UTC hosts. Keep
the UTC tzinfo attached (or compute epoch from an explicitly UTC-aware
datetime) to avoid server-timezone-dependent drift. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Epoch-based datetime columns produce shifted time filters.
- ⚠️ Dashboards on affected datasets show incorrect time ranges.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a dataset column with `python_date_format` set to `"epoch_s"`
or `"epoch_ms"`
(validated as allowed in `superset/datasets/schemas.py:60-91`), so that
`TableColumn.python_date_format` reflects one of these epoch formats used in
filtering.
2. Set a dataset-level `timezone` in its `extra` JSON so that
`BaseDatasourceMixin.get_dataset_timezone`
(`superset/models/helpers.py:1159-1172`)
returns a non-null IANA name, enabling the timezone-adjustment branch in
`get_time_filter`.
3. Run a chart or dashboard query with a time range filter for this column,
causing
`get_sqla_query` in `superset/models/helpers.py:3360-3439` to call
`self.get_time_filter(time_col=..., start_dttm=from_dttm, end_dttm=to_dttm,
...)`, where
`get_time_filter` converts the local bounds to UTC and then strips the
timezone via
`.astimezone(timezone.utc).replace(tzinfo=None)` at lines 2829-2834.
4. When building the SQL predicate, `dttm_sql_literal`
(`superset/models/helpers.py:2764-2792`) sees `tf in {"epoch_ms","epoch_s"}`
and computes
`seconds_since_epoch = int(dttm.timestamp())` on the now-naive datetime;
Python interprets
naive datetimes in the server's local timezone, so on non-UTC hosts the
generated epoch
seconds/milliseconds for the filter bounds are offset from true UTC,
returning incorrect
rows for time-filtered queries.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=3d1234503869474a917a03fb7d5071f7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=3d1234503869474a917a03fb7d5071f7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2829:2834
**Comment:**
*Logic Error: After conversion to UTC, the code strips timezone info
(`tzinfo=None`). Later, epoch-formatted columns (`epoch_s`/`epoch_ms`) are
rendered via `dttm.timestamp()`, and naive datetimes are interpreted in
server-local timezone, producing wrong epoch literals on non-UTC hosts. Keep
the UTC tzinfo attached (or compute epoch from an explicitly UTC-aware
datetime) to avoid server-timezone-dependent drift.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=e29946471cf7f8104be115ccbb7910395d2a5921f252385f021fa53afd4f5b52&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=e29946471cf7f8104be115ccbb7910395d2a5921f252385f021fa53afd4f5b52&reaction=dislike'>👎</a>
##########
superset/models/helpers.py:
##########
@@ -2794,19 +2812,47 @@ def get_time_filter( # pylint:
disable=too-many-arguments
)
)
+ # Apply timezone conversion for time filter boundaries
+ # This converts user's local time boundaries to UTC for querying
UTC-stored data
+ adjusted_start, adjusted_end = start_dttm, end_dttm
+ dataset_timezone = self.get_dataset_timezone()
+
+ if dataset_timezone and (start_dttm or end_dttm):
+ try:
+ tz = ZoneInfo(dataset_timezone)
Review Comment:
**Suggestion:** The timezone value from dataset `extra` is used directly in
`ZoneInfo(...)`, but only `ZoneInfoNotFoundError` is handled. If
`extra["timezone"]` is a non-string JSON value (number, object, list),
`ZoneInfo` raises `TypeError` and query generation fails with a 500. Validate
that the timezone is a string before calling `ZoneInfo`, or also catch
`TypeError` and fall back gracefully. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Time-filtered queries crash for datasets with bad timezone.
- ⚠️ Dashboards using such datasets return HTTP 500 errors.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure a SQLA dataset so that its `extra` JSON contains a non-string
`timezone`
value (e.g. `{\"timezone\": 123}`), which is parsed by
`SqlaTable.extra_dict` at
`superset/connectors/sqla/models.py:1577-1581`.
2. Access a chart or dashboard backed by this dataset with a time range
filter so that a
`QueryObject` with non-null `from_dttm`/`to_dttm` is built and passed into
`BaseDatasource.get_sqla_query` in `superset/models/helpers.py:3360-3439`.
3. Inside `get_sqla_query`, the datasource calls `self.get_time_filter(...)`
at
`superset/models/helpers.py:3385-3390` with `start_dttm` and/or `end_dttm`
set.
4. In `get_time_filter` (`superset/models/helpers.py:2794-2818`),
`dataset_timezone =
self.get_dataset_timezone()` returns the non-string value from `extra_dict`,
and the
branch `if dataset_timezone and (start_dttm or end_dttm):` then executes `tz
=
ZoneInfo(dataset_timezone)` at lines 2820-2822, raising a `TypeError`
because `ZoneInfo`
expects a string key, causing the query generation request to fail with a
500 error.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=0ac549dc4a804de4a454448b798cf1ba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=0ac549dc4a804de4a454448b798cf1ba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2820:2822
**Comment:**
*Type Error: The timezone value from dataset `extra` is used directly
in `ZoneInfo(...)`, but only `ZoneInfoNotFoundError` is handled. If
`extra["timezone"]` is a non-string JSON value (number, object, list),
`ZoneInfo` raises `TypeError` and query generation fails with a 500. Validate
that the timezone is a string before calling `ZoneInfo`, or also catch
`TypeError` and fall back gracefully.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=93916e227312263162addd2e9aaa6cacc05ac2a9e3113e9576e9f0cfcfc2ba55&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F37014&comment_hash=93916e227312263162addd2e9aaa6cacc05ac2a9e3113e9576e9f0cfcfc2ba55&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]