codeant-ai-for-open-source[bot] commented on code in PR #41060:
URL: https://github.com/apache/superset/pull/41060#discussion_r3415509196


##########
superset/models/helpers.py:
##########
@@ -1564,6 +1565,49 @@ def exc_query(self, qry: Any) -> QueryResult:
         """
         return self.query(qry)
 
+    def _python_date_format(self, column: str | None) -> str | None:
+        """Return the column's configured ``python_date_format`` (e.g. ``epoch_s``
+        or a strftime pattern), or ``None`` if the column declares no format."""
+        if not hasattr(self, "get_column"):
+            return None
+        column_obj = self.get_column(column)
+        if (
+            column_obj
+            and hasattr(column_obj, "python_date_format")
+            and (formatter := column_obj.python_date_format)
+        ):
+            return str(formatter)

Review Comment:
   **Suggestion:** `_python_date_format` only reads `python_date_format` via 
attribute access, but `_collect_dttm_labels` explicitly supports `dict` columns 
in `_is_dttm`. If `get_column()` returns a mapping-style column (as allowed by 
existing typing), this method always returns `None`, so raw temporal columns 
with configured formats are skipped and remain unnormalized. Handle both object 
and dict column metadata consistently. [api mismatch]
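
One way to handle both shapes is sketched below. This is only an illustration of the suggestion, not the PR's actual fix: `read_python_date_format` is a hypothetical standalone helper, and it assumes that mapping-style columns expose the format under a `"python_date_format"` key, as the comment describes.

```python
from collections.abc import Mapping
from typing import Any, Optional


def read_python_date_format(column_obj: Any) -> Optional[str]:
    """Hypothetical helper: read the format from dict- or object-style metadata."""
    if column_obj is None:
        return None
    if isinstance(column_obj, Mapping):
        # Mapping-style column metadata: look the format up by key.
        formatter = column_obj.get("python_date_format")
    else:
        # Object-style column: fall back to None if the attribute is missing.
        formatter = getattr(column_obj, "python_date_format", None)
    # Treat empty strings and None alike as "no format configured".
    return str(formatter) if formatter else None
```

Inside `_python_date_format`, the result of `self.get_column(column)` could be passed through a helper like this, so the lookup follows the same dict-or-object branching that `_is_dttm` already uses.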
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Raw temporal columns from dict-based datasources stay unnormalized.
   - ⚠️ Configured python_date_format ignored for affected temporal columns.
   - ⚠️ Charts using such datasources show wrong datetime values.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Use any ExploreMixin-based datasource (e.g. a subclass of `ExploreMixin` in
   `superset/models/helpers.py:1058-195`) that implements `get_column()` and can
   return mapping-style column metadata (dicts) consistent with `DatasetColumnData`
   (`superset/superset_typing.py:40-12`, which includes `is_dttm` and
   `python_date_format` keys). Configure a temporal column whose metadata dict has
   `"is_dttm": True` and `"python_date_format": "epoch_s"` (schema validated in
   `superset/datasets/schemas.py:82-91`).
   
   2. Execute a query for this datasource via the shared query pipeline by calling
   `get_query_result()` on the datasource (`superset/models/helpers.py:164-211`)
   with a `QueryObject` that selects this temporal column in raw-records mode (so
   it appears in `query_object.columns` but is not used as the base time axis or
   `granularity`).
   
   3. Inside `get_query_result()`, after the SQL is run and a `DataFrame` is
   obtained, `self.normalize_df(df, query_object)` is invoked
   (`superset/models/helpers.py:181-185`). `normalize_df()` first calls
   `self._collect_dttm_labels(query_object)` (`superset/models/helpers.py:121-122`).
   In `_collect_dttm_labels()` (`superset/models/helpers.py:83-110`),
   `_is_dttm(label)` reads `self.get_column(label)` and, because it explicitly
   supports dicts (`col.get("is_dttm") if isinstance(col, dict) else col.is_dttm`
   at lines 90-93), the dict-based column is correctly recognized as temporal and
   treated as a candidate for normalization.
   
   4. Still in `_collect_dttm_labels()`, when building `raw_labels`
   (`superset/models/helpers.py:103-109`), the comprehension requires that
   `self._python_date_format(label)` be truthy. `_python_date_format()`
   (`superset/models/helpers.py:169-181`, diff lines 1568-1580) calls
   `self.get_column(label)` again, but only checks `hasattr(column_obj,
   "python_date_format")` and then `column_obj.python_date_format`. For a
   dict-based column, `hasattr(...)` is false, so `_python_date_format()` returns
   `None`. Consequently `self._python_date_format(label)` is falsy, the temporal
   label is dropped from `raw_labels`, and no `DateColumn` is created for it in
   `normalize_df()`. The column's epoch values remain unnormalized in the returned
   `DataFrame` despite the configured `python_date_format`, which breaks datetime
   formatting for such datasources.
   ```
   </details>
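
The mismatch in steps 3 and 4 can be reproduced in isolation with a small stand-in class. This is a hedged sketch rather than Superset code: `FakeDatasource` and its method names are hypothetical, and the two methods only mirror the dict-aware `_is_dttm` check and the attribute-only lookup quoted above.

```python
class FakeDatasource:
    """Hypothetical stand-in for a datasource; not Superset's ExploreMixin."""

    def __init__(self, columns):
        self._columns = columns

    def get_column(self, name):
        return self._columns.get(name)

    def is_dttm(self, label):
        # Mirrors the dict-aware check quoted in step 3.
        col = self.get_column(label)
        if col is None:
            return False
        return bool(col.get("is_dttm") if isinstance(col, dict) else col.is_dttm)

    def python_date_format(self, label):
        # Mirrors the attribute-only lookup from the diff under review.
        column_obj = self.get_column(label)
        if (
            column_obj
            and hasattr(column_obj, "python_date_format")
            and (formatter := column_obj.python_date_format)
        ):
            return str(formatter)
        return None


# A dict-based temporal column with a configured epoch format.
ds = FakeDatasource({"ts": {"is_dttm": True, "python_date_format": "epoch_s"}})

# Same shape as the filter in step 4: temporal AND has a configured format.
raw_labels = [
    label
    for label in ["ts"]
    if ds.is_dttm(label) and ds.python_date_format(label)
]
```

Here `ds.is_dttm("ts")` is true while `ds.python_date_format("ts")` is `None`, because `hasattr` on a plain `dict` does not see its keys, so `raw_labels` ends up empty.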
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

