codeant-ai-for-open-source[bot] commented on code in PR #37593:
URL: https://github.com/apache/superset/pull/37593#discussion_r2752115589
##########
superset/common/query_context_processor.py:
##########
@@ -492,6 +512,14 @@ def get_viz_annotation_data( # noqa: C901
if time_grain_sqla := overrides.get("time_grain_sqla"):
for query_object in query_context.queries:
query_object.extras["time_grain_sqla"] =
time_grain_sqla
+ # Remove time grain values from columns list to
prevent them
+ # from being treated as physical column names. Time
grains
+ # (P1D, PT1H, etc.) should only exist in extras
metadata.
+ query_object.columns = [
+ col
+ for col in query_object.columns
+ if not self._is_time_grain_value(col)
Review Comment:
**Suggestion:** Inside a static method, the code calls
`_is_time_grain_value` via `self`, but `self` is not defined in a
`@staticmethod`, which will raise a `NameError` at runtime when this path is
executed. [type error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Annotation rendering raises NameError, causing 500 errors.
- ⚠️ Chart data fetch path (get_annotation_data) blocked.
- ⚠️ Dashboards with such annotations fail to load.
```
</details>
```suggestion
if not
QueryContextProcessor._is_time_grain_value(col)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Create or obtain a QueryObject with annotation layers that include a
"line" or "table"
layer and an overrides dict containing a time_grain_sqla value. Place that
QueryObject
into a QueryContext used to instantiate QueryContextProcessor.
- Files/locations: construct QueryContext and QueryObject used by
superset/common/query_context_processor.py:get_annotation_data (call site
at ~lines
394-405 in that file).
2. Call QueryContextProcessor.get_annotation_data(query_obj)
(superset/common/query_context_processor.py:394-405). That method iterates
annotation
layers and calls self.get_viz_annotation_data(annotation_layer, ...).
3. Execution enters get_viz_annotation_data (function header around
superset/common/query_context_processor.py:502). The overrides branch is
taken because
overrides contains time_grain_sqla.
4. Inside the loop over query_context.queries, the code executes the list
comprehension at
superset/common/query_context_processor.py:518-522 which refers to
self._is_time_grain_value. Because get_viz_annotation_data is decorated as
@staticmethod
(no bound self), the name self is undefined and Python raises NameError:
name 'self' is
not defined at that line.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/common/query_context_processor.py
**Line:** 521:521
**Comment:**
*Type Error: Inside a static method, the code calls
`_is_time_grain_value` via `self`, but `self` is not defined in a
`@staticmethod`, which will raise a `NameError` at runtime when this path is
executed.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/common/query_context_processor.py:
##########
@@ -404,6 +404,26 @@ def get_annotation_data(self, query_obj: QueryObject) ->
dict[str, Any]:
)
return annotation_data
+ @staticmethod
+ def _is_time_grain_value(column: Any) -> bool:
+ """
+ Check if a column value is a time grain identifier (e.g., P1D, PT1H).
+ Time grain values follow ISO 8601 duration format and should never be
+ treated as physical column names.
+
+ :param column: Column value to check
+ :return: True if the value is a time grain identifier
+ """
+ if not isinstance(column, str):
+ return False
+ # Match ISO 8601 duration format: P[n]Y/M/W/D or PT[n]H/M/S
+ # Also match week-with-epoch formats like "1969-12-28T00:00:00Z/P1W"
+ return bool(
+ re.match(r"^(P(\d+(\.\d+)?[YMWD])+|PT(\d+(\.\d+)?[HMS])+)", column)
+ or re.match(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z/P\d+[YMWD]",
column)
+ or re.match(r"^P\d+[YMWD]/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z",
column)
Review Comment:
**Suggestion:** The time-grain detection regexes use `re.match`, which only
anchors at the start of the string, so any column whose name merely starts with
a duration (e.g. `P1D_sales`) will be misclassified as a time grain and removed
from the columns list, potentially breaking queries that rely on such columns.
[logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Dataset columns dropped unexpectedly from annotation queries.
- ⚠️ Annotation queries return incorrect or incomplete data.
- ⚠️ Users see missing column errors during annotation rendering.
```
</details>
```suggestion
re.fullmatch(r"^(P(\d+(\.\d+)?[YMWD])+|PT(\d+(\.\d+)?[HMS])+)",
column)
or re.fullmatch(
r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z/P\d+[YMWD]", column
)
or re.fullmatch(
r"^P\d+[YMWD]/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", column
)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Prepare a chart whose query_context contains a QueryObject with a column
whose name
begins with a duration-like prefix, for example "P1D_sales". This column
appears in
query_object.columns used by the chart's QueryContext.
2. Add an annotation layer of type "line" or "table" that supplies an
overrides dict with
"time_grain_sqla" so the code path in get_viz_annotation_data
(superset/common/query_context_processor.py:502 and specifically the filter
at 518-522)
runs.
3. When processing overrides, the comprehension calls
QueryContextProcessor._is_time_grain_value(column) (definition uses the
regex at
superset/common/query_context_processor.py:407-425). Because re.match is
used, "P1D_sales"
matches the pattern (match at start) and is treated as a time grain.
4. The column "P1D_sales" is removed from query_object.columns
(superset/common/query_context_processor.py:518-522), altering the query and
potentially
causing missing-column errors or incorrect results downstream.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/common/query_context_processor.py
**Line:** 408:424
**Comment:**
*Logic Error: The time-grain detection regexes use `re.match`, which
only anchors at the start of the string, so any column whose name merely starts
with a duration (e.g. `P1D_sales`) will be misclassified as a time grain and
removed from the columns list, potentially breaking queries that rely on such
columns.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/common/query_context_processor.py:
##########
@@ -492,6 +512,14 @@ def get_viz_annotation_data( # noqa: C901
if time_grain_sqla := overrides.get("time_grain_sqla"):
for query_object in query_context.queries:
query_object.extras["time_grain_sqla"] =
time_grain_sqla
Review Comment:
**Suggestion:** The code assigns into `query_object.extras` assuming it is a
dictionary, but if `extras` is `None` (as allowed by the `QueryObject` type),
this will raise a `TypeError` (`'NoneType' object does not support item
assignment`) when applying time grain overrides. [type error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Annotation override application raises TypeError.
- ⚠️ Annotation queries fail, causing chart errors.
- ⚠️ Dashboard updates involving such annotations interrupted.
```
</details>
```suggestion
if query_object.extras is None:
query_object.extras = {}
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Obtain a Chart whose Chart.get_query_context() returns a QueryContext
containing
QueryObject instances that were constructed without an extras dict (i.e.,
query_object.extras is None). QueryObject instances flow into
query_context.queries used
by get_viz_annotation_data (superset/common/query_context_processor.py).
2. Add an annotation layer of type "line" or "table" whose overrides include
"time_grain_sqla", so the branch in get_viz_annotation_data that assigns into
query_object.extras runs (assignment at
superset/common/query_context_processor.py:514).
3. When the code executes query_object.extras["time_grain_sqla"] =
time_grain_sqla at line
514, Python raises TypeError: 'NoneType' object does not support item
assignment if extras
is None.
4. The TypeError propagates up and is converted into a
QueryObjectValidationError or an
internal server error, causing annotation processing to fail for that chart.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/common/query_context_processor.py
**Line:** 514:514
**Comment:**
*Type Error: The code assigns into `query_object.extras` assuming it is
a dictionary, but if `extras` is `None` (as allowed by the `QueryObject` type),
this will raise a `TypeError` (`'NoneType' object does not support item
assignment`) when applying time grain overrides.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
tests/integration_tests/annotation_time_grain_tests.py:
##########
@@ -0,0 +1,361 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Tests for annotation layer time grain handling."""
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from superset.common.query_context import QueryContext
+from superset.common.query_context_processor import QueryContextProcessor
+from superset.common.query_object import QueryObject
+from superset.connectors.sqla.models import SqlaTable
+from superset.models.core import Database
+from superset.utils.core import DatasourceType
+from tests.integration_tests.base_tests import SupersetTestCase
+
+
+class TestAnnotationTimeGrain(SupersetTestCase):
+ """
+ Test that annotation layers correctly handle time grain values (P1D, PT1H,
etc.)
+ and prevent them from being treated as physical column names.
+ """
+
+ def test_is_time_grain_value_identifies_valid_patterns(self):
+ """Test _is_time_grain_value correctly identifies time grain
patterns."""
+ # Valid time grain patterns
+ assert QueryContextProcessor._is_time_grain_value("P1D")
+ assert QueryContextProcessor._is_time_grain_value("P1M")
+ assert QueryContextProcessor._is_time_grain_value("P1Y")
+ assert QueryContextProcessor._is_time_grain_value("P1W")
+ assert QueryContextProcessor._is_time_grain_value("PT1H")
+ assert QueryContextProcessor._is_time_grain_value("PT30M")
+ assert QueryContextProcessor._is_time_grain_value("PT1S")
+ assert QueryContextProcessor._is_time_grain_value("P3M") # Quarter
+ assert QueryContextProcessor._is_time_grain_value("PT0.5H") # Half
hour
+ assert QueryContextProcessor._is_time_grain_value("P0.25Y") # Quarter
year
+
+ # Week with epoch formats
+ assert QueryContextProcessor._is_time_grain_value(
+ "1969-12-28T00:00:00Z/P1W"
+ ) # Week starting Sunday
+ assert QueryContextProcessor._is_time_grain_value(
+ "1969-12-29T00:00:00Z/P1W"
+ ) # Week starting Monday
+ assert QueryContextProcessor._is_time_grain_value(
+ "P1W/1970-01-03T00:00:00Z"
+ ) # Week ending Saturday
+ assert QueryContextProcessor._is_time_grain_value(
+ "P1W/1970-01-04T00:00:00Z"
+ ) # Week ending Sunday
+
+ # Invalid patterns (normal column names)
+ assert not QueryContextProcessor._is_time_grain_value("country")
+ assert not QueryContextProcessor._is_time_grain_value("__time")
+ assert not QueryContextProcessor._is_time_grain_value("P") #
Incomplete
+ assert not QueryContextProcessor._is_time_grain_value("1D") # Missing
P
+ assert not QueryContextProcessor._is_time_grain_value("PD1") # Wrong
order
+ assert not QueryContextProcessor._is_time_grain_value("")
+ assert not QueryContextProcessor._is_time_grain_value(None)
+ assert not QueryContextProcessor._is_time_grain_value(123)
+ assert not QueryContextProcessor._is_time_grain_value(["P1D"])
+
+ @patch("superset.common.query_context_processor.ChartDAO")
+ @patch("superset.common.query_context_processor.ChartDataCommand")
Review Comment:
**Suggestion:** The patch target for `ChartDataCommand` in this test points
to `superset.common.query_context_processor.ChartDataCommand`, but
`get_viz_annotation_data` imports `ChartDataCommand` from
`superset.commands.chart.data.get_data_command`, so the mock never intercepts
the real class and the test may execute the real command (potentially hitting
the DB or other side effects) instead of the mocked one. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Tests may execute real ChartDataCommand hitting database.
- ⚠️ Annotation tests become flaky or slow.
- ⚠️ CI runs could include unintended side effects.
```
</details>
```suggestion
@patch("superset.commands.chart.data.get_data_command.ChartDataCommand")
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open test function in
tests/integration_tests/annotation_time_grain_tests.py at lines
76-80 where the decorators patch ChartDAO and ChartDataCommand.
2. Run this single test (e.g., pytest
tests/integration_tests/annotation_time_grain_tests.py::TestAnnotationTimeGrain::test_annotation_time_grain_override_removes_time_grain_from_columns).
3. At runtime, QueryContextProcessor.get_viz_annotation_data
(superset/common/query_context_processor.py starting at line 462) executes;
it performs a
local import: `from superset.commands.chart.data.get_data_command import
ChartDataCommand`.
4. Because the function imports ChartDataCommand from
superset.commands.chart.data.get_data_command at call time, the decorator
patching
superset.common.query_context_processor.ChartDataCommand (lines 76-77 in the
test) does
not replace the class actually imported, so the real ChartDataCommand
implementation is
used instead of the MagicMock. This causes the test to call the real command
rather than
the mocked one (observed behavior when running the test).
5. Result: the test either exercises the real ChartDataCommand (potential
DB/external side
effects) or fails because the test expected the mock to be used. The correct
repro is
running the test and observing that mock_chart_data_command.return_value
settings in the
test are ignored.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** tests/integration_tests/annotation_time_grain_tests.py
**Line:** 77:77
**Comment:**
*Logic Error: The patch target for `ChartDataCommand` in this test
points to `superset.common.query_context_processor.ChartDataCommand`, but
`get_viz_annotation_data` imports `ChartDataCommand` from
`superset.commands.chart.data.get_data_command`, so the mock never intercepts
the real class and the test may execute the real command (potentially hitting
the DB or other side effects) instead of the mocked one.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
tests/integration_tests/annotation_time_grain_tests.py:
##########
@@ -0,0 +1,361 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Tests for annotation layer time grain handling."""
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from superset.common.query_context import QueryContext
+from superset.common.query_context_processor import QueryContextProcessor
+from superset.common.query_object import QueryObject
+from superset.connectors.sqla.models import SqlaTable
+from superset.models.core import Database
+from superset.utils.core import DatasourceType
+from tests.integration_tests.base_tests import SupersetTestCase
+
+
+class TestAnnotationTimeGrain(SupersetTestCase):
+ """
+ Test that annotation layers correctly handle time grain values (P1D, PT1H,
etc.)
+ and prevent them from being treated as physical column names.
+ """
+
+ def test_is_time_grain_value_identifies_valid_patterns(self):
+ """Test _is_time_grain_value correctly identifies time grain
patterns."""
+ # Valid time grain patterns
+ assert QueryContextProcessor._is_time_grain_value("P1D")
+ assert QueryContextProcessor._is_time_grain_value("P1M")
+ assert QueryContextProcessor._is_time_grain_value("P1Y")
+ assert QueryContextProcessor._is_time_grain_value("P1W")
+ assert QueryContextProcessor._is_time_grain_value("PT1H")
+ assert QueryContextProcessor._is_time_grain_value("PT30M")
+ assert QueryContextProcessor._is_time_grain_value("PT1S")
+ assert QueryContextProcessor._is_time_grain_value("P3M") # Quarter
+ assert QueryContextProcessor._is_time_grain_value("PT0.5H") # Half
hour
+ assert QueryContextProcessor._is_time_grain_value("P0.25Y") # Quarter
year
+
+ # Week with epoch formats
+ assert QueryContextProcessor._is_time_grain_value(
+ "1969-12-28T00:00:00Z/P1W"
+ ) # Week starting Sunday
+ assert QueryContextProcessor._is_time_grain_value(
+ "1969-12-29T00:00:00Z/P1W"
+ ) # Week starting Monday
+ assert QueryContextProcessor._is_time_grain_value(
+ "P1W/1970-01-03T00:00:00Z"
+ ) # Week ending Saturday
+ assert QueryContextProcessor._is_time_grain_value(
+ "P1W/1970-01-04T00:00:00Z"
+ ) # Week ending Sunday
+
+ # Invalid patterns (normal column names)
+ assert not QueryContextProcessor._is_time_grain_value("country")
+ assert not QueryContextProcessor._is_time_grain_value("__time")
+ assert not QueryContextProcessor._is_time_grain_value("P") #
Incomplete
+ assert not QueryContextProcessor._is_time_grain_value("1D") # Missing
P
+ assert not QueryContextProcessor._is_time_grain_value("PD1") # Wrong
order
+ assert not QueryContextProcessor._is_time_grain_value("")
+ assert not QueryContextProcessor._is_time_grain_value(None)
+ assert not QueryContextProcessor._is_time_grain_value(123)
+ assert not QueryContextProcessor._is_time_grain_value(["P1D"])
+
+ @patch("superset.common.query_context_processor.ChartDAO")
+ @patch("superset.common.query_context_processor.ChartDataCommand")
+ def test_annotation_time_grain_override_removes_time_grain_from_columns(
+ self, mock_chart_data_command, mock_chart_dao
+ ):
+ """
+ Test that when annotation overrides include time_grain_sqla, any time
grain
+ values in the columns list are removed to prevent column-not-found
errors.
+ """
+ # Setup mock chart with form_data containing time grain in columns
+ mock_chart = MagicMock()
+ mock_chart.id = 1
+ mock_chart.viz_type = "line"
+ mock_chart.form_data = {
+ "datasource": "1__table",
+ "columns": ["country", "P1D"], # Bad data: time grain in columns
+ }
+ mock_chart.datasource = MagicMock(spec=SqlaTable)
+ mock_chart.datasource.type = DatasourceType.TABLE
+ mock_chart.datasource.id = 1
+
+ # Setup mock query_context
+ mock_query_context = MagicMock()
+ mock_query_object = MagicMock(spec=QueryObject)
+ mock_query_object.columns = ["country", "P1D"]
+ mock_query_object.extras = {}
+ mock_query_context.queries = [mock_query_object]
+
+ mock_chart.get_query_context.return_value = mock_query_context
+ mock_chart_dao.find_by_id.return_value = mock_chart
+
+ # Setup mock ChartDataCommand
+ mock_command_instance = MagicMock()
+ mock_command_instance.run.return_value = {"queries": [{"data": []}]}
+ mock_chart_data_command.return_value = mock_command_instance
+
+ # Create annotation layer with time grain override
+ annotation_layer = {
+ "name": "test_annotation",
+ "value": 1, # chart_id
+ "sourceType": "line",
+ "overrides": {"time_grain_sqla": "P1D"},
+ }
+
+ # Call the method
+ result = QueryContextProcessor.get_viz_annotation_data(
+ annotation_layer, force=False
+ )
+
+ # Verify that P1D was removed from columns
+ assert mock_query_object.columns == [
+ "country"
+ ], "Time grain P1D should be removed from columns"
+ assert (
+ mock_query_object.extras["time_grain_sqla"] == "P1D"
+ ), "Time grain should be in extras"
+
+ # Verify command was called with cleaned query_object
+ mock_command_instance.validate.assert_called_once()
+ mock_command_instance.run.assert_called_once()
+
+ @patch("superset.common.query_context_processor.ChartDAO")
+ @patch("superset.common.query_context_processor.ChartDataCommand")
Review Comment:
**Suggestion:** In this test, `ChartDataCommand` is again patched via
`superset.common.query_context_processor.ChartDataCommand`, which does not
match the import used inside `get_viz_annotation_data`, so the mock is
ineffective and the real command is executed during the test instead of the
intended MagicMock. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ Tests may run real ChartDataCommand causing DB calls.
- ⚠️ Test suite stability reduced; flakiness possible.
- ⚠️ Debugging CI failures becomes harder.
```
</details>
```suggestion
@patch("superset.commands.chart.data.get_data_command.ChartDataCommand")
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open the test at tests/integration_tests/annotation_time_grain_tests.py
lines 137-141
where ChartDataCommand is patched in the decorator.
2. Execute pytest for this test: pytest
tests/integration_tests/annotation_time_grain_tests.py::TestAnnotationTimeGrain::test_annotation_without_time_grain_override_preserves_columns.
3. During the test, get_viz_annotation_data in
superset/common/query_context_processor.py
(function starting at line 462) performs an explicit import: `from
superset.commands.chart.data.get_data_command import ChartDataCommand`.
4. The decorator in the test patches
superset.common.query_context_processor.ChartDataCommand, but because the
function imports
ChartDataCommand locally from a different module, the patch does not
intercept the class
used by the function; the real ChartDataCommand is instantiated and run
instead of the
test's MagicMock.
5. Observe that mock_chart_data_command.return_value (set later in the test)
is not used
and the test ends up running the real command, reproducing the issue.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** tests/integration_tests/annotation_time_grain_tests.py
**Line:** 138:138
**Comment:**
*Logic Error: In this test, `ChartDataCommand` is again patched via
`superset.common.query_context_processor.ChartDataCommand`, which does not
match the import used inside `get_viz_annotation_data`, so the mock is
ineffective and the real command is executed during the test instead of the
intended MagicMock.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]