codeant-ai-for-open-source[bot] commented on code in PR #37042:
URL: https://github.com/apache/superset/pull/37042#discussion_r2717486171


##########
superset/commands/chart/delete.py:
##########
@@ -68,3 +74,140 @@ def validate(self) -> None:
                 security_manager.raise_for_ownership(model)
             except SupersetSecurityException as ex:
                 raise ChartForbiddenError() from ex
+
+    def _cleanup_dashboard_metadata(  # noqa: C901
+        self, chart_id: int
+    ) -> None:
+        """
+        Remove references to this chart from all dashboard metadata.
+
+        When a chart is deleted, dashboards may still contain references to the
+        chart ID in various metadata fields (expanded_slices, filter_scopes, 
etc.).
+        This method cleans up those references to prevent issues during 
dashboard
+        export/import.
+        """
+
+        # First check how many dashboards contain this chart
+        dashboard_count = (
+            db.session.query(func.count(Dashboard.id))
+            .filter(Dashboard.slices.any(id=chart_id))  # type: 
ignore[attr-defined]
+            .scalar()
+        )
+
+        if dashboard_count == 0:
+            return
+
+        # Log warning if cleaning up many dashboards
+        if dashboard_count > 100:
+            logger.warning(
+                "Chart %s is on %d dashboards. "
+                "Cleaning up metadata may take some time.",
+                chart_id,
+                dashboard_count,
+            )
+
+        # Use a reasonable limit to prevent memory issues with extremely 
popular charts
+        if dashboard_count > (safety_limit := 1000):
+            logger.error(
+                "Chart %s is on %d dashboards (exceeds safety limit of %d). "
+                "Skipping metadata cleanup. "
+                "Manual intervention may be required for export/import.",
+                chart_id,
+                dashboard_count,
+                safety_limit,
+            )
+            return
+
+        # Query only ID and json_metadata (not full Dashboard objects)
+        dashboards_to_update = (
+            db.session.query(Dashboard.id, Dashboard.json_metadata)
+            .filter(Dashboard.slices.any(id=chart_id))  # type: 
ignore[attr-defined]
+            .all()
+        )
+
+        for dashboard_id, json_metadata_str in dashboards_to_update:
+            if not json_metadata_str:
+                continue
+
+            try:
+                metadata = json.loads(json_metadata_str)
+            except (json.JSONDecodeError, TypeError):
+                logger.warning(
+                    "Could not parse metadata for dashboard %s", dashboard_id
+                )
+                continue
+
+            modified = False
+
+            # Clean up expanded_slices
+            if "expanded_slices" in metadata:
+                chart_id_str = str(chart_id)
+                if chart_id_str in metadata["expanded_slices"]:
+                    del metadata["expanded_slices"][chart_id_str]
+                    modified = True
+
+            # Clean up timed_refresh_immune_slices
+            if "timed_refresh_immune_slices" in metadata:
+                if chart_id in metadata["timed_refresh_immune_slices"]:
+                    metadata["timed_refresh_immune_slices"].remove(chart_id)
+                    modified = True
+
+            # Clean up filter_scopes
+            if "filter_scopes" in metadata:
+                chart_id_str = str(chart_id)
+                if chart_id_str in metadata["filter_scopes"]:
+                    del metadata["filter_scopes"][chart_id_str]
+                    modified = True
+                # Also clean from immune lists
+                for filter_scope in metadata["filter_scopes"].values():
+                    for attributes in filter_scope.values():
+                        if chart_id in attributes.get("immune", []):
+                            attributes["immune"].remove(chart_id)
+                            modified = True
+
+            # Clean up default_filters
+            if "default_filters" in metadata:
+                default_filters = json.loads(metadata["default_filters"])
+                chart_id_str = str(chart_id)
+                if chart_id_str in default_filters:
+                    del default_filters[chart_id_str]
+                    metadata["default_filters"] = json.dumps(default_filters)
+                    modified = True
+
+            # Clean up native_filter_configuration scope exclusions
+            if "native_filter_configuration" in metadata:
+                for native_filter in metadata["native_filter_configuration"]:
+                    scope_excluded = native_filter.get("scope", 
{}).get("excluded", [])
+                    if chart_id in scope_excluded:
+                        scope_excluded.remove(chart_id)

Review Comment:
   **Suggestion:** The code compares and removes `chart_id` as an int from 
lists that may store IDs as strings (and vice versa), so removals can silently 
fail (logic bug). Normalize comparisons by checking and removing both the 
integer and string representations to ensure the chart ID is removed regardless 
of stored type. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Dashboard export/importers/v1 may fail on orphaned references.
   - ❌ Native filter exclusion lists remain stale.
   - ⚠️ Chart deletion metadata cleanup incomplete.
   ```
   </details>
   
   ```suggestion
                       scope = native_filter.get("scope", {})
                       scope_excluded = scope.get("excluded", [])
                       removed = False
                       # remove both int and string representations if present
                       if chart_id in scope_excluded:
                           scope_excluded.remove(chart_id)
                           removed = True
                       chart_id_str = str(chart_id)
                       if chart_id_str in scope_excluded:
                           scope_excluded.remove(chart_id_str)
                           removed = True
                       if removed:
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Create a dashboard whose json_metadata["native_filter_configuration"] 
contains an entry
   where scope.excluded is a list of string IDs (e.g., ["123"]) rather than 
ints. The cleanup
   loop is at superset/commands/chart/delete.py:177-183 in
   DeleteChartCommand._cleanup_dashboard_metadata (function start at
   superset/commands/chart/delete.py:78).
   
   2. Invoke DeleteChartCommand.run with the chart id 123 present on that 
dashboard
   (superset/commands/chart/delete.py:49). The run method calls 
_cleanup_dashboard_metadata
   at superset/commands/chart/delete.py:52-54.
   
   3. The current code checks `if chart_id in scope_excluded` at
   superset/commands/chart/delete.py:180. Because scope_excluded contains 
string "123" and
   chart_id is the integer 123, the test is False; no removal occurs and 
modified remains
   False for that dashboard.
   
   4. The string-ID exclusion remains in the dashboard metadata and is not 
cleaned. This
   leaves an orphaned reference that can later trigger import-time KeyError in 
the importer
   (see PR description for superset/commands/dashboard/importers/v1/utils.py). 
The behavior
   is concrete and reproducible using string-stored IDs.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/commands/chart/delete.py
   **Line:** 180:182
   **Comment:**
        *Logic Error: The code compares and removes `chart_id` as an int from 
lists that may store IDs as strings (and vice versa), so removals can silently 
fail (logic bug). Normalize comparisons by checking and removing both the 
integer and string representations to ensure the chart ID is removed regardless 
of stored type.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to