aminghadersohi commented on code in PR #38449:
URL: https://github.com/apache/superset/pull/38449#discussion_r3359599041


##########
superset/utils/webdriver.py:
##########
@@ -740,5 +794,9 @@ def get_screenshot(self, url: str, element_name: str, user: 
User) -> bytes | Non
             logger.warning("exception in webdriver", exc_info=ex)
             raise
         finally:

Review Comment:
   **NIT — No-op `finally: pass` with misleading placement**
   
   A `finally: pass` block executes nothing. The comment explaining the cleanup 
strategy is useful, but putting it inside an empty `finally` creates a false 
expectation that something happens here. The comment would be clearer as a note 
before the `try` block or in the method docstring — somewhere a reader 
encounters it before entering the try, not after wondering why the `finally` is 
empty.



##########
superset/config.py:
##########
@@ -1070,6 +1070,11 @@ class D3TimeFormat(TypedDict, total=False):
 }
 THUMBNAIL_ERROR_CACHE_TTL = int(timedelta(days=1).total_seconds())
 
+# Cache warmup user — must be set explicitly before enabling the cache-warmup
+# Celery task. Intentionally defaults to None so operators pick a dedicated
+# least-privilege user rather than inadvertently running warmup as "admin".
+SUPERSET_CACHE_WARMUP_USER: str | None = None

Review Comment:
   **HIGH — `CACHE_WARMUP_EXECUTORS` is now silently superseded; `UPDATING.md` 
needs a migration note**
   
   The new `cache_warmup` task reads only `SUPERSET_CACHE_WARMUP_USER` (this 
line). It no longer reads `CACHE_WARMUP_EXECUTORS` (line 1032, outside this 
diff hunk). Operators who had configured `CACHE_WARMUP_EXECUTORS` will silently 
get `"SUPERSET_CACHE_WARMUP_USER is not configured"` on their next warmup run 
with no indication that a config migration is needed.
   
   PR #31844 introduced `CACHE_WARMUP_EXECUTORS` via an `UPDATING.md` entry. 
This PR supersedes that for the warmup path without a corresponding note.
   
   Two things needed:
   1. An `UPDATING.md` entry explaining the migration (e.g. "Cache warmup 
authentication now uses `SUPERSET_CACHE_WARMUP_USER` instead of 
`CACHE_WARMUP_EXECUTORS`. Set `SUPERSET_CACHE_WARMUP_USER` to a least-privilege 
user before the next warmup run.").
   2. A comment near `CACHE_WARMUP_EXECUTORS` in `config.py` (line 1032) 
clarifying it is no longer used by the `cache-warmup` Celery task.



##########
superset/tasks/cache.py:
##########
@@ -307,25 +234,39 @@ def cache_warmup(
         logger.exception(message)
         return message
 
-    results: dict[str, list[str]] = {"scheduled": [], "errors": []}
-    for task in strategy.get_tasks():
-        username = task["username"]
-        payload = json.dumps(task["payload"])
-        if username:
+    results: dict[str, list[str]] = {"success": [], "errors": []}
+
+    warmup_username = current_app.config.get("SUPERSET_CACHE_WARMUP_USER")
+    if not warmup_username:
+        message = (
+            "SUPERSET_CACHE_WARMUP_USER is not configured. Set it to a 
dedicated "
+            "least-privilege user with access to the dashboards you want 
warmed up."
+        )
+        logger.error(message)
+        return message
+
+    user = security_manager.find_user(username=warmup_username)
+    if not user:
+        message = (
+            f"Cache warmup user '{warmup_username}' not found. Please 
configure "
+            "SUPERSET_CACHE_WARMUP_USER with a valid username."
+        )
+        logger.error(message)
+        return message
+
+    wd = WebDriverSelenium(current_app.config["WEBDRIVER_TYPE"], user=user)

Review Comment:
   **HIGH — No tests for the new `cache_warmup` execution path**
   
   The entire body of this function (config validation, user lookup, WebDriver 
creation, screenshot loop, error accumulation) has no test coverage. The 
deleted `test_cache.py` covered `fetch_url` — a task that no longer exists — so 
that deletion is valid. But nothing replaces it for this new path.
   
   The minimum I'd expect before merge:
   
   ```python
   @patch("superset.tasks.cache.WebDriverSelenium")
   def test_cache_warmup_missing_config(mock_wd, app_context):
       with patch.dict(current_app.config, {"SUPERSET_CACHE_WARMUP_USER": 
None}):
           result = cache_warmup("dummy")
       assert "SUPERSET_CACHE_WARMUP_USER is not configured" in result
       mock_wd.assert_not_called()
   
   @patch("superset.tasks.cache.security_manager")
   @patch("superset.tasks.cache.WebDriverSelenium")
   def test_cache_warmup_user_not_found(mock_wd, mock_sm, app_context):
       mock_sm.find_user.return_value = None
       with patch.dict(current_app.config, {"SUPERSET_CACHE_WARMUP_USER": 
"bot"}):
           result = cache_warmup("dummy")
       assert "not found" in result
       mock_wd.assert_not_called()
   ```
   
   A happy-path test (mock `WebDriverSelenium`, verify `get_screenshot` called 
per URL, verify `destroy()` called in `finally`) would complete the coverage.



##########
superset/tasks/cache.py:
##########
@@ -307,25 +234,39 @@ def cache_warmup(
         logger.exception(message)
         return message
 
-    results: dict[str, list[str]] = {"scheduled": [], "errors": []}
-    for task in strategy.get_tasks():
-        username = task["username"]
-        payload = json.dumps(task["payload"])
-        if username:
+    results: dict[str, list[str]] = {"success": [], "errors": []}
+
+    warmup_username = current_app.config.get("SUPERSET_CACHE_WARMUP_USER")
+    if not warmup_username:
+        message = (
+            "SUPERSET_CACHE_WARMUP_USER is not configured. Set it to a 
dedicated "
+            "least-privilege user with access to the dashboards you want 
warmed up."
+        )
+        logger.error(message)
+        return message
+
+    user = security_manager.find_user(username=warmup_username)
+    if not user:
+        message = (
+            f"Cache warmup user '{warmup_username}' not found. Please 
configure "
+            "SUPERSET_CACHE_WARMUP_USER with a valid username."
+        )
+        logger.error(message)
+        return message
+
+    wd = WebDriverSelenium(current_app.config["WEBDRIVER_TYPE"], user=user)
+
+    try:
+        for url in strategy.get_urls():
             try:
-                user = security_manager.get_user_by_username(username)
-                cookies = MachineAuthProvider.get_auth_cookies(user)
-                headers = {
-                    "Cookie": "session=%s" % cookies.get("session", ""),
-                    "Content-Type": "application/json",
-                }
-                logger.info("Scheduling %s", payload)
-                fetch_url.delay(payload, headers)
-                results["scheduled"].append(payload)
-            except SchedulingError:
-                logger.exception("Error scheduling fetch_url for payload: %s", 
payload)
-                results["errors"].append(payload)
-        else:
-            logger.warning("Executor not found for %s", payload)
+                logger.info("Fetching %s", url)
+                wd.get_screenshot(url, "grid-container")
+                results["success"].append(url)
+            except (WebDriverException, Exception) as ex:  # noqa: BLE001
+                logger.exception("Error warming up cache for %s: %s", url, ex)

Review Comment:
   **NIT — `logger.exception` redundant format argument**
   
   `logger.exception()` already attaches the current exception via 
`exc_info=True`, so the full traceback is always captured. Passing `ex` as a 
format argument just prepends `str(ex)` to the message, duplicating information 
already in the traceback. Consider:
   
   ```python
   logger.exception("Error warming up cache for %s", url)
   ```



##########
superset/tasks/cache.py:
##########
@@ -307,25 +234,39 @@ def cache_warmup(
         logger.exception(message)
         return message
 
-    results: dict[str, list[str]] = {"scheduled": [], "errors": []}
-    for task in strategy.get_tasks():
-        username = task["username"]
-        payload = json.dumps(task["payload"])
-        if username:
+    results: dict[str, list[str]] = {"success": [], "errors": []}
+
+    warmup_username = current_app.config.get("SUPERSET_CACHE_WARMUP_USER")
+    if not warmup_username:
+        message = (
+            "SUPERSET_CACHE_WARMUP_USER is not configured. Set it to a 
dedicated "
+            "least-privilege user with access to the dashboards you want 
warmed up."
+        )
+        logger.error(message)
+        return message
+
+    user = security_manager.find_user(username=warmup_username)
+    if not user:
+        message = (
+            f"Cache warmup user '{warmup_username}' not found. Please 
configure "
+            "SUPERSET_CACHE_WARMUP_USER with a valid username."
+        )
+        logger.error(message)
+        return message
+
+    wd = WebDriverSelenium(current_app.config["WEBDRIVER_TYPE"], user=user)
+
+    try:
+        for url in strategy.get_urls():
             try:
-                user = security_manager.get_user_by_username(username)
-                cookies = MachineAuthProvider.get_auth_cookies(user)
-                headers = {
-                    "Cookie": "session=%s" % cookies.get("session", ""),
-                    "Content-Type": "application/json",
-                }
-                logger.info("Scheduling %s", payload)
-                fetch_url.delay(payload, headers)
-                results["scheduled"].append(payload)
-            except SchedulingError:
-                logger.exception("Error scheduling fetch_url for payload: %s", 
payload)
-                results["errors"].append(payload)
-        else:
-            logger.warning("Executor not found for %s", payload)
+                logger.info("Fetching %s", url)
+                wd.get_screenshot(url, "grid-container")
+                results["success"].append(url)
+            except (WebDriverException, Exception) as ex:  # noqa: BLE001

Review Comment:
   **MEDIUM — Redundant exception type in catch tuple**
   
   `WebDriverException` is a subclass of `Exception`, so the first type in the 
tuple is dead code — `Exception` already catches it. The `# noqa: BLE001` is 
correct, but the tuple should be simplified:
   
   ```python
   except Exception as ex:  # noqa: BLE001
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to