codeant-ai-for-open-source[bot] commented on code in PR #41080:
URL: https://github.com/apache/superset/pull/41080#discussion_r3417121379


##########
superset/utils/webdriver.py:
##########
@@ -254,8 +252,7 @@ def get_screenshot(  # pylint: disable=too-many-locals, 
too-many-statements  # n
             context.set_default_timeout(
                 app.config["SCREENSHOT_PLAYWRIGHT_DEFAULT_TIMEOUT"]
             )
-            if user:
-                self.auth(user, context)
+            self.auth(user, context)

Review Comment:
   **Suggestion:** `auth` returns a `BrowserContext`, but the returned value is 
ignored; this violates the function contract and breaks custom auth overrides 
that return a wrapped/replaced context. Assign the returned context and create 
the page from that authenticated context. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Custom WEBDRIVER_AUTH_FUNC overrides ignored for Playwright.
   - ⚠️ Playwright screenshots may run unauthenticated and fail.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. During app initialization, `MachineAuthProviderFactory.init_app` in
   `superset/utils/machine_auth.py:5-9` constructs a `MachineAuthProvider` with
   `app.config["WEBDRIVER_AUTH_FUNC"]` as `auth_webdriver_func_override`, 
allowing
   deployments to inject a custom auth function via config.
   
   2. For Playwright-based screenshots, `WebDriverPlaywright.get_screenshot` in
   `superset/utils/webdriver.py:27-56` creates a context with `context =
   browser.new_context(...)`, sets its default timeout, then calls 
`self.auth(user, context)`
   at line 255 before creating `page = context.new_page()`.
   
   3. `WebDriverPlaywright.auth` at `superset/utils/webdriver.py:20-23` 
delegates to
   
`machine_auth_provider_factory.instance.authenticate_browser_context(context, 
user)` and
   returns its result, while `MachineAuthProvider.authenticate_browser_context` 
at
   `superset/utils/machine_auth.py:11-18` explicitly returns either the original
   `browser_context` or the result of a user-provided `WEBDRIVER_AUTH_FUNC` 
override.
   
   4. Because `get_screenshot` ignores the return value of `self.auth(user, 
context)` and
   continues using the original `context`, any custom `WEBDRIVER_AUTH_FUNC` 
that returns a
   wrapped or replacement `BrowserContext` is not honored; the subsequent `page 
=
   context.new_page()` runs on the unauthenticated context, causing screenshots 
for secured
   URLs to be taken with the wrong authentication state (e.g., redirected to 
`/login` instead
   of the target dashboard).
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=ec86f5a7e0224894ae6ebcb915061dab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=ec86f5a7e0224894ae6ebcb915061dab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/webdriver.py
   **Line:** 255:255
   **Comment:**
        *Api Mismatch: `auth` returns a `BrowserContext`, but the returned 
value is ignored; this violates the function contract and breaks custom auth 
overrides that return a wrapped/replaced context. Assign the returned context 
and create the page from that authenticated context.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=e3a1fd760560862841c9a4faba8677957d5fbd7ca5ef5b91ecbbd41e0f70f220&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=e3a1fd760560862841c9a4faba8677957d5fbd7ca5ef5b91ecbbd41e0f70f220&reaction=dislike'>👎</a>



##########
superset/utils/webdriver.py:
##########
@@ -682,26 +632,15 @@ def find_unexpected_errors(driver: WebDriver) -> 
list[str]:
 
         return error_messages
 
-    def get_screenshot(  # noqa: C901
-        self, url: str, element_name: str, user: User | None = None
-    ) -> bytes | None:
-        # If a user is passed explicitly and differs from the stored user,
-        # update and re-authenticate
-        if user and user != self._user:
-            self._user = user
-            if self._driver:
-                self._destroy()
-        driver = self.driver
+    def get_screenshot(self, url: str, element_name: str, user: User) -> bytes 
| None:  # noqa: C901

Review Comment:
   **Suggestion:** Changing the screenshot method contract to require a `user` 
breaks existing production callers that still invoke it with only `(url, 
element_name)` (for example cache warmup), which will now raise `TypeError` at 
runtime. Keep backward compatibility or update all callers in the same change. 
[api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Celery cache_warmup task crashes with TypeError.
   - ⚠️ Dashboard cache warming disabled; dashboards served from cold cache.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Trigger the Celery task `cache_warmup` defined at 
`superset/tasks/cache.py:8-17` (e.g.,
   via `cache_warmup.delay("TopNDashboardsStrategy")`).
   
   2. Observe that `cache_warmup` looks up the warmup user at 
`superset/tasks/cache.py:39-55`
   and instantiates the driver with `wd =
   WebDriverSelenium(current_app.config["WEBDRIVER_TYPE"], user=user)` at line 
57, while
   `WebDriverSelenium` inherits `__init__(self, driver_type: str, window: 
WindowSize | None =
   None)` from `WebDriverProxy` at `superset/utils/webdriver.py:5-13` and no 
longer accepts a
   `user` parameter.
   
   3. The `wd.get_screenshot(url, "grid-container")` call at 
`superset/tasks/cache.py:63`
   passes only `(url, element_name)` to `WebDriverSelenium.get_screenshot`, 
whose new
   signature is `def get_screenshot(self, url: str, element_name: str, user: 
User) -> bytes |
   None` at `superset/utils/webdriver.py:635`, so even after fixing the 
constructor call a
   `TypeError: get_screenshot() missing 1 required positional argument: 'user'` 
would be
   raised.
   
   4. As written in this PR, running `cache_warmup` in any environment where
   `SUPERSET_CACHE_WARMUP_USER` is configured will immediately fail with 
`TypeError` due to
   these API mismatches between `cache_warmup` and 
`WebDriverSelenium.get_screenshot`,
   preventing cache warmup from completing.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=4937a804564a41fabe36f3f7296343b3&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=4937a804564a41fabe36f3f7296343b3&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/webdriver.py
   **Line:** 635:635
   **Comment:**
        *Api Mismatch: Changing the screenshot method contract to require a 
`user` breaks existing production callers that still invoke it with only `(url, 
element_name)` (for example cache warmup), which will now raise `TypeError` at 
runtime. Keep backward compatibility or update all callers in the same change.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=d9e7bb0e0ca6af3b23304a3218901f968b8c2b109722f2378bfb820806bef5c4&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=d9e7bb0e0ca6af3b23304a3218901f968b8c2b109722f2378bfb820806bef5c4&reaction=dislike'>👎</a>



##########
superset/utils/webdriver.py:
##########
@@ -682,26 +632,15 @@ def find_unexpected_errors(driver: WebDriver) -> 
list[str]:
 
         return error_messages
 
-    def get_screenshot(  # noqa: C901
-        self, url: str, element_name: str, user: User | None = None
-    ) -> bytes | None:
-        # If a user is passed explicitly and differs from the stored user,
-        # update and re-authenticate
-        if user and user != self._user:
-            self._user = user
-            if self._driver:
-                self._destroy()
-        driver = self.driver
+    def get_screenshot(self, url: str, element_name: str, user: User) -> bytes 
| None:  # noqa: C901
+        driver = self.auth(user)
+        driver.set_window_size(*self._window)
         driver.get(url)
         img: bytes | None = None
         selenium_headstart = app.config["SCREENSHOT_SELENIUM_HEADSTART"]
         logger.debug("Sleeping for %i seconds", selenium_headstart)
         sleep(selenium_headstart)
 
-        # WebDriver cleanup is intentionally not performed in this method. 
When the
-        # driver is used persistently (e.g., cache warmup), cleanup is handled
-        # externally via destroy(). When used for one-off screenshots, the 
caller or
-        # __del__ handles cleanup.
         try:

Review Comment:
   **Suggestion:** Driver creation/auth/navigation runs before entering the 
`try/finally` block, so failures in auth, `set_window_size`, or initial `get` 
can bypass `destroy` and leak browser processes. Move driver setup inside the 
guarded block (or add an outer `try/finally`) so cleanup is guaranteed on all 
failure paths. [resource leak]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Failed Selenium screenshots leak browser processes.
   - ⚠️ Repeated failures can exhaust worker resources.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Report execution uses screenshots via 
`screenshot.get_screenshot(user=user)` in
   `superset/commands/report/execute.py:18-21`, where `DashboardScreenshot` 
instances are
   created and iterated to generate images.
   
   2. For Selenium-backed screenshots, `DashboardScreenshot.get_screenshot` in
   `superset/utils/screenshots.py:12-17` obtains a `WebDriverSelenium` instance 
and calls
   `driver.get_screenshot(self.url, self.element, user)`, delegating to
   `WebDriverSelenium.get_screenshot` at `superset/utils/webdriver.py:635-741`.
   
   3. In `WebDriverSelenium.get_screenshot`, driver setup is performed before 
the
   `try/finally`: `driver = self.auth(user)` (line 636),
   `driver.set_window_size(*self._window)` (line 637), and `driver.get(url)` 
(line 638), with
   the `try:` block only starting at line 644 and the cleanup 
`self.destroy(driver,
   app.config["SCREENSHOT_SELENIUM_RETRIES"])` occurring in the `finally` at 
line 741.
   
   4. If `driver.set_window_size` or `driver.get(url)` raises a 
`WebDriverException` (e.g.,
   due to invalid window configuration or network/browser failure), the 
exception occurs
   before entering the `try` block, so the `finally` block is never executed and
   `self.destroy` is never called, leaking the browser process for each such 
failure during
   report screenshot execution.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=ac1ca2347523454299ab354b4a0c3e49&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=ac1ca2347523454299ab354b4a0c3e49&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/webdriver.py
   **Line:** 636:644
   **Comment:**
        *Resource Leak: Driver creation/auth/navigation runs before entering 
the `try/finally` block, so failures in auth, `set_window_size`, or initial 
`get` can bypass `destroy` and leak browser processes. Move driver setup inside 
the guarded block (or add an outer `try/finally`) so cleanup is guaranteed on 
all failure paths.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=af1c05e62ca89a86e5a3fa77629677a460c04b55011072ceeb3942c75db4c84b&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=af1c05e62ca89a86e5a3fa77629677a460c04b55011072ceeb3942c75db4c84b&reaction=dislike'>👎</a>



##########
superset/utils/webdriver.py:
##########
@@ -602,29 +556,25 @@ def _create(self) -> WebDriver:
         logger.debug("Init selenium driver")
         return driver_class(**kwargs)
 
-    def _auth(self, user: User) -> None:
-        """Authenticate the persistent driver in-place."""
-        if self._driver is None:
-            raise RuntimeError("WebDriver is not initialized")
-        machine_auth_provider_factory.instance.authenticate_webdriver(
-            self._driver, user
+    def auth(self, user: User) -> WebDriver:
+        driver = self.create()
+        return machine_auth_provider_factory.instance.authenticate_webdriver(
+            driver, user
         )
 
-    def _destroy(self, tries: int = 2) -> None:
-        """Destroy the persistent driver"""
-        if not self._driver:
-            return
+    @staticmethod
+    def destroy(driver: WebDriver, tries: int = 2) -> None:

Review Comment:
   **Suggestion:** The new `destroy` signature now requires a `driver` 
argument, but existing callers still call `destroy()` on the instance with no 
arguments, which will raise `TypeError` during cleanup paths. Either preserve 
the old instance API or migrate all callers together. [api mismatch]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Cache warmup teardown raises TypeError in finally block.
   - ⚠️ Selenium drivers may not be closed during warmup.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Inspect the Selenium destroy helper in `superset/utils/webdriver.py`: 
`@staticmethod
   def destroy(driver: WebDriver, tries: int = 2) -> None` is defined at line 
566 and expects
   a `driver` argument.
   
   2. In the cache warmup task at `superset/tasks/cache.py:57-70`, a 
`WebDriverSelenium`
   instance is created and later cleaned up in the `finally` block with 
`wd.destroy()` at
   line 69, passing no arguments despite the new signature requiring `driver`.
   
   3. Once the constructor/get_screenshot API mismatches are corrected so that 
`cache_warmup`
   can execute its `try` block, control will always enter the `finally` block, 
where
   `wd.destroy()` attempts to call the staticmethod without the required 
`driver` parameter,
   raising `TypeError: destroy() missing 1 required positional argument: 
'driver'`.
   
   4. This teardown failure in `cache_warmup` not only causes the Celery task 
to error during
   cleanup but also prevents proper driver shutdown from `cache_warmup`, 
leaving Selenium
   instances alive and masking the original outcome of the warmup loop.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=bac64887df21401e85b4ddda3b5379d6&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=bac64887df21401e85b4ddda3b5379d6&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/webdriver.py
   **Line:** 566:566
   **Comment:**
        *Api Mismatch: The new `destroy` signature now requires a `driver` 
argument, but existing callers still call `destroy()` on the instance with no 
arguments, which will raise `TypeError` during cleanup paths. Either preserve 
the old instance API or migrate all callers together.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=9ef2ac5663c18577b336a4f6a345027cecfb61b2126427e638725d8bf75f958b&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=9ef2ac5663c18577b336a4f6a345027cecfb61b2126427e638725d8bf75f958b&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to