codeant-ai-for-open-source[bot] commented on code in PR #41080:
URL: https://github.com/apache/superset/pull/41080#discussion_r3417121379
##########
superset/utils/webdriver.py:
##########
@@ -254,8 +252,7 @@ def get_screenshot( # pylint: disable=too-many-locals,
too-many-statements # n
context.set_default_timeout(
app.config["SCREENSHOT_PLAYWRIGHT_DEFAULT_TIMEOUT"]
)
- if user:
- self.auth(user, context)
+ self.auth(user, context)
Review Comment:
**Suggestion:** `auth` returns a `BrowserContext`, but the returned value is
ignored; this violates the function contract and breaks custom auth overrides
that return a wrapped/replaced context. Assign the returned context and create
the page from that authenticated context. [api mismatch]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Custom WEBDRIVER_AUTH_FUNC overrides ignored for Playwright.
- ⚠️ Playwright screenshots may run unauthenticated and fail.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. During app initialization, `MachineAuthProviderFactory.init_app` in
`superset/utils/machine_auth.py:5-9` constructs a `MachineAuthProvider` with
`app.config["WEBDRIVER_AUTH_FUNC"]` as `auth_webdriver_func_override`,
allowing
deployments to inject a custom auth function via config.
2. For Playwright-based screenshots, `WebDriverPlaywright.get_screenshot` in
`superset/utils/webdriver.py:27-56` creates a context with `context =
browser.new_context(...)`, sets its default timeout, then calls
`self.auth(user, context)`
at line 255 before creating `page = context.new_page()`.
3. `WebDriverPlaywright.auth` at `superset/utils/webdriver.py:20-23`
delegates to
`machine_auth_provider_factory.instance.authenticate_browser_context(context,
user)` and
returns its result, while `MachineAuthProvider.authenticate_browser_context`
at
`superset/utils/machine_auth.py:11-18` explicitly returns either the original
`browser_context` or the result of a user-provided `WEBDRIVER_AUTH_FUNC`
override.
4. Because `get_screenshot` ignores the return value of `self.auth(user,
context)` and
continues using the original `context`, any custom `WEBDRIVER_AUTH_FUNC`
that returns a
wrapped or replacement `BrowserContext` is not honored; the subsequent `page
=
context.new_page()` runs on the unauthenticated context, causing screenshots
for secured
URLs to be taken with the wrong authentication state (e.g., redirected to
`/login` instead
of the target dashboard).
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=ec86f5a7e0224894ae6ebcb915061dab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=ec86f5a7e0224894ae6ebcb915061dab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/webdriver.py
**Line:** 255:255
**Comment:**
*Api Mismatch: `auth` returns a `BrowserContext`, but the returned
value is ignored; this violates the function contract and breaks custom auth
overrides that return a wrapped/replaced context. Assign the returned context
and create the page from that authenticated context.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=e3a1fd760560862841c9a4faba8677957d5fbd7ca5ef5b91ecbbd41e0f70f220&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=e3a1fd760560862841c9a4faba8677957d5fbd7ca5ef5b91ecbbd41e0f70f220&reaction=dislike'>👎</a>
##########
superset/utils/webdriver.py:
##########
@@ -682,26 +632,15 @@ def find_unexpected_errors(driver: WebDriver) ->
list[str]:
return error_messages
- def get_screenshot( # noqa: C901
- self, url: str, element_name: str, user: User | None = None
- ) -> bytes | None:
- # If a user is passed explicitly and differs from the stored user,
- # update and re-authenticate
- if user and user != self._user:
- self._user = user
- if self._driver:
- self._destroy()
- driver = self.driver
+ def get_screenshot(self, url: str, element_name: str, user: User) -> bytes
| None: # noqa: C901
Review Comment:
**Suggestion:** Changing the screenshot method contract to require a `user`
breaks existing production callers that still invoke it with only `(url,
element_name)` (for example cache warmup), which will now raise `TypeError` at
runtime. Keep backward compatibility or update all callers in the same change.
[api mismatch]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Celery cache_warmup task crashes with TypeError.
- ⚠️ Dashboard cache warming disabled; dashboards served from cold cache.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Trigger the Celery task `cache_warmup` defined at
`superset/tasks/cache.py:8-17` (e.g.,
via `cache_warmup.delay("TopNDashboardsStrategy")`).
2. Observe that `cache_warmup` looks up the warmup user at
`superset/tasks/cache.py:39-55`
and instantiates the driver with `wd =
WebDriverSelenium(current_app.config["WEBDRIVER_TYPE"], user=user)` at line
57, while
`WebDriverSelenium` inherits `__init__(self, driver_type: str, window:
WindowSize | None =
None)` from `WebDriverProxy` at `superset/utils/webdriver.py:5-13` and no
longer accepts a
`user` parameter.
3. The `wd.get_screenshot(url, "grid-container")` call at
`superset/tasks/cache.py:63`
passes only `(url, element_name)` to `WebDriverSelenium.get_screenshot`,
whose new
signature is `def get_screenshot(self, url: str, element_name: str, user:
User) -> bytes |
None` at `superset/utils/webdriver.py:635`, so even after fixing the
constructor call a
`TypeError: get_screenshot() missing 1 required positional argument: 'user'`
would be
raised.
4. As written in this PR, running `cache_warmup` in any environment where
`SUPERSET_CACHE_WARMUP_USER` is configured will immediately fail with
`TypeError` due to
these API mismatches between `cache_warmup` and
`WebDriverSelenium.get_screenshot`,
preventing cache warmup from completing.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=4937a804564a41fabe36f3f7296343b3&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=4937a804564a41fabe36f3f7296343b3&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/webdriver.py
**Line:** 635:635
**Comment:**
*Api Mismatch: Changing the screenshot method contract to require a
`user` breaks existing production callers that still invoke it with only `(url,
element_name)` (for example cache warmup), which will now raise `TypeError` at
runtime. Keep backward compatibility or update all callers in the same change.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=d9e7bb0e0ca6af3b23304a3218901f968b8c2b109722f2378bfb820806bef5c4&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=d9e7bb0e0ca6af3b23304a3218901f968b8c2b109722f2378bfb820806bef5c4&reaction=dislike'>👎</a>
##########
superset/utils/webdriver.py:
##########
@@ -682,26 +632,15 @@ def find_unexpected_errors(driver: WebDriver) ->
list[str]:
return error_messages
- def get_screenshot( # noqa: C901
- self, url: str, element_name: str, user: User | None = None
- ) -> bytes | None:
- # If a user is passed explicitly and differs from the stored user,
- # update and re-authenticate
- if user and user != self._user:
- self._user = user
- if self._driver:
- self._destroy()
- driver = self.driver
+ def get_screenshot(self, url: str, element_name: str, user: User) -> bytes
| None: # noqa: C901
+ driver = self.auth(user)
+ driver.set_window_size(*self._window)
driver.get(url)
img: bytes | None = None
selenium_headstart = app.config["SCREENSHOT_SELENIUM_HEADSTART"]
logger.debug("Sleeping for %i seconds", selenium_headstart)
sleep(selenium_headstart)
- # WebDriver cleanup is intentionally not performed in this method.
When the
- # driver is used persistently (e.g., cache warmup), cleanup is handled
- # externally via destroy(). When used for one-off screenshots, the
caller or
- # __del__ handles cleanup.
try:
Review Comment:
**Suggestion:** Driver creation/auth/navigation runs before entering the
`try/finally` block, so failures in auth, `set_window_size`, or initial `get`
can bypass `destroy` and leak browser processes. Move driver setup inside the
guarded block (or add an outer `try/finally`) so cleanup is guaranteed on all
failure paths. [resource leak]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Failed Selenium screenshots leak browser processes.
- ⚠️ Repeated failures can exhaust worker resources.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Report execution uses screenshots via
`screenshot.get_screenshot(user=user)` in
`superset/commands/report/execute.py:18-21`, where `DashboardScreenshot`
instances are
created and iterated to generate images.
2. For Selenium-backed screenshots, `DashboardScreenshot.get_screenshot` in
`superset/utils/screenshots.py:12-17` obtains a `WebDriverSelenium` instance
and calls
`driver.get_screenshot(self.url, self.element, user)`, delegating to
`WebDriverSelenium.get_screenshot` at `superset/utils/webdriver.py:635-741`.
3. In `WebDriverSelenium.get_screenshot`, driver setup is performed before
the
`try/finally`: `driver = self.auth(user)` (line 636),
`driver.set_window_size(*self._window)` (line 637), and `driver.get(url)`
(line 638), with
the `try:` block only starting at line 644 and the cleanup
`self.destroy(driver,
app.config["SCREENSHOT_SELENIUM_RETRIES"])` occurring in the `finally` at
line 741.
4. If `driver.set_window_size` or `driver.get(url)` raises a
`WebDriverException` (e.g.,
due to invalid window configuration or network/browser failure), the
exception occurs
before entering the `try` block, so the `finally` block is never executed and
`self.destroy` is never called, leaking the browser process for each such
failure during
report screenshot execution.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=ac1ca2347523454299ab354b4a0c3e49&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=ac1ca2347523454299ab354b4a0c3e49&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/webdriver.py
**Line:** 636:644
**Comment:**
*Resource Leak: Driver creation/auth/navigation runs before entering
the `try/finally` block, so failures in auth, `set_window_size`, or initial
`get` can bypass `destroy` and leak browser processes. Move driver setup inside
the guarded block (or add an outer `try/finally`) so cleanup is guaranteed on
all failure paths.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=af1c05e62ca89a86e5a3fa77629677a460c04b55011072ceeb3942c75db4c84b&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=af1c05e62ca89a86e5a3fa77629677a460c04b55011072ceeb3942c75db4c84b&reaction=dislike'>👎</a>
##########
superset/utils/webdriver.py:
##########
@@ -602,29 +556,25 @@ def _create(self) -> WebDriver:
logger.debug("Init selenium driver")
return driver_class(**kwargs)
- def _auth(self, user: User) -> None:
- """Authenticate the persistent driver in-place."""
- if self._driver is None:
- raise RuntimeError("WebDriver is not initialized")
- machine_auth_provider_factory.instance.authenticate_webdriver(
- self._driver, user
+ def auth(self, user: User) -> WebDriver:
+ driver = self.create()
+ return machine_auth_provider_factory.instance.authenticate_webdriver(
+ driver, user
)
- def _destroy(self, tries: int = 2) -> None:
- """Destroy the persistent driver"""
- if not self._driver:
- return
+ @staticmethod
+ def destroy(driver: WebDriver, tries: int = 2) -> None:
Review Comment:
**Suggestion:** The new `destroy` signature now requires a `driver`
argument, but existing callers still call `destroy()` on the instance with no
arguments, which will raise `TypeError` during cleanup paths. Either preserve
the old instance API or migrate all callers together. [api mismatch]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Cache warmup teardown raises TypeError in finally block.
- ⚠️ Selenium drivers may not be closed during warmup.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Inspect the Selenium destroy helper in `superset/utils/webdriver.py`:
`@staticmethod
def destroy(driver: WebDriver, tries: int = 2) -> None` is defined at line
566 and expects
a `driver` argument.
2. In the cache warmup task at `superset/tasks/cache.py:57-70`, a
`WebDriverSelenium`
instance is created and later cleaned up in the `finally` block with
`wd.destroy()` at
line 69, passing no arguments despite the new signature requiring `driver`.
3. Once the constructor/get_screenshot API mismatches are corrected so that
`cache_warmup`
can execute its `try` block, control will always enter the `finally` block,
where
`wd.destroy()` attempts to call the staticmethod without the required
`driver` parameter,
raising `TypeError: destroy() missing 1 required positional argument:
'driver'`.
4. This teardown failure in `cache_warmup` not only causes the Celery task
to error during
cleanup but also prevents proper driver shutdown from `cache_warmup`,
leaving Selenium
instances alive and masking the original outcome of the warmup loop.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=bac64887df21401e85b4ddda3b5379d6&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=bac64887df21401e85b4ddda3b5379d6&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/webdriver.py
**Line:** 566:566
**Comment:**
*Api Mismatch: The new `destroy` signature now requires a `driver`
argument, but existing callers still call `destroy()` on the instance with no
arguments, which will raise `TypeError` during cleanup paths. Either preserve
the old instance API or migrate all callers together.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=9ef2ac5663c18577b336a4f6a345027cecfb61b2126427e638725d8bf75f958b&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41080&comment_hash=9ef2ac5663c18577b336a4f6a345027cecfb61b2126427e638725d8bf75f958b&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]