codeant-ai-for-open-source[bot] commented on code in PR #35021: URL: https://github.com/apache/superset/pull/35021#discussion_r2873242319
########## superset-frontend/src/pages/RedirectWarning/utils.ts: ########## @@ -0,0 +1,94 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +const TRUSTED_URLS_KEY = 'superset_trusted_urls'; +const MAX_TRUSTED_URLS = 100; +const ALLOWED_SCHEMES = ['http:', 'https:']; + +/** + * Normalize a URL for comparison (origin + path without trailing slash + search). + */ +function normalizeUrl(url: string): string { + try { + const parsed = new URL(url); + return parsed.origin + parsed.pathname.replace(/\/$/, '') + parsed.search; + } catch { + return url; + } +} + +/** + * Return true if the URL scheme is safe for navigation. + * Blocks javascript:, data:, vbscript:, file:, etc. + */ +export function isAllowedScheme(url: string): boolean { + try { + const parsed = new URL(url); + return ALLOWED_SCHEMES.includes(parsed.protocol); + } catch { + // relative URLs or unparseable — allow (they'll resolve against current origin) + return true; + } +} + +/** + * Read the target URL from the current page's query string. + * + * URLSearchParams.get() already percent-decodes the value, so we must NOT + * call decodeURIComponent again (doing so would allow double-encoded + * payloads like `javascript%253Aalert(1)` to bypass scheme checks). + */ +export function getTargetUrl(): string { + const params = new URLSearchParams(window.location.search); + const url = params.get('url') ?? ''; + return url.trim(); +} + +function getTrustedUrls(): string[] { + try { + const stored = localStorage.getItem(TRUSTED_URLS_KEY); + return stored ? JSON.parse(stored) : []; Review Comment: **Suggestion:** The `getTrustedUrls` function assumes `JSON.parse` returns an array, but if localStorage has been tampered with (e.g., contains a string or object instead of an array), calling `.some()` on the result will throw a TypeError. Add validation to ensure the parsed value is actually an array before returning it. [possible bug] <details> <summary><b>Severity Level:</b> Major ⚠️</summary> ```mdx - ❌ External link redirect page crashes preventing access to external URLs. - ⚠️ Users who tampered with localStorage (intentionally or via extension) blocked from email links. - ⚠️ Security feature becomes unusable for affected users. ``` </details> ```suggestion const parsed = stored ? JSON.parse(stored) : []; return Array.isArray(parsed) ? parsed : []; ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. User manually modifies localStorage or uses a browser extension that sets `superset_trusted_urls` key to a non-array value (e.g., string `"evil"` or object `{"foo":"bar"}`). 2. User receives an alert/report email containing an external link (e.g., `<a href="https://superset/redirect?url=https://external.com">`). 3. User clicks the link and navigates to `/redirect/` route (defined at `src/views/routes.tsx:199`), triggering the `RedirectWarning` component at `src/pages/RedirectWarning/index.tsx:98`. 4. Component initializes and calls `isUrlTrusted(targetUrl)` at line 106 via useMemo hook. 5. `isUrlTrusted()` at `utils.ts:82-85` calls `getTrustedUrls()` which returns the non-array value from localStorage. 6. Code attempts to call `.some()` on the returned value at `utils.ts:84`, causing: `TypeError: parsed.some is not a function`, crashing the redirect warning page. 7. User is unable to proceed to external URL or see the warning page, experiencing a blank/crashed page instead. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset-frontend/src/pages/RedirectWarning/utils.ts **Line:** 66:66 **Comment:** *Possible Bug: The `getTrustedUrls` function assumes `JSON.parse` returns an array, but if localStorage has been tampered with (e.g., contains a string or object instead of an array), calling `.some()` on the result will throw a TypeError. Add validation to ensure the parsed value is actually an array before returning it. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=43db453146f5ae4e6dc44616d08ebc8d58e17be5def0beae252fb02ced2273e4&reaction=like'>👍</a> | <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=43db453146f5ae4e6dc44616d08ebc8d58e17be5def0beae252fb02ced2273e4&reaction=dislike'>👎</a> ########## superset-frontend/src/pages/RedirectWarning/index.tsx: ########## @@ -0,0 +1,183 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import { useState, useMemo, useCallback } from 'react'; +import { t } from '@superset-ui/core'; +import { css, styled, useTheme } from '@apache-superset/core/ui'; +import { + Button, + Card, + Checkbox, + Flex, + Typography, +} from '@superset-ui/core/components'; +import { Icons } from '@superset-ui/core/components/Icons'; +import { + getTargetUrl, + isUrlTrusted, + trustUrl, + isAllowedScheme, +} from './utils'; + +const PageContainer = styled(Flex)` + ${({ theme }) => css` + height: calc(100vh - 64px); + background-color: ${theme.colorBgLayout}; + padding: ${theme.padding}px; + `} +`; + +const WarningCard = styled(Card)` + ${({ theme }) => css` + max-width: 520px; + width: 100%; + box-shadow: ${theme.boxShadowSecondary}; + `} +`; + +const WarningHeader = styled(Flex)` + ${({ theme }) => css` + padding: ${theme.paddingLG}px ${theme.paddingXL}px; + border-bottom: 1px solid ${theme.colorBorderSecondary}; + `} +`; + +const WarningBody = styled.div` + ${({ theme }) => css` + padding: ${theme.paddingXL}px; + `} +`; + +const UrlDisplay = styled(Flex)` + ${({ theme }) => css` + background-color: ${theme.colorFillQuaternary}; + border-radius: ${theme.borderRadiusSM}px; + padding: ${theme.paddingSM}px ${theme.padding}px; + margin-bottom: ${theme.margin}px; + `} +`; + +const UrlText = styled(Typography.Text)` + ${({ theme }) => css` + font-family: ${theme.fontFamilyCode}; + font-size: ${theme.fontSize}px; + word-break: break-all; + `} +`; + +const WarningFooter = styled(Flex)` + ${({ theme }) => css` + padding: ${theme.padding}px ${theme.paddingXL}px; + background-color: ${theme.colorFillAlter}; + border-top: 1px solid ${theme.colorBorderSecondary}; + `} +`; + +const WarningTitle = styled(Typography.Title)` + && { + margin: 0; + } +`; + +export default function RedirectWarning() { + const theme = useTheme(); + const [trustChecked, setTrustChecked] = useState(false); + + const targetUrl = useMemo(() => getTargetUrl(), []); + + // If already trusted, redirect immediately + const alreadyTrusted = useMemo( + () => Boolean(targetUrl && isUrlTrusted(targetUrl)), + [targetUrl], + ); + + if (alreadyTrusted && targetUrl) { + window.location.href = targetUrl; + } Review Comment: **Suggestion:** Redirecting by assigning `window.location.href` during the render phase is a React anti-pattern that causes side effects during render. This should be wrapped in a `useEffect` hook. Additionally, the automatic redirect for trusted URLs should validate the URL scheme using `isAllowedScheme` to prevent potential open-redirect vulnerabilities if localStorage is manipulated to contain malicious URLs (e.g., `javascript:` schemes). [possible bug] <details> <summary><b>Severity Level:</b> Critical 🚨</summary> ```mdx - ❌ React anti-pattern: side effects during render cause double execution in Strict Mode. - ❌ Security bypass: trusted URLs skip `isAllowedScheme` validation enabling javascript: XSS. - ⚠️ Inconsistent validation between auto-redirect and manual continue flow. ``` </details> ```suggestion // If already trusted, redirect immediately const alreadyTrusted = useMemo( () => Boolean(targetUrl && isUrlTrusted(targetUrl)), [targetUrl], ); useEffect(() => { if (alreadyTrusted && targetUrl && isAllowedScheme(targetUrl)) { window.location.href = targetUrl; } }, [alreadyTrusted, targetUrl]); ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. User receives alert/report email containing external link and clicks it, navigating to `/redirect?url=https://example.com` (endpoint implemented by this new page component). 2. If user previously trusted the domain via the "Trust this URL" checkbox, `isUrlTrusted(targetUrl)` at line 106 returns `true` from localStorage. 3. During component render phase (before useEffect), the condition `if (alreadyTrusted && targetUrl)` at line 110 evaluates to true, triggering immediate execution of `window.location.href = targetUrl` - a side effect during React's render phase. 4. If localStorage was manipulated to mark a malicious URL like `javascript:alert(document.cookie)` as trusted, the auto-redirect executes without `isAllowedScheme` validation (unlike the manual "Continue" button which validates at line 115), leading to XSS vulnerability. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset-frontend/src/pages/RedirectWarning/index.tsx **Line:** 104:112 **Comment:** *Possible Bug: Redirecting by assigning `window.location.href` during the render phase is a React anti-pattern that causes side effects during render. This should be wrapped in a `useEffect` hook. Additionally, the automatic redirect for trusted URLs should validate the URL scheme using `isAllowedScheme` to prevent potential open-redirect vulnerabilities if localStorage is manipulated to contain malicious URLs (e.g., `javascript:` schemes). Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=8042bac7af4bafc5241885cb49a4c3e9cb2473c6c425ed02ea4a60a314c900ef&reaction=like'>👍</a> | <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=8042bac7af4bafc5241885cb49a4c3e9cb2473c6c425ed02ea4a60a314c900ef&reaction=dislike'>👎</a> ########## superset/utils/link_redirect.py: ########## @@ -0,0 +1,149 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +""" +Utilities for processing links in alert/report emails. + +External links are rewritten to go through a redirect warning page so that +recipients see a confirmation before navigating to an external site. +""" + +import logging +import re +from urllib.parse import quote, urlparse + +from flask import current_app + +logger = logging.getLogger(__name__) + +# Matches href="..." in anchor tags (both single and double quotes) +_HREF_RE = re.compile( + r"""(<a\s[^>]*?href\s*=\s*)(["'])(.*?)\2""", + re.IGNORECASE | re.DOTALL, +) + + +def _get_base_hosts() -> set[str]: + """Return the set of hosts that are considered internal (lower-cased).""" + hosts: set[str] = set() + for key in ("WEBDRIVER_BASEURL_USER_FRIENDLY", "WEBDRIVER_BASEURL"): + url = current_app.config.get(key, "") + if url: + parsed = urlparse(url) + if parsed.scheme and parsed.netloc: + hosts.add(parsed.netloc.lower()) + return hosts + + +def _get_redirect_base() -> str: + """Return the base URL used to build redirect links.""" + for key in ("WEBDRIVER_BASEURL_USER_FRIENDLY", "WEBDRIVER_BASEURL"): + url = current_app.config.get(key, "") + if url: + return url.rstrip("/") + return "" + + +def _is_external(href: str, base_hosts: set[str]) -> bool: + """Return True if *href* points to an external host.""" + parsed = urlparse(href) + # Only rewrite http(s) links with a host that differs from ours + if parsed.scheme not in ("http", "https"): + return False + return bool(parsed.netloc) and parsed.netloc.lower() not in base_hosts + + +def _replace_href( + match: re.Match[str], + base_hosts: set[str], + redirect_base: str, +) -> str: + """Regex replacer: rewrite external hrefs to go through the redirect page.""" + prefix, quote_char, href = match.group(1), match.group(2), match.group(3) + href = href.strip() + + # Don't double-redirect + if "/redirect/" in href: + return match.group(0) + + if not _is_external(href, base_hosts): + return match.group(0) + + redirect_url = f"{redirect_base}/redirect/?url={quote(href, safe='')}" + return f"{prefix}{quote_char}{redirect_url}{quote_char}" + + +def process_html_links(html_content: str) -> str: + """ + Rewrite external links in *html_content* to go through the redirect page. + + Internal links (matching the configured base URL hosts) are left untouched. + """ + if not html_content or not html_content.strip(): + return html_content + + if not current_app.config.get("ALERT_REPORTS_ENABLE_LINK_REDIRECT", True): + return html_content + + base_hosts = _get_base_hosts() + if not base_hosts: + logger.warning("No base URL configured, skipping link redirect processing") + return html_content + + redirect_base = _get_redirect_base() + if not redirect_base: + return html_content + + try: + return _HREF_RE.sub( + lambda m: _replace_href(m, base_hosts, redirect_base), + html_content, + ) + except Exception: + logger.warning("Failed to process HTML links", exc_info=True) + return html_content Review Comment: **Suggestion:** The process_html_links() function has fail-open behavior on exceptions - when the regex substitution raises an exception, it logs a warning and returns the original html_content unchanged. For a security feature designed to protect users from malicious external links, fail-open behavior is dangerous because a parsing error or edge case in the regex could allow malicious links to pass through unmodified. A security control should fail-closed. The function should return an empty string or sanitized content when processing fails, not the potentially malicious original content. Alternatively, re-raise the exception after logging to ensure the error is not silently ignored. [security] <details> <summary><b>Severity Level:</b> Critical 🚨</summary> ```mdx - ❌ Malicious links bypass redirect warning on regex failures. - ❌ Email security control fails open exposing users to phishing. - ⚠️ Alert/report notification system compromised on edge case inputs. ``` </details> ```suggestion try: return _HREF_RE.sub( lambda m: _replace_href(m, base_hosts, redirect_base), html_content, ) except Exception: logger.exception("Failed to process HTML links") # Fail closed: return empty content instead of potentially malicious HTML return "" ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Alerts/reports system calls `process_html_links()` at `superset/utils/link_redirect.py:90` with HTML content containing malicious links. 2. Regex substitution at line 112-114 encounters edge case input (e.g., malformed HTML with nested quotes, extremely long href values triggering recursion limits, or exotic Unicode causing regex backtracking issues). 3. Exception occurs during `_HREF_RE.sub()` execution, triggering the bare `except Exception:` handler at line 116. 4. Function returns original `html_content` at line 118 with malicious links intact, bypassing the security redirect mechanism. 5. Email containing unmodified malicious links is sent to users without any warning or interstitial page protection. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/utils/link_redirect.py **Line:** 111:118 **Comment:** *Security: The process_html_links() function has fail-open behavior on exceptions - when the regex substitution raises an exception, it logs a warning and returns the original html_content unchanged. For a security feature designed to protect users from malicious external links, fail-open behavior is dangerous because a parsing error or edge case in the regex could allow malicious links to pass through unmodified. A security control should fail-closed. The function should return an empty string or sanitized content when processing fails, not the potentially malicious original content. Alternatively, re-raise the exception after logging to ensure the error is not silently ignored. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=a5f8d070e6e50c0ed19e39bf5778a4254af2b45077bd0ced7bfcc12652e73b64&reaction=like'>👍</a> | <a href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F35021&comment_hash=a5f8d070e6e50c0ed19e39bf5778a4254af2b45077bd0ced7bfcc12652e73b64&reaction=dislike'>👎</a> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
