codeant-ai-for-open-source[bot] commented on code in PR #39301:
URL: https://github.com/apache/superset/pull/39301#discussion_r3408323739
##########
superset/db_engine_specs/impala.py:
##########
@@ -209,8 +210,25 @@ def cancel_query(cls, cursor: Any, query: Query,
cancel_query_id: str) -> bool:
"""
try:
impala_host = query.database.url_object.host
+ # The cancel call issues an outbound HTTP request from the
+ # Superset backend to whatever host the DB connection was
+ # configured with; validate it before the call to keep this
+ # path consistent with the dataset-import and webhook URL
+ # checks. Operators with internal Impala targets can opt out
+ # via IMPALA_CANCEL_QUERY_ALLOW_INTERNAL_HOSTS.
+ if not impala_host:
+ return False
+ if not app.config[
+ "IMPALA_CANCEL_QUERY_ALLOW_INTERNAL_HOSTS"
+ ] and not is_safe_host(impala_host):
+ logger.warning(
+ "Impala cancel_query refused: target host is not allowed"
+ )
+ return False
url =
f"http://{impala_host}:25000/cancel_query?query_id={cancel_query_id}"
- response = requests.post(url, timeout=3)
+ # Do not follow redirects: a validated host could otherwise 30x the
+ # request to an internal target, bypassing the is_safe_host check.
Review Comment:
**Suggestion:** This host validation is vulnerable to DNS rebinding because
`is_safe_host()` resolves DNS first, but `requests.post()` performs a fresh DNS
resolution when sending the request. An attacker controlling DNS for a hostname
can return a public IP during validation and then switch to an internal IP
before the HTTP call, bypassing the SSRF guard. Resolve once and pin the
outbound connection to the validated IP (or use a transport that enforces
resolved-address pinning) instead of validating and then reconnecting by
hostname. [ssrf]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Impala cancel_query SSRF guard bypassable via DNS rebinding.
- ⚠️ Backend can issue HTTP requests to internal network hosts.
- ⚠️ Private services exposed whenever Impala hostname uses attacker DNS.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. A client calls the queries API stop endpoint (`@expose("/stop",
methods=("POST",))`)
implemented in `superset/queries/api.py:232-271`, which deserializes the
JSON body and
invokes `QueryDAO.stop_query(body["client_id"])` at
`superset/queries/api.py:10-13`.
2. `QueryDAO.stop_query` (in `superset/daos/query.py:60-24`) loads the
`Query` row for the
given `client_id` and, if the query is still running, calls
`sql_lab.cancel_query(query)`
at `superset/daos/query.py:20-21`.
3. `sql_lab.cancel_query` (in `superset/sql_lab.py:5-39`) checks for an
existing cancel
key in `query.extra` and then creates a new SQLAlchemy engine and cursor with
`query.database.get_sqla_engine(...)`, finally delegating to
`query.database.db_engine_spec.cancel_query(cursor, query, cancel_query_id)`
at
`superset/sql_lab.py:32-39`. For an Impala database, this calls
`ImpalaEngineSpec.cancel_query` implemented in
`superset/db_engine_specs/impala.py:23-56`.
4. Inside `ImpalaEngineSpec.cancel_query`
(`superset/db_engine_specs/impala.py:33-48`),
the code reads `impala_host = query.database.url_object.host` and applies
the host check:
`if not app.config["IMPALA_CANCEL_QUERY_ALLOW_INTERNAL_HOSTS"] and not
is_safe_host(impala_host): ...`. `is_safe_host` (defined in
`superset/utils/network.py:8-34`) calls `socket.getaddrinfo(host, None)` at
line 18 and
verifies each returned IP is globally routable; if an attacker-controlled
DNS server for
`impala_host` returns a public IP at this point, `is_safe_host` returns
`True` and the
guard passes.
5. After validation, `cancel_query` constructs `url =
f"http://{impala_host}:25000/cancel_query?query_id={cancel_query_id}"` and
issues
`requests.post(url, timeout=3, allow_redirects=False)` at
`superset/db_engine_specs/impala.py:49-52`. The `requests` library performs
its own DNS
resolution for `impala_host` when opening the HTTP connection rather than
reusing the
earlier `socket.getaddrinfo` result. An attacker who controls DNS for
`impala_host` can
perform DNS rebinding: first returning a public IP during `is_safe_host` so
the check
passes, then quickly changing the DNS record so that the subsequent
resolution performed
by `requests.post` points to a private or loopback address. This causes the
Superset
backend to send the HTTP cancel request to an internal host despite the SSRF
guard,
because validation and connection use separate, unpinned DNS lookups and
`allow_redirects=False` only blocks redirect-based bypasses, not this
rebinding window.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=889b60e7814241a08d48526f4cead6f8&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=889b60e7814241a08d48526f4cead6f8&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/db_engine_specs/impala.py
**Line:** 221:230
**Comment:**
*Ssrf: This host validation is vulnerable to DNS rebinding because
`is_safe_host()` resolves DNS first, but `requests.post()` performs a fresh DNS
resolution when sending the request. An attacker controlling DNS for a hostname
can return a public IP during validation and then switch to an internal IP
before the HTTP call, bypassing the SSRF guard. Resolve once and pin the
outbound connection to the validated IP (or use a transport that enforces
resolved-address pinning) instead of validating and then reconnecting by
hostname.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39301&comment_hash=ccb9a5ba4b573dc5904f5cdc2bf41f4cfdf87c17421a24d26715ea4203efe371&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F39301&comment_hash=ccb9a5ba4b573dc5904f5cdc2bf41f4cfdf87c17421a24d26715ea4203efe371&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]