codeant-ai-for-open-source[bot] commented on code in PR #40671:
URL: https://github.com/apache/superset/pull/40671#discussion_r3338643152


##########
superset/security/guest_token.py:
##########
@@ -14,13 +14,72 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import logging
 from typing import Optional, TypedDict, Union
 
 from flask_appbuilder.security.sqla.models import Group, Role
 from flask_login import AnonymousUserMixin
 
 from superset.utils.backports import StrEnum
 
+logger = logging.getLogger(__name__)
+
+# JWT claim that carries the revocation version a token was minted with.
+GUEST_TOKEN_REVOCATION_CLAIM = "rev"  # noqa: S105
+
+# Tokens minted before the revocation feature existed carry no version claim.
+# They are treated as version 0, which is only rejected once an admin has
+# explicitly bumped the expected version above 0.
+DEFAULT_GUEST_TOKEN_REVOCATION_VERSION = 0
+
+
+def get_current_guest_token_revocation_version() -> int:
+    """
+    Return the minimum guest-token revocation version accepted at validation 
time.
+
+    The value is stored in the metadata database via the ``key_value`` store 
so it
+    can be bumped at runtime (e.g. via the ``revoke-guest-tokens`` CLI command)
+    without a code deploy. A missing entry means revocation has never been
+    triggered, so the default version is returned.
+    """
+    # pylint: disable=import-outside-toplevel
+    from superset.key_value.shared_entries import get_shared_value
+    from superset.key_value.types import SharedKey
+
+    try:
+        value = get_shared_value(SharedKey.GUEST_TOKEN_REVOCATION_VERSION)
+    except Exception:  # pylint: disable=broad-except
+        # Never let a metadata-store hiccup hard-fail token validation; fall 
back
+        # to the default version (fail-open to the pre-feature behaviour).
+        logger.warning(
+            "Could not read guest token revocation version", exc_info=True
+        )
+        return DEFAULT_GUEST_TOKEN_REVOCATION_VERSION
+    if value is None:
+        return DEFAULT_GUEST_TOKEN_REVOCATION_VERSION
+    try:
+        return int(value)
+    except (TypeError, ValueError):
+        logger.warning("Invalid guest token revocation version stored: %r", 
value)
+        return DEFAULT_GUEST_TOKEN_REVOCATION_VERSION
+
+
+def bump_guest_token_revocation_version() -> int:
+    """
+    Increment and persist the guest-token revocation version, revoking every
+    outstanding guest token minted with a lower version.
+
+    :return: the new revocation version
+    """
+    # pylint: disable=import-outside-toplevel
+    from superset.key_value.shared_entries import upsert_shared_value
+    from superset.key_value.types import SharedKey
+
+    new_version = get_current_guest_token_revocation_version() + 1
+    upsert_shared_value(SharedKey.GUEST_TOKEN_REVOCATION_VERSION, new_version)

Review Comment:
   **Suggestion:** This increment is a read-then-write sequence and is not 
atomic, so concurrent revocation calls can overwrite each other (lost update). 
If two processes bump at the same time, both can compute the same 
`new_version`, which can fail to revoke tokens minted between the two 
operations. Make the bump atomic in storage (single DB-side increment/locking 
compare-and-swap) so each call always advances the version uniquely. [race 
condition]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Guest token revocation CLI may lose concurrent increments.
   - ⚠️ Some embedded dashboard viewers remain authorized after expected 
revocation.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Enable embedded guest tokens so they are used in requests: in
   `superset/security/manager.py` lines 9-15 (snippet starting at 650) the 
`request_loader`
   method routes requests to `get_guest_user_from_request` when the 
`EMBEDDED_SUPERSET`
   feature flag is enabled.
   
   2. An administrator runs the `revoke_guest_tokens` CLI command defined in
   `superset/cli/guest_token.py` lines 26-52; this command calls
   `bump_guest_token_revocation_version()` from 
`superset/security/guest_token.py` line 67.
   
   3. Inspect `bump_guest_token_revocation_version` in 
`superset/security/guest_token.py`
   lines 67-81: it reads the current version via
   `get_current_guest_token_revocation_version()` (line 78), adds 1, and then 
writes this
   `new_version` via 
`upsert_shared_value(SharedKey.GUEST_TOKEN_REVOCATION_VERSION,
   new_version)` (line 79), i.e. a read-then-write increment.
   
   4. In a multi-worker deployment, trigger two `revoke_guest_tokens` CLI 
invocations
   concurrently (for example from two terminals); both processes execute the 
same sequence:
   they call `get_current_guest_token_revocation_version()` (which reads the 
stored value via
   `get_shared_value` in `superset/key_value/shared_entries.py` lines 38-73), 
compute the
   same `new_version`, and call `upsert_shared_value` (lines 89-99) which 
simply upserts the
   provided constant value. Because there is no atomic DB-side increment or 
compare-and-swap,
   the final stored version reflects only one increment instead of two, so one 
revocation
   bump is lost and tokens minted between the two intended revocations (with 
the intermediate
   version) remain valid even though two revocation operations were executed.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=344c7ef916dd4ec4ba0de54f2d5651ab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=344c7ef916dd4ec4ba0de54f2d5651ab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/security/guest_token.py
   **Line:** 78:79
   **Comment:**
        *Race Condition: This increment is a read-then-write sequence and is 
not atomic, so concurrent revocation calls can overwrite each other (lost 
update). If two processes bump at the same time, both can compute the same 
`new_version`, which can fail to revoke tokens minted between the two 
operations. Make the bump atomic in storage (single DB-side increment/locking 
compare-and-swap) so each call always advances the version uniquely.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40671&comment_hash=a72706d996fde5113f3836de131f1b7703b0f0e53413ea7a91b0c56454d5967e&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40671&comment_hash=a72706d996fde5113f3836de131f1b7703b0f0e53413ea7a91b0c56454d5967e&reaction=dislike'>👎</a>



##########
superset/security/guest_token.py:
##########
@@ -14,13 +14,72 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import logging
 from typing import Optional, TypedDict, Union
 
 from flask_appbuilder.security.sqla.models import Group, Role
 from flask_login import AnonymousUserMixin
 
 from superset.utils.backports import StrEnum
 
+logger = logging.getLogger(__name__)
+
+# JWT claim that carries the revocation version a token was minted with.
+GUEST_TOKEN_REVOCATION_CLAIM = "rev"  # noqa: S105
+
+# Tokens minted before the revocation feature existed carry no version claim.
+# They are treated as version 0, which is only rejected once an admin has
+# explicitly bumped the expected version above 0.
+DEFAULT_GUEST_TOKEN_REVOCATION_VERSION = 0
+
+
+def get_current_guest_token_revocation_version() -> int:
+    """
+    Return the minimum guest-token revocation version accepted at validation 
time.
+
+    The value is stored in the metadata database via the ``key_value`` store 
so it
+    can be bumped at runtime (e.g. via the ``revoke-guest-tokens`` CLI command)
+    without a code deploy. A missing entry means revocation has never been
+    triggered, so the default version is returned.
+    """
+    # pylint: disable=import-outside-toplevel
+    from superset.key_value.shared_entries import get_shared_value
+    from superset.key_value.types import SharedKey
+
+    try:
+        value = get_shared_value(SharedKey.GUEST_TOKEN_REVOCATION_VERSION)
+    except Exception:  # pylint: disable=broad-except
+        # Never let a metadata-store hiccup hard-fail token validation; fall 
back
+        # to the default version (fail-open to the pre-feature behaviour).
+        logger.warning(
+            "Could not read guest token revocation version", exc_info=True
+        )
+        return DEFAULT_GUEST_TOKEN_REVOCATION_VERSION

Review Comment:
   **Suggestion:** Falling back to version `0` on metadata read errors makes 
revocation fail open: when revocation is enabled and the store is temporarily 
unavailable, previously revoked tokens are treated as valid again. This 
undermines the security contract of runtime revocation; on read failures, 
validation should fail closed (or use a safe cached last-known value) instead 
of resetting to default. [security]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Previously revoked guest tokens accepted during metadata store outage.
   - ⚠️ Revocation feature's security guarantees weakened under infrastructure 
failures.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure guest token revocation and embedding: set 
`GUEST_TOKEN_REVOCATION_ENABLED =
   True` so `_get_guest_token_revocation_version` in 
`superset/security/manager.py` lines
   38-48 reads from `get_current_guest_token_revocation_version`, and enable
   `EMBEDDED_SUPERSET` so `request_loader` (same file, lines 9-15 in the 650+ 
snippet) routes
   incoming requests to `get_guest_user_from_request`.
   
   2. Mint and then revoke guest tokens: guest tokens are created by
   `create_guest_access_token` in `superset/security/manager.py` lines 9-36 
(3440+ snippet),
   which stamps `GUEST_TOKEN_REVOCATION_CLAIM` with 
`_get_guest_token_revocation_version()`.
   Running the `revoke_guest_tokens` CLI command in 
`superset/cli/guest_token.py` lines 26-52
   calls `bump_guest_token_revocation_version` (in 
`superset/security/guest_token.py` lines
   67-81), incrementing and persisting the version via `upsert_shared_value`, 
so older tokens
   with claim `rev=0` are now considered revoked by `_is_guest_token_revoked` 
(lines 86-106).
   
   3. Induce a metadata-store read failure: the revocation check 
`_is_guest_token_revoked`
   calls `get_current_guest_token_revocation_version()` (imported from
   `superset/security/guest_token.py` line 36), which in turn calls
   `get_shared_value(SharedKey.GUEST_TOKEN_REVOCATION_VERSION)` (lines 49-50). 
If the
   metadata store is unavailable (for example, `KeyValueDAO.get_value` raises 
due to a
   database outage), `get_shared_value` propagates an exception that is caught 
by
   `get_current_guest_token_revocation_version`'s broad `except Exception` 
block at lines
   49-57.
   
   4. Observe fail-open behavior during the outage: on the exception path,
   `get_current_guest_token_revocation_version` logs a warning and returns
   `DEFAULT_GUEST_TOKEN_REVOCATION_VERSION` (0) at line 57. While
   `GUEST_TOKEN_REVOCATION_ENABLED` remains `True`, `_is_guest_token_revoked` 
(lines 97-106
   in `superset/security/manager.py`) now compares the token's `rev` (for 
example 0 for
   legacy tokens) against this fallback 0 and returns `False` because 
`token_version < 0` is
   false. As a result, previously revoked tokens (those that were below the 
true stored
   version before the outage) are treated as not revoked and accepted in
   `get_guest_user_from_request` (lines 65-76), allowing embedded guest access 
to resume for
   tokens that should remain revoked whenever the metadata store cannot be read.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=944ae30ae6554ec0b836e3155b76b1a4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=944ae30ae6554ec0b836e3155b76b1a4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/security/guest_token.py
   **Line:** 49:57
   **Comment:**
        *Security: Falling back to version `0` on metadata read errors makes 
revocation fail open: when revocation is enabled and the store is temporarily 
unavailable, previously revoked tokens are treated as valid again. This 
undermines the security contract of runtime revocation; on read failures, 
validation should fail closed (or use a safe cached last-known value) instead 
of resetting to default.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40671&comment_hash=353bd95aa788ccb69c3c2eec899a0e5b908a463268a571a0241f17006f982c75&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40671&comment_hash=353bd95aa788ccb69c3c2eec899a0e5b908a463268a571a0241f17006f982c75&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to