codeant-ai-for-open-source[bot] commented on code in PR #41301:
URL: https://github.com/apache/superset/pull/41301#discussion_r3456329783


##########
superset/utils/version.py:
##########
@@ -57,6 +57,21 @@ def get_version_metadata() -> dict[str, Any]:
     return metadata
 
 
+def visible_version_metadata(
+    metadata: dict[str, Any], expose_build_details: bool
+) -> dict[str, Any]:
+    """Return version metadata for user-facing surfaces.
+
+    The release ``version_string`` is always included. Precise build details
+    (the git SHA and build number), which let a viewer map the deployment to a
+    specific commit/build, are blanked out unless ``expose_build_details`` is
+    True (e.g. the viewer is an admin or the deployment opted in).
+    """
+    if expose_build_details:
+        return metadata
+    return {**metadata, "version_sha": "", "build_number": None}

Review Comment:
   **Suggestion:** The redaction helper does not remove `full_sha`, even though 
that field is an exact git commit identifier and is part of the metadata 
produced upstream. With `expose_build_details=False`, callers that serialize 
this returned dict can still leak precise build identity. Redact/remove 
`full_sha` together with the short SHA/build number. [security]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Redaction helper leaves full_sha populated when hiding details.
   - ⚠️ Future serializers may expose exact commit SHA to users.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Configure Superset with `EXPOSE_BUILD_DETAILS_TO_USERS = False` in
   `superset/config.py:54-61` and run the server in an environment where 
`GITHUB_SHA` is set
   (e.g. CI-built container), so `get_version_metadata()` will populate 
`full_sha`.
   
   2. Trigger any code path that calls `menu_data(user)`; for example, a SPA 
page render via
   `SupersetModelView.render_app_template()` in 
`superset/views/base.py:670-672`, which calls
   `get_spa_template_context()` (`base.py:583-587`), then 
`common_bootstrap_payload()`
   (`base.py:38-42`), then `cached_common_bootstrap_data()` 
(`base.py:452-456`), which builds
   `"menu_data": menu_data(g.user)` at `base.py:528-528`.
   
   3. Inside `menu_data()` (`superset/views/base.py:273-291`), 
`version_metadata =
   visible_version_metadata(get_version_metadata(), expose_build_details)` is 
executed
   (`base.py:289-291`). `get_version_metadata()` from 
`superset/utils/version.py:29-57` reads
   `GITHUB_SHA` and sets `metadata["full_sha"] = github_sha` at `version.py:44`.
   
   4. Because `expose_build_details` is False for a non-admin user,
   `visible_version_metadata()` in `superset/utils/version.py:60-72` returns 
`{**metadata,
   "version_sha": "", "build_number": None}` (`version.py:72`), leaving
   `metadata["full_sha"]` untouched. Inspecting `version_metadata` in a 
debugger or via
   temporary logging confirms that `version_metadata["full_sha"]` still 
contains the exact
   commit SHA even though build details are supposed to be hidden, so any 
caller that later
   serializes the whole dict (instead of selecting keys as `menu_data` 
currently does) would
   leak the full commit identifier.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=f12fcc81bf544eefa29c23a14ddebbb7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=f12fcc81bf544eefa29c23a14ddebbb7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/utils/version.py
   **Line:** 72:72
   **Comment:**
        *Security: The redaction helper does not remove `full_sha`, even though 
that field is an exact git commit identifier and is part of the metadata 
produced upstream. With `expose_build_details=False`, callers that serialize 
this returned dict can still leak precise build identity. Redact/remove 
`full_sha` together with the short SHA/build number.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=ece426df16a4ec30478d38dbbe2f28f34dd391088ad9f06c75d6829330edf478&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=ece426df16a4ec30478d38dbbe2f28f34dd391088ad9f06c75d6829330edf478&reaction=dislike'>👎</a>



##########
superset/views/base.py:
##########
@@ -279,8 +279,16 @@ def menu_data(user: User) -> dict[str, Any]:
     if callable(brand_text := app.config["LOGO_RIGHT_TEXT"]):
         brand_text = brand_text()
 
-    # Get centralized version metadata
-    version_metadata = get_version_metadata()
+    # Get centralized version metadata. Precise build details (git SHA and
+    # build number) let a viewer map the deployment to a specific commit/build,
+    # so expose them only to admins unless the deployment opts in via
+    # EXPOSE_BUILD_DETAILS_TO_USERS. The release version string is always 
shown.
+    expose_build_details = (
+        app.config["EXPOSE_BUILD_DETAILS_TO_USERS"] or 
security_manager.is_admin()
+    )
+    version_metadata = visible_version_metadata(
+        get_version_metadata(), expose_build_details
+    )

Review Comment:
   **Suggestion:** `menu_data` now calls `get_version_metadata()` on the 
bootstrap path, and that helper can run `git rev-parse` subprocess calls when 
CI env vars are not present. Since this code is reached for normal page 
bootstrap traffic, it introduces avoidable per-request process-spawn overhead 
(memoized only for 60s per user/locale). Use already-loaded config values for 
navbar version fields or memoize version metadata independently of request/user 
context. [performance]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ spa.html bootstrap path runs git subprocess on cache misses.
   - ⚠️ Superset UI requests incur extra CPU and process overhead.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Run Superset in an environment where `GITHUB_HEAD_REF` and 
`GITHUB_REF_NAME` are unset
   (typical non-CI deployment) so `get_version_metadata()` must fall back to
   `_get_local_branch()`; confirm no branch env vars are set.
   
   2. From a browser, load any SPA page rendered via
   `SupersetModelView.render_app_template()` in 
`superset/views/base.py:670-672`, which calls
   `get_spa_template_context()` (`base.py:583-587`), then 
`common_bootstrap_payload()`
   (`base.py:38-42`), which in turn calls 
`cached_common_bootstrap_data(utils.get_user_id(),
   locale_str)` at `base.py:42`.
   
   3. On a cache miss for `cached_common_bootstrap_data` (per
   `@cache_manager.cache.memoize(timeout=60)` at 
`superset/views/base.py:452-456`), the
   function constructs the bootstrap dict including `"menu_data": 
menu_data(g.user)` at
   `base.py:528-528`. Inside `menu_data()` (`base.py:273-291`), it computes 
`version_metadata
   = visible_version_metadata(get_version_metadata(), expose_build_details)` at
   `base.py:289-291`.
   
   4. `get_version_metadata()` (`superset/utils/version.py:29-57`) calls
   `_get_local_branch()` at `version.py:48-53`, which executes
   `subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"], ...)` 
in
   `_get_local_branch()` (`version.py:106-115`). Profiling or adding temporary 
logging around
   `_get_local_branch()` shows a `git rev-parse` subprocess being spawned on 
each cache miss
   (i.e., at least once every 60 seconds per user/locale) along this common 
bootstrap path.
   ```
   </details>
   
   [![Fix in 
Cursor](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-cursor-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=a8f55ed0a07843898d7fa72ff7281a5d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 [![Fix in VSCode 
Claude](https://new-codeant-butcket.s3.us-west-1.amazonaws.com/badges/fix-in-vscode-claude-flat.svg)](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=a8f55ed0a07843898d7fa72ff7281a5d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/views/base.py
   **Line:** 289:291
   **Comment:**
        *Performance: `menu_data` now calls `get_version_metadata()` on the 
bootstrap path, and that helper can run `git rev-parse` subprocess calls when 
CI env vars are not present. Since this code is reached for normal page 
bootstrap traffic, it introduces avoidable per-request process-spawn overhead 
(memoized only for 60s per user/locale). Use already-loaded config values for 
navbar version fields or memoize version metadata independently of request/user 
context.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=a83ef2440ef92f63bd0aac9760d9e7be0aae60004d29a2db3e9818858ae687ae&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=a83ef2440ef92f63bd0aac9760d9e7be0aae60004d29a2db3e9818858ae687ae&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to