codeant-ai-for-open-source[bot] commented on code in PR #41301:
URL: https://github.com/apache/superset/pull/41301#discussion_r3456329783
##########
superset/utils/version.py:
##########
@@ -57,6 +57,21 @@ def get_version_metadata() -> dict[str, Any]:
return metadata
+def visible_version_metadata(
+ metadata: dict[str, Any], expose_build_details: bool
+) -> dict[str, Any]:
+ """Return version metadata for user-facing surfaces.
+
+ The release ``version_string`` is always included. Precise build details
+ (the git SHA and build number), which let a viewer map the deployment to a
+ specific commit/build, are blanked out unless ``expose_build_details`` is
+ True (e.g. the viewer is an admin or the deployment opted in).
+ """
+ if expose_build_details:
+ return metadata
+ return {**metadata, "version_sha": "", "build_number": None}
Review Comment:
**Suggestion:** The redaction helper does not remove `full_sha`, even though
that field is an exact git commit identifier and is part of the metadata
produced upstream. With `expose_build_details=False`, callers that serialize
this returned dict can still leak precise build identity. Redact/remove
`full_sha` together with the short SHA/build number. [security]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Redaction helper leaves full_sha populated when hiding details.
- ⚠️ Future serializers may expose exact commit SHA to users.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Configure Superset with `EXPOSE_BUILD_DETAILS_TO_USERS = False` in
`superset/config.py:54-61` and run the server in an environment where
`GITHUB_SHA` is set
(e.g. CI-built container), so `get_version_metadata()` will populate
`full_sha`.
2. Trigger any code path that calls `menu_data(user)`; for example, a SPA
page render via
`SupersetModelView.render_app_template()` in
`superset/views/base.py:670-672`, which calls
`get_spa_template_context()` (`base.py:583-587`), then
`common_bootstrap_payload()`
(`base.py:38-42`), then `cached_common_bootstrap_data()`
(`base.py:452-456`), which builds
`"menu_data": menu_data(g.user)` at `base.py:528-528`.
3. Inside `menu_data()` (`superset/views/base.py:273-291`),
`version_metadata =
visible_version_metadata(get_version_metadata(), expose_build_details)` is
executed
(`base.py:289-291`). `get_version_metadata()` from
`superset/utils/version.py:29-57` reads
`GITHUB_SHA` and sets `metadata["full_sha"] = github_sha` at `version.py:44`.
4. Because `expose_build_details` is False for a non-admin user,
`visible_version_metadata()` in `superset/utils/version.py:60-72` returns
`{**metadata,
"version_sha": "", "build_number": None}` (`version.py:72`), leaving
`metadata["full_sha"]` untouched. Inspecting `version_metadata` in a
debugger or via
temporary logging confirms that `version_metadata["full_sha"]` still
contains the exact
commit SHA even though build details are supposed to be hidden, so any
caller that later
serializes the whole dict (instead of selecting keys as `menu_data`
currently does) would
leak the full commit identifier.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=f12fcc81bf544eefa29c23a14ddebbb7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=f12fcc81bf544eefa29c23a14ddebbb7&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/version.py
**Line:** 72:72
**Comment:**
*Security: The redaction helper does not remove `full_sha`, even though
that field is an exact git commit identifier and is part of the metadata
produced upstream. With `expose_build_details=False`, callers that serialize
this returned dict can still leak precise build identity. Redact/remove
`full_sha` together with the short SHA/build number.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=ece426df16a4ec30478d38dbbe2f28f34dd391088ad9f06c75d6829330edf478&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=ece426df16a4ec30478d38dbbe2f28f34dd391088ad9f06c75d6829330edf478&reaction=dislike'>👎</a>
##########
superset/views/base.py:
##########
@@ -279,8 +279,16 @@ def menu_data(user: User) -> dict[str, Any]:
if callable(brand_text := app.config["LOGO_RIGHT_TEXT"]):
brand_text = brand_text()
- # Get centralized version metadata
- version_metadata = get_version_metadata()
+ # Get centralized version metadata. Precise build details (git SHA and
+ # build number) let a viewer map the deployment to a specific commit/build,
+ # so expose them only to admins unless the deployment opts in via
+ # EXPOSE_BUILD_DETAILS_TO_USERS. The release version string is always
shown.
+ expose_build_details = (
+ app.config["EXPOSE_BUILD_DETAILS_TO_USERS"] or
security_manager.is_admin()
+ )
+ version_metadata = visible_version_metadata(
+ get_version_metadata(), expose_build_details
+ )
Review Comment:
**Suggestion:** `menu_data` now calls `get_version_metadata()` on the
bootstrap path, and that helper can run `git rev-parse` subprocess calls when
CI env vars are not present. Since this code is reached for normal page
bootstrap traffic, it introduces avoidable per-request process-spawn overhead
(memoized only for 60s per user/locale). Use already-loaded config values for
navbar version fields or memoize version metadata independently of request/user
context. [performance]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ spa.html bootstrap path runs git subprocess on cache misses.
- ⚠️ Superset UI requests incur extra CPU and process overhead.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Run Superset in an environment where `GITHUB_HEAD_REF` and
`GITHUB_REF_NAME` are unset
(typical non-CI deployment) so `get_version_metadata()` must fall back to
`_get_local_branch()`; confirm no branch env vars are set.
2. From a browser, load any SPA page rendered via
`SupersetModelView.render_app_template()` in
`superset/views/base.py:670-672`, which calls
`get_spa_template_context()` (`base.py:583-587`), then
`common_bootstrap_payload()`
(`base.py:38-42`), which in turn calls
`cached_common_bootstrap_data(utils.get_user_id(),
locale_str)` at `base.py:42`.
3. On a cache miss for `cached_common_bootstrap_data` (per
`@cache_manager.cache.memoize(timeout=60)` at
`superset/views/base.py:452-456`), the
function constructs the bootstrap dict including `"menu_data":
menu_data(g.user)` at
`base.py:528-528`. Inside `menu_data()` (`base.py:273-291`), it computes
`version_metadata
= visible_version_metadata(get_version_metadata(), expose_build_details)` at
`base.py:289-291`.
4. `get_version_metadata()` (`superset/utils/version.py:29-57`) calls
`_get_local_branch()` at `version.py:48-53`, which executes
`subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"], ...)`
in
`_get_local_branch()` (`version.py:106-115`). Profiling or adding temporary
logging around
`_get_local_branch()` shows a `git rev-parse` subprocess being spawned on
each cache miss
(i.e., at least once every 60 seconds per user/locale) along this common
bootstrap path.
```
</details>
[](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=a8f55ed0a07843898d7fa72ff7281a5d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
[](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=a8f55ed0a07843898d7fa72ff7281a5d&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/views/base.py
**Line:** 289:291
**Comment:**
*Performance: `menu_data` now calls `get_version_metadata()` on the
bootstrap path, and that helper can run `git rev-parse` subprocess calls when
CI env vars are not present. Since this code is reached for normal page
bootstrap traffic, it introduces avoidable per-request process-spawn overhead
(memoized only for 60s per user/locale). Use already-loaded config values for
navbar version fields or memoize version metadata independently of request/user
context.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=a83ef2440ef92f63bd0aac9760d9e7be0aae60004d29a2db3e9818858ae687ae&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F41301&comment_hash=a83ef2440ef92f63bd0aac9760d9e7be0aae60004d29a2db3e9818858ae687ae&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]