codeant-ai-for-open-source[bot] commented on code in PR #40593:
URL: https://github.com/apache/superset/pull/40593#discussion_r3374820879


##########
superset-extensions-cli/src/superset_extensions_cli/cli.py:
##########
@@ -239,12 +239,30 @@ def copy_backend_files(cwd: Path) -> None:
 
     # Process include patterns
     for pattern in include_patterns:
+        # Include patterns are only meant to select files within the backend
+        # directory. Reject absolute patterns or ones that walk outside it via
+        # parent ("..") components before handing them to glob().
+        pattern_parts = Path(pattern).parts
+        if Path(pattern).is_absolute() or ".." in pattern_parts:
+            raise click.ClickException(
+                f"Invalid include pattern {pattern!r}: patterns must be "
+                "relative to the backend directory and may not contain '..'."
+            )
         for f in backend_dir.glob(pattern):
             if not f.is_file():
                 continue
 
+            # Defense in depth: confirm the matched file resolves to a location
+            # inside the backend directory before copying it into the bundle.
+            resolved = f.resolve()
+            if not resolved.is_relative_to(backend_dir):
+                raise click.ClickException(
+                    f"Refusing to copy {f}: resolved path is outside the "
+                    f"backend directory {backend_dir}."
+                )
+
             # Check exclude patterns
-            relative_path = f.relative_to(backend_dir)
+            relative_path = resolved.relative_to(backend_dir)

Review Comment:
   **Suggestion:** Using the resolved filesystem path as the bundle-relative 
path changes behavior for symlinked files inside `backend/`: excludes are 
evaluated against the target path instead of the configured symlink path, and 
copied files are staged under the resolved target location rather than the 
matched path. This can cause unexpected inclusions and incorrect bundle layout 
(and potential overwrites when multiple symlinks point to the same target). 
Keep the boundary check on `resolved`, but compute the staged/exclude path from 
the matched path (`f.relative_to(backend_dir)`). [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Backend build copies symlinked files to wrong bundle paths.
   - ❌ Symlinked shared modules may be missing at expected locations.
   - ⚠️ Exclude patterns apply to targets not configured symlink paths.
   - ⚠️ Multiple symlinks to one target overwrite bundle files.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. In an extension project with a backend directory, create a symlinked 
module under
   `backend/src` (for example `backend/src/test_org/test_ext/common/module.py` 
as a symlink
   to `backend/src/common/module.py`). The build logic that will process this 
lives in
   `superset-extensions-cli/src/superset_extensions_cli/cli.py:226-275`
   (`copy_backend_files`).
   
   2. Configure `backend/pyproject.toml` so that
   `[tool.apache_superset_extensions.build].include` contains a pattern that 
matches the
   symlinked path (for example `include = ["src/test_org/test_ext/**/*.py"]`) 
and an
   `exclude` that does NOT exclude the symlink path but may exclude the target 
path (for
   example `exclude = ["src/common/**"]`), matching how `include_patterns` and
   `exclude_patterns` are read at `cli.py:231-238`.
   
   3. Run the extensions CLI build command from the project root (entrypoint 
`build` is
   defined at 
`superset-extensions-cli/src/superset_extensions_cli/cli.py:553-580` and calls
   `rebuild_backend` at `570-574`, which in turn calls `copy_backend_files` at 
`291-295`).
   This causes `copy_backend_files` to iterate include patterns, glob files 
under
   `backend_dir.glob(pattern)` at `cli.py:251`, and pick up the symlinked file 
as `f`.
   
   4. Inside `copy_backend_files`, for the symlinked file `f`, the code 
resolves it
   (`resolved = f.resolve()` at `cli.py:257`) and then computes `relative_path =
   resolved.relative_to(backend_dir)` at `cli.py:265`. For a symlink
   `backend/src/test_org/test_ext/common/module.py` pointing to
   `backend/src/common/module.py`, `relative_path` becomes 
`src/common/module.py`. Exclude
   checks and the target path `tgt = dist_dir / "backend" / relative_path` at 
`cli.py:272`
   now operate on the resolved target path rather than the matched symlink 
path, so: (a) the
   exclude pattern `src/common/**` will unexpectedly exclude the symlinked file 
even though
   the configured include path is under `src/test_org/test_ext/**`, and (b) if 
not excluded,
   the file is copied into `dist/backend/src/common/module.py` instead of
   `dist/backend/src/test_org/test_ext/common/module.py`, changing the bundle 
layout relative
   to the backend tree and breaking symlink path semantics.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=dfab9a0071974771a315ce41907db7ab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=dfab9a0071974771a315ce41907db7ab&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset-extensions-cli/src/superset_extensions_cli/cli.py
   **Line:** 265:265
   **Comment:**
        *Logic Error: Using the resolved filesystem path as the bundle-relative 
path changes behavior for symlinked files inside `backend/`: excludes are 
evaluated against the target path instead of the configured symlink path, and 
copied files are staged under the resolved target location rather than the 
matched path. This can cause unexpected inclusions and incorrect bundle layout 
(and potential overwrites when multiple symlinks point to the same target). 
Keep the boundary check on `resolved`, but compute the staged/exclude path from 
the matched path (`f.relative_to(backend_dir)`).
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40593&comment_hash=6bdbe14a6457df49c42de5157c77f237e8d90505e2a6c8404dff8de1687d39bb&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40593&comment_hash=6bdbe14a6457df49c42de5157c77f237e8d90505e2a6c8404dff8de1687d39bb&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to