codeant-ai-for-open-source[bot] commented on code in PR #40998:
URL: https://github.com/apache/superset/pull/40998#discussion_r3402963333


##########
docker/docker-bootstrap.sh:
##########
@@ -67,16 +67,18 @@ else
   echo "Skipping local overrides"
 fi
 
+HYPHEN_SYMBOL='-'
+
 case "${1}" in
   worker)
     echo "Starting Celery worker..."
     # setting up only 2 workers by default to contain memory usage in dev 
environments
-    celery --app=superset.tasks.celery_app:app worker -O fair -l INFO 
--concurrency=${CELERYD_CONCURRENCY:-2}
+    celery --app=superset.tasks.celery_app:app worker -O fair -l INFO 
--concurrency=${CELERYD_CONCURRENCY:-2} 
--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}

Review Comment:
   **Suggestion:** The `WORKER_LOG_FILE` expansion is unquoted, so values 
containing spaces or glob characters will be split/expanded by the shell and 
can be interpreted as extra CLI arguments instead of a single logfile path. 
This can break worker startup or allow option injection through environment 
configuration; quote the logfile value expansion so it is always passed as one 
argument. [security]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Superset worker can misparse logfile, failing to start.
   - ⚠️ Env-controlled logfile allows unintended Celery flags injection.
   - ⚠️ Docker dev worker startup fragile to spaced logfile paths.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open `docker-compose.yml` and note the `superset-worker` service at
   `docker-compose.yml:6-17` uses `command: ["/app/docker/docker-bootstrap.sh", 
"worker"]`
   and `env_file: docker/.env`.
   
   2. In `docker/.env` (loaded by that service), set 
`WORKER_LOG_FILE="/tmp/worker log
   --concurrency=10"` so the value contains both a space and an extra Celery 
option.
   
   3. Run `docker compose up superset-worker`, which starts 
`/app/docker/docker-bootstrap.sh
   worker` as defined in `docker-compose.yml:6-17`, entering the `worker)` case 
in
   `docker/docker-bootstrap.sh:72-77`.
   
   4. At `docker/docker-bootstrap.sh:76`, the shell expands
   `--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}` without quotes, producing 
arguments
   `--logfile=/tmp/worker` and `log` and `--concurrency=10`; Celery receives an 
unintended
   extra `--concurrency=10` flag and an extra positional `log`, altering worker 
behavior or
   causing startup failure.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=b3d980e19aa1420399a5ab1ebc957eba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=b3d980e19aa1420399a5ab1ebc957eba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** docker/docker-bootstrap.sh
   **Line:** 76:76
   **Comment:**
        *Security: The `WORKER_LOG_FILE` expansion is unquoted, so values 
containing spaces or glob characters will be split/expanded by the shell and 
can be interpreted as extra CLI arguments instead of a single logfile path. 
This can break worker startup or allow option injection through environment 
configuration; quote the logfile value expansion so it is always passed as one 
argument.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=8d98353f042d304c5dca8e05f77dc426181a9ab598482735e8771a5c6a9f960f&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=8d98353f042d304c5dca8e05f77dc426181a9ab598482735e8771a5c6a9f960f&reaction=dislike'>👎</a>



##########
docker/docker-bootstrap.sh:
##########
@@ -67,16 +67,18 @@ else
   echo "Skipping local overrides"
 fi
 
+HYPHEN_SYMBOL='-'
+
 case "${1}" in
   worker)
     echo "Starting Celery worker..."
     # setting up only 2 workers by default to contain memory usage in dev 
environments
-    celery --app=superset.tasks.celery_app:app worker -O fair -l INFO 
--concurrency=${CELERYD_CONCURRENCY:-2}
+    celery --app=superset.tasks.celery_app:app worker -O fair -l INFO 
--concurrency=${CELERYD_CONCURRENCY:-2} 
--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}
     ;;
   beat)
     echo "Starting Celery beat..."
     rm -f /tmp/celerybeat.pid
-    celery --app=superset.tasks.celery_app:app beat --pidfile 
/tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule
+    celery --app=superset.tasks.celery_app:app beat --pidfile 
/tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule 
--logfile=${BEAT_LOG_FILE:-$HYPHEN_SYMBOL}

Review Comment:
   **Suggestion:** The `BEAT_LOG_FILE` expansion is also unquoted, so a path 
with whitespace or wildcard characters will be word-split/glob-expanded and 
passed incorrectly to Celery beat. Quote the logfile value expansion to 
preserve it as a single safe argument. [security]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Superset beat scheduler may fail to start correctly.
   - ⚠️ Env logfile value can inject unintended beat options.
   - ⚠️ Scheduled tasks reliability depends on whitespace-free logfile.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open `docker-compose.yml` and note the `superset-worker-beat` service at
   `docker-compose.yml:31-40` uses `command: 
["/app/docker/docker-bootstrap.sh", "beat"]` and
   `env_file: docker/.env`.
   
   2. In `docker/.env` (used by that service), set `BEAT_LOG_FILE="/tmp/beat log
   --max-interval=120"` so the value includes spaces and an extra Celery beat 
option.
   
   3. Run `docker compose up superset-worker-beat`, which starts
   `/app/docker/docker-bootstrap.sh beat` as configured, entering the `beat)` 
branch in
   `docker/docker-bootstrap.sh:78-82`.
   
   4. At `docker/docker-bootstrap.sh:81`, the unquoted expansion of
   `--logfile=${BEAT_LOG_FILE:-$HYPHEN_SYMBOL}` splits into 
`--logfile=/tmp/beat`, `log`, and
   `--max-interval=120`; Celery beat receives a malformed extra positional 
`log` and an
   unintended `--max-interval` override, which can cause beat startup failure 
or incorrect
   scheduling behavior.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=921403b357784fd9ad2e0f052110b7c4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=921403b357784fd9ad2e0f052110b7c4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** docker/docker-bootstrap.sh
   **Line:** 81:81
   **Comment:**
        *Security: The `BEAT_LOG_FILE` expansion is also unquoted, so a path 
with whitespace or wildcard characters will be word-split/glob-expanded and 
passed incorrectly to Celery beat. Quote the logfile value expansion to 
preserve it as a single safe argument.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=3494fd0f80334885418438c7b6530069d77e62c59e57f6595e0e31ba00a1e338&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=3494fd0f80334885418438c7b6530069d77e62c59e57f6595e0e31ba00a1e338&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to