codeant-ai-for-open-source[bot] commented on code in PR #40998:
URL: https://github.com/apache/superset/pull/40998#discussion_r3402963333
##########
docker/docker-bootstrap.sh:
##########
@@ -67,16 +67,18 @@ else
echo "Skipping local overrides"
fi
+HYPHEN_SYMBOL='-'
+
case "${1}" in
worker)
echo "Starting Celery worker..."
# setting up only 2 workers by default to contain memory usage in dev
environments
- celery --app=superset.tasks.celery_app:app worker -O fair -l INFO
--concurrency=${CELERYD_CONCURRENCY:-2}
+ celery --app=superset.tasks.celery_app:app worker -O fair -l INFO
--concurrency=${CELERYD_CONCURRENCY:-2}
--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}
Review Comment:
**Suggestion:** The `WORKER_LOG_FILE` expansion is unquoted, so values
containing spaces or glob characters will be split/expanded by the shell and
can be interpreted as extra CLI arguments instead of a single logfile path.
This can break worker startup or allow option injection through environment
configuration; quote the logfile value expansion so it is always passed as one
argument. [security]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Superset worker can misparse logfile, failing to start.
- ⚠️ Env-controlled logfile allows unintended Celery flags injection.
- ⚠️ Docker dev worker startup fragile to spaced logfile paths.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open `docker-compose.yml` and note the `superset-worker` service at
`docker-compose.yml:6-17` uses `command: ["/app/docker/docker-bootstrap.sh",
"worker"]`
and `env_file: docker/.env`.
2. In `docker/.env` (loaded by that service), set
`WORKER_LOG_FILE="/tmp/worker log
--concurrency=10"` so the value contains both a space and an extra Celery
option.
3. Run `docker compose up superset-worker`, which starts
`/app/docker/docker-bootstrap.sh
worker` as defined in `docker-compose.yml:6-17`, entering the `worker)` case
in
`docker/docker-bootstrap.sh:72-77`.
4. At `docker/docker-bootstrap.sh:76`, the shell expands
`--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}` without quotes, producing
arguments
`--logfile=/tmp/worker` and `log` and `--concurrency=10`; Celery receives an
unintended
extra `--concurrency=10` flag and an extra positional `log`, altering worker
behavior or
causing startup failure.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=b3d980e19aa1420399a5ab1ebc957eba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=b3d980e19aa1420399a5ab1ebc957eba&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** docker/docker-bootstrap.sh
**Line:** 76:76
**Comment:**
*Security: The `WORKER_LOG_FILE` expansion is unquoted, so values
containing spaces or glob characters will be split/expanded by the shell and
can be interpreted as extra CLI arguments instead of a single logfile path.
This can break worker startup or allow option injection through environment
configuration; quote the logfile value expansion so it is always passed as one
argument.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=8d98353f042d304c5dca8e05f77dc426181a9ab598482735e8771a5c6a9f960f&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=8d98353f042d304c5dca8e05f77dc426181a9ab598482735e8771a5c6a9f960f&reaction=dislike'>👎</a>
##########
docker/docker-bootstrap.sh:
##########
@@ -67,16 +67,18 @@ else
echo "Skipping local overrides"
fi
+HYPHEN_SYMBOL='-'
+
case "${1}" in
worker)
echo "Starting Celery worker..."
# setting up only 2 workers by default to contain memory usage in dev
environments
- celery --app=superset.tasks.celery_app:app worker -O fair -l INFO
--concurrency=${CELERYD_CONCURRENCY:-2}
+ celery --app=superset.tasks.celery_app:app worker -O fair -l INFO
--concurrency=${CELERYD_CONCURRENCY:-2}
--logfile=${WORKER_LOG_FILE:-$HYPHEN_SYMBOL}
;;
beat)
echo "Starting Celery beat..."
rm -f /tmp/celerybeat.pid
- celery --app=superset.tasks.celery_app:app beat --pidfile
/tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule
+ celery --app=superset.tasks.celery_app:app beat --pidfile
/tmp/celerybeat.pid -l INFO -s "${SUPERSET_HOME}"/celerybeat-schedule
--logfile=${BEAT_LOG_FILE:-$HYPHEN_SYMBOL}
Review Comment:
**Suggestion:** The `BEAT_LOG_FILE` expansion is also unquoted, so a path
with whitespace or wildcard characters will be word-split/glob-expanded and
passed incorrectly to Celery beat. Quote the logfile value expansion to
preserve it as a single safe argument. [security]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ Superset beat scheduler may fail to start correctly.
- ⚠️ Env logfile value can inject unintended beat options.
- ⚠️ Scheduled tasks reliability depends on whitespace-free logfile.
```
</details>
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open `docker-compose.yml` and note the `superset-worker-beat` service at
`docker-compose.yml:31-40` uses `command:
["/app/docker/docker-bootstrap.sh", "beat"]` and
`env_file: docker/.env`.
2. In `docker/.env` (used by that service), set `BEAT_LOG_FILE="/tmp/beat log
--max-interval=120"` so the value includes spaces and an extra Celery beat
option.
3. Run `docker compose up superset-worker-beat`, which starts
`/app/docker/docker-bootstrap.sh beat` as configured, entering the `beat)`
branch in
`docker/docker-bootstrap.sh:78-82`.
4. At `docker/docker-bootstrap.sh:81`, the unquoted expansion of
`--logfile=${BEAT_LOG_FILE:-$HYPHEN_SYMBOL}` splits into
`--logfile=/tmp/beat`, `log`, and
`--max-interval=120`; Celery beat receives a malformed extra positional
`log` and an
unintended `--max-interval` override, which can cause beat startup failure
or incorrect
scheduling behavior.
```
</details>
[Fix in
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt_id=921403b357784fd9ad2e0f052110b7c4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
| [Fix in VSCode
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt_id=921403b357784fd9ad2e0f052110b7c4&service=github&base_url=https%3A%2F%2Fgithub.com&org=apache&repo=apache%2Fsuperset)
*(Use Cmd/Ctrl + Click for best experience)*
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** docker/docker-bootstrap.sh
**Line:** 81:81
**Comment:**
*Security: The `BEAT_LOG_FILE` expansion is also unquoted, so a path
with whitespace or wildcard characters will be word-split/glob-expanded and
passed incorrectly to Celery beat. Quote the logfile value expansion to
preserve it as a single safe argument.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask
user if the user wants to fix the rest of the comments as well. if said yes,
then fetch all the comments validate the correctness and implement a minimal fix
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=3494fd0f80334885418438c7b6530069d77e62c59e57f6595e0e31ba00a1e338&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40998&comment_hash=3494fd0f80334885418438c7b6530069d77e62c59e57f6595e0e31ba00a1e338&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]