Re: [PR] [QDP] Enhance benchmark's installation journey [mahout]

via GitHub Sat, 14 Mar 2026 07:24:25 -0700


Copilot commented on code in PR #1154:
URL: https://github.com/apache/mahout/pull/1154#discussion_r2935345737



##########
qdp/DEVELOPMENT.md:
##########
@@ -93,14 +93,18 @@ uv run pytest qdp/qdp-python/tests -v
 
 ## 4. Benchmarks
 
-Install benchmark dependency group into the same root venv:
+From the repo root, set up and prepare benchmarks:
 
 ```bash
-source .venv/bin/activate
-uv sync --project qdp/qdp-python --group benchmark --active
+make benchmark
 ```
 
-Run benchmark scripts:
+This will:
+1. Install benchmark dependencies into the unified root venv
+2. Build the QDP extension (if GPU available)
+3. Display instructions for running specific benchmarks
+
+Then run benchmark scripts:

Review Comment:
   This section documents that `make benchmark` will “Build the QDP extension 
(if GPU available)” and “Display instructions”, but the Makefile currently 
skips the instruction output entirely when no NVIDIA GPU is detected. Either 
update the docs to match the current behavior (skip + GPU requirement), or 
adjust the Makefile so it still prints the usage instructions even when it 
can’t build/run.



##########
qdp/qdp-python/benchmark/README.md:
##########
@@ -11,12 +11,20 @@ scripts:
 
 ## Quick Start
 
-From the repo root:
+From the repo root, the easiest way to set up and run benchmarks is:
+
+```bash
+make benchmark
+```
+
+This will:
+1. Set up the benchmark environment in the unified root venv (`mahout/.venv`)
+2. Build the QDP extension (if GPU available)
+3. Display instructions for running specific benchmarks
+
+Then run individual benchmarks:

Review Comment:
   This section says `make benchmark` is the easiest way to “set up and run 
benchmarks”, but the Makefile target currently only prepares the environment 
and prints instructions (and on CPU-only machines it explicitly skips running). 
Update the wording here to reflect what the target actually does, and consider 
calling out the NVIDIA GPU requirement up front so users don’t expect 
benchmarks to run on CPU-only systems.



##########
Makefile:
##########
@@ -44,3 +44,34 @@ tests: test_rust test_python
 
 pre-commit: setup-test-python
        uv run pre-commit run --all-files
+
+setup-benchmark: setup-test-python
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Setting up benchmark environment..."
+       uv sync --group dev --extra qdp
+       uv sync --project qdp/qdp-python --group benchmark --active
+       unset CONDA_PREFIX && uv run --active maturin develop --manifest-path 
qdp/qdp-python/Cargo.toml
+else
+       @echo "[SKIP] No NVIDIA GPU detected, skipping maturin develop"
+       @echo "[INFO] Setting up benchmark environment (CPU-only)..."
+       uv sync --project qdp/qdp-python --group benchmark --active
+endif
+
+benchmark: setup-benchmark
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Running benchmarks..."
+       @echo "[INFO] Available benchmark scripts:"
+       @echo "  - benchmark_e2e.py: End-to-end latency (Disk -> GPU VRAM)"
+       @echo "  - benchmark_latency.py: Data-to-State latency (CPU RAM -> GPU 
VRAM)"
+       @echo "  - benchmark_throughput.py: DataLoader-style throughput"
+       @echo ""
+       @echo "[INFO] Run specific benchmarks with:"
+       @echo "  uv run --active python 
qdp/qdp-python/benchmark/benchmark_e2e.py"
+       @echo "  uv run --active python 
qdp/qdp-python/benchmark/benchmark_latency.py"
+       @echo "  uv run --active python 
qdp/qdp-python/benchmark/benchmark_throughput.py"

Review Comment:
   The instructions printed by `make benchmark` recommend `uv run --active 
...`, but the docs in this PR recommend `uv run --project qdp/qdp-python ...`. 
Please align these (either update the printed instructions or update the docs) 
to avoid users running commands that don’t match the intended unified-venv 
workflow.
   



##########
Makefile:
##########
@@ -44,3 +44,34 @@ tests: test_rust test_python
 
 pre-commit: setup-test-python
        uv run pre-commit run --all-files
+
+setup-benchmark: setup-test-python
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Setting up benchmark environment..."
+       uv sync --group dev --extra qdp
+       uv sync --project qdp/qdp-python --group benchmark --active
+       unset CONDA_PREFIX && uv run --active maturin develop --manifest-path 
qdp/qdp-python/Cargo.toml

Review Comment:
   `setup-benchmark` uses `uv sync ... --active` and `uv run --active ...`, but 
this Makefile target never activates `.venv`. In this repo, `--active` is 
documented to require `source .venv/bin/activate`, so `make benchmark` will 
likely fail with no active environment. Consider either (a) invoking commands 
as `. .venv/bin/activate && uv ... --active` (on each line), or (b) removing 
`--active` and using a uv option that targets the root `.venv` explicitly so 
the target works from a fresh shell as documented.



##########
Makefile:
##########
@@ -44,3 +44,34 @@ tests: test_rust test_python
 
 pre-commit: setup-test-python
        uv run pre-commit run --all-files
+
+setup-benchmark: setup-test-python
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Setting up benchmark environment..."
+       uv sync --group dev --extra qdp
+       uv sync --project qdp/qdp-python --group benchmark --active

Review Comment:
   `setup-benchmark` depends on `setup-test-python` (which already runs `uv 
sync --group dev`), but the GPU branch runs `uv sync --group dev --extra qdp` 
again. This re-sync is redundant and can significantly slow `make benchmark`. 
Consider restructuring so the dev sync happens only once (e.g., have 
`setup-benchmark` perform the needed sync itself and drop the dependency, or 
split a `setup-dev` vs `setup-dev-qdp` target to avoid duplicate work).



##########
Makefile:
##########
@@ -44,3 +44,34 @@ tests: test_rust test_python
 
 pre-commit: setup-test-python
        uv run pre-commit run --all-files
+
+setup-benchmark: setup-test-python
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Setting up benchmark environment..."
+       uv sync --group dev --extra qdp
+       uv sync --project qdp/qdp-python --group benchmark --active
+       unset CONDA_PREFIX && uv run --active maturin develop --manifest-path 
qdp/qdp-python/Cargo.toml
+else
+       @echo "[SKIP] No NVIDIA GPU detected, skipping maturin develop"
+       @echo "[INFO] Setting up benchmark environment (CPU-only)..."
+       uv sync --project qdp/qdp-python --group benchmark --active
+endif
+
+benchmark: setup-benchmark
+ifeq ($(HAS_NVIDIA),yes)
+       @echo "[INFO] Running benchmarks..."
+       @echo "[INFO] Available benchmark scripts:"
+       @echo "  - benchmark_e2e.py: End-to-end latency (Disk -> GPU VRAM)"
+       @echo "  - benchmark_latency.py: Data-to-State latency (CPU RAM -> GPU 
VRAM)"
+       @echo "  - benchmark_throughput.py: DataLoader-style throughput"

Review Comment:
   The `benchmark` target only prints guidance, but the output starts with 
"[INFO] Running benchmarks..." which implies it actually executes them. Please 
adjust the message (or run the benchmarks) so users aren’t misled about what 
`make benchmark` does.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [QDP] Enhance benchmark's installation journey [mahout]

Reply via email to