This is an automated email from the ASF dual-hosted git repository.
viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new b76b6a8c133d [SPARK-55724][PYTHON][TEST][FOLLOWUP] Unify wide_values to wide_cols in bench
b76b6a8c133d is described below
commit b76b6a8c133d677688c617eec202eba1d7d5d8d0
Author: Liang-Chi Hsieh <[email protected]>
AuthorDate: Thu May 28 12:56:20 2026 -0700
[SPARK-55724][PYTHON][TEST][FOLLOWUP] Unify wide_values to wide_cols in bench
### What changes were proposed in this pull request?
Rename the `wide_values` scenario key to `wide_cols` in three mixins of
`python/benchmarks/bench_eval_type.py`:
- `_CogroupedMapArrowBenchMixin`
- `_CogroupedMapPandasBenchMixin`
- `_GroupedMapArrowBenchMixin`
### Why are the changes needed?
The umbrella SPARK-55724 introduced "wide-column" scenarios under two
different keys: `wide_values` (SPARK-55947 grouped map Arrow, SPARK-56381
cogrouped map Arrow, SPARK-56629 cogrouped map pandas) and `wide_cols`
(SPARK-56085 grouped agg, SPARK-56120 window agg Arrow, SPARK-56562 grouped agg
pandas, SPARK-56658 window agg pandas). All seven describe the same shape: few
rows per group, many columns.
The drift makes the ASV summary harder to read across eval types and adds
friction when wiring shared helpers across siblings. `wide_cols` is the more
descriptive name (the scenario varies the column count, not the value
semantics) and is already the majority spelling.
Since these benchmarks have no nightly CI consumer yet, renaming now costs
nothing in ASV history continuity; deferring the rename only makes the eventual
cleanup costlier.
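
The drift described above can be illustrated with a small check. This is a minimal sketch, not code from `bench_eval_type.py`: the `SCENARIOS_BY_MIXIN` dictionary and the helpers `find_legacy_keys` and `rename_key` are hypothetical names, and only the two entries that close each diff hunk are copied into the tables. The meaning of the tuple fields is not assumed.

```python
# Hypothetical stand-ins for the scenario tables of the three mixins.
# Only the last two entries of each diff hunk are reproduced here.
SCENARIOS_BY_MIXIN = {
    "_CogroupedMapArrowBenchMixin": {
        "wide_values": (200, 5_000, 1, 20),
        "multi_key": (200, 5_000, 3, 5),
    },
    "_CogroupedMapPandasBenchMixin": {
        "wide_values": (100, 1_000, 1, 20),
        "multi_key": (100, 1_000, 3, 5),
    },
    "_GroupedMapArrowBenchMixin": {
        "wide_values": (200, 5_000, 1, 20),
        "multi_key": (200, 5_000, 3, 5),
    },
}


def find_legacy_keys(tables, legacy="wide_values"):
    """Return the sorted names of tables that still use the legacy key."""
    return sorted(name for name, table in tables.items() if legacy in table)


def rename_key(table, old, new):
    """Return a copy of `table` with `old` renamed to `new`, keeping order."""
    return {(new if key == old else key): value for key, value in table.items()}


# Apply the rename to every table and confirm nothing legacy remains.
renamed = {
    name: rename_key(table, "wide_values", "wide_cols")
    for name, table in SCENARIOS_BY_MIXIN.items()
}
print(find_legacy_keys(SCENARIOS_BY_MIXIN))
print(find_legacy_keys(renamed))
```

A check of this shape could guard against the two spellings diverging again, although the patch itself only performs the rename.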
### Does this PR introduce _any_ user-facing change?
No. Test-only change in the benchmark module.
### How was this patch tested?
- `grep` confirmed no remaining `wide_values` references in the file.
- Ran `setup` + `time_worker` on `wide_cols` for the three renamed
`*TimeBench` classes (`CogroupedMapArrowUDFTimeBench`,
`CogroupedMapPandasUDFTimeBench`, `GroupedMapArrowUDFTimeBench`); all passed.
- Ran the same on the four pre-existing `wide_cols` `*TimeBench` classes
(`GroupedAggArrowUDFTimeBench`, `GroupedAggPandasUDFTimeBench`,
`WindowAggArrowUDFTimeBench`, `WindowAggPandasUDFTimeBench`) to confirm no
regression.
### Was this patch authored or co-authored using generative AI tooling?
Yes. Generated-by: Claude Code (claude-opus-4-7)
Closes #56171 from viirya/SPARK-55724-wide-cols-followup.
Authored-by: Liang-Chi Hsieh <[email protected]>
Signed-off-by: Liang-Chi Hsieh <[email protected]>
---
python/benchmarks/bench_eval_type.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/python/benchmarks/bench_eval_type.py b/python/benchmarks/bench_eval_type.py
index 41f2572adccb..2646dec501a4 100644
--- a/python/benchmarks/bench_eval_type.py
+++ b/python/benchmarks/bench_eval_type.py
@@ -754,7 +754,7 @@ class _CogroupedMapArrowBenchMixin:
"few_groups_lg": (50, 50_000, 1, 4),
"many_groups_sm": (2_000, 500, 1, 4),
"many_groups_lg": (500, 10_000, 1, 4),
- "wide_values": (200, 5_000, 1, 20),
+ "wide_cols": (200, 5_000, 1, 20),
"multi_key": (200, 5_000, 3, 5),
}
@@ -848,7 +848,7 @@ class _CogroupedMapPandasBenchMixin:
"few_groups_lg": (50, 10_000, 1, 4),
"many_groups_sm": (500, 200, 1, 4),
"many_groups_lg": (200, 2_000, 1, 4),
- "wide_values": (100, 1_000, 1, 20),
+ "wide_cols": (100, 1_000, 1, 20),
"multi_key": (100, 1_000, 3, 5),
}
@@ -1148,7 +1148,7 @@ class _GroupedMapArrowBenchMixin:
"few_groups_lg": (50, 50_000, 1, 4),
"many_groups_sm": (2_000, 500, 1, 4),
"many_groups_lg": (500, 10_000, 1, 4),
- "wide_values": (200, 5_000, 1, 20),
+ "wide_cols": (200, 5_000, 1, 20),
"multi_key": (200, 5_000, 3, 5),
}