(spark) branch master updated: [SPARK-55724][PYTHON][TEST][FOLLOWUP] Unify wide_values to wide_cols in bench

viirya Thu, 28 May 2026 12:57:13 -0700

This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new b76b6a8c133d [SPARK-55724][PYTHON][TEST][FOLLOWUP] Unify wide_values to wide_cols in bench
b76b6a8c133d is described below

commit b76b6a8c133d677688c617eec202eba1d7d5d8d0
Author: Liang-Chi Hsieh <[email protected]>
AuthorDate: Thu May 28 12:56:20 2026 -0700

    [SPARK-55724][PYTHON][TEST][FOLLOWUP] Unify wide_values to wide_cols in bench
    
    ### What changes were proposed in this pull request?
    
    Rename the `wide_values` scenario key to `wide_cols` in three mixins of 
`python/benchmarks/bench_eval_type.py`:
    
    - `_CogroupedMapArrowBenchMixin`
    - `_CogroupedMapPandasBenchMixin`
    - `_GroupedMapArrowBenchMixin`
    
    ### Why are the changes needed?
    
    The umbrella SPARK-55724 introduced "wide-column" scenarios under two 
different keys: `wide_values` (SPARK-55947 grouped map Arrow, SPARK-56381 
cogrouped map Arrow, SPARK-56629 cogrouped map pandas) and `wide_cols` 
(SPARK-56085 grouped agg, SPARK-56120 window agg Arrow, SPARK-56562 grouped agg 
pandas, SPARK-56658 window agg pandas). All seven describe the same shape: few 
rows per group, many columns.
    
    The drift makes the ASV summary harder to read across eval types and adds 
friction when wiring shared helpers across siblings. `wide_cols` is the more 
descriptive name (the scenario varies the column count, not the value 
semantics) and is already the majority spelling.
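
The drift described above can be checked mechanically. The following is a minimal sketch, not code from `bench_eval_type.py`: the helper `find_key_drift` and the dict `wide_scenario_key` are hypothetical names, and the mapping simply restates the pre-rename keys listed in this message.

```python
from collections import Counter


def find_key_drift(key_by_eval_type):
    """Return (majority_key, eval_types_using_a_different_key)."""
    if not key_by_eval_type:
        return None, []
    # The most common spelling is treated as the one to standardize on.
    majority, _ = Counter(key_by_eval_type.values()).most_common(1)[0]
    drifted = sorted(
        name for name, key in key_by_eval_type.items() if key != majority
    )
    return majority, drifted


# Pre-rename state, as listed in the commit message (illustrative only).
wide_scenario_key = {
    "grouped map Arrow": "wide_values",     # SPARK-55947
    "cogrouped map Arrow": "wide_values",   # SPARK-56381
    "cogrouped map pandas": "wide_values",  # SPARK-56629
    "grouped agg": "wide_cols",             # SPARK-56085
    "window agg Arrow": "wide_cols",        # SPARK-56120
    "grouped agg pandas": "wide_cols",      # SPARK-56562
    "window agg pandas": "wide_cols",       # SPARK-56658
}

majority, drifted = find_key_drift(wide_scenario_key)
print(majority)  # wide_cols
print(drifted)   # the three eval types still using wide_values
```

With four of the seven entries already using `wide_cols`, the sketch reports the three `wide_values` users as the ones to rename, which matches the three mixins touched by this commit.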
    
    Since these benchmarks have no nightly CI consumer yet, renaming now costs 
nothing in ASV history continuity; deferring the rename only makes the eventual 
cleanup costlier.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. Test-only change in the benchmark module.
    
    ### How was this patch tested?
    
    - `grep` confirmed no remaining `wide_values` references in the file.
    - Ran `setup` + `time_worker` on `wide_cols` for the three renamed 
`*TimeBench` classes (`CogroupedMapArrowUDFTimeBench`, 
`CogroupedMapPandasUDFTimeBench`, `GroupedMapArrowUDFTimeBench`); all passed.
    - Ran the same on the four pre-existing `wide_cols` `*TimeBench` classes 
(`GroupedAggArrowUDFTimeBench`, `GroupedAggPandasUDFTimeBench`, 
`WindowAggArrowUDFTimeBench`, `WindowAggPandasUDFTimeBench`) to confirm no 
regression.
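
The `grep` check in the first bullet can also be written as a small script. This is a minimal sketch under stated assumptions: `stale_references` is a hypothetical helper, not part of the benchmark module, and the sample text is one dictionary entry taken from the diff.

```python
import re


def stale_references(source_text, old_key="wide_values"):
    """Return 1-based line numbers that still mention old_key as a whole word."""
    pattern = re.compile(rf"\b{re.escape(old_key)}\b")
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Sample input: two scenario entries before and after the rename.
before = '"wide_values": (200, 5_000, 1, 20),\n"multi_key": (200, 5_000, 3, 5),'
after = before.replace("wide_values", "wide_cols")

print(stale_references(before))  # [1]
print(stale_references(after))   # []
```

To run the same check against the real module, read `python/benchmarks/bench_eval_type.py` into a string and pass it to the helper; an empty list corresponds to `grep` finding no matches.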
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Yes. Generated-by: Claude Code (claude-opus-4-7)
    
    Closes #56171 from viirya/SPARK-55724-wide-cols-followup.
    
    Authored-by: Liang-Chi Hsieh <[email protected]>
    Signed-off-by: Liang-Chi Hsieh <[email protected]>
---
 python/benchmarks/bench_eval_type.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/benchmarks/bench_eval_type.py b/python/benchmarks/bench_eval_type.py
index 41f2572adccb..2646dec501a4 100644
--- a/python/benchmarks/bench_eval_type.py
+++ b/python/benchmarks/bench_eval_type.py
@@ -754,7 +754,7 @@ class _CogroupedMapArrowBenchMixin:
         "few_groups_lg": (50, 50_000, 1, 4),
         "many_groups_sm": (2_000, 500, 1, 4),
         "many_groups_lg": (500, 10_000, 1, 4),
-        "wide_values": (200, 5_000, 1, 20),
+        "wide_cols": (200, 5_000, 1, 20),
         "multi_key": (200, 5_000, 3, 5),
     }
 
@@ -848,7 +848,7 @@ class _CogroupedMapPandasBenchMixin:
         "few_groups_lg": (50, 10_000, 1, 4),
         "many_groups_sm": (500, 200, 1, 4),
         "many_groups_lg": (200, 2_000, 1, 4),
-        "wide_values": (100, 1_000, 1, 20),
+        "wide_cols": (100, 1_000, 1, 20),
         "multi_key": (100, 1_000, 3, 5),
     }
 
@@ -1148,7 +1148,7 @@ class _GroupedMapArrowBenchMixin:
         "few_groups_lg": (50, 50_000, 1, 4),
         "many_groups_sm": (2_000, 500, 1, 4),
         "many_groups_lg": (500, 10_000, 1, 4),
-        "wide_values": (200, 5_000, 1, 20),
+        "wide_cols": (200, 5_000, 1, 20),
         "multi_key": (200, 5_000, 3, 5),
     }
 

