Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21931#discussion_r207519280
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala
---
@@ -83,7 +84,7 @@ class VectorizedHashMapGenerator(
| private ${classOf[ColumnarBatch].getName} batch;
| private ${classOf[MutableColumnarRow].getName} aggBufferRow;
| private int[] buckets;
- | private int capacity = 1 << 16;
+ | private int capacity = $maxCapacity;
--- End diff --
We can see the following code at L226. If a user specifies a `2^n` value
(e.g. 1024), it works correctly. What happens if a user specifies a
non-`2^n` value (e.g. 127)?
```
idx = (idx + 1) & (numBuckets - 1);
```
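To make the concern concrete, here is a minimal standalone sketch of what that probe step does for different `numBuckets` values. It is not the code Spark generates: the class `ProbeMaskDemo` and the helpers `reachableSlots` and `isPowerOfTwo` are hypothetical names used only for illustration, and the sketch assumes probing starts from slot 0.

```java
import java.util.BitSet;

public class ProbeMaskDemo {
    // Counts the distinct slots visited by repeating the probe step
    //   idx = (idx + 1) & (numBuckets - 1)
    // from `start` until a slot is revisited.
    static int reachableSlots(int numBuckets, int start) {
        BitSet seen = new BitSet(numBuckets);
        int idx = start;
        while (!seen.get(idx)) {
            seen.set(idx);
            idx = (idx + 1) & (numBuckets - 1);
        }
        return seen.cardinality();
    }

    // True when n has exactly one bit set, i.e. (n - 1) is a contiguous low-bit mask.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1024, 127, 100}) {
            System.out.println(n + " buckets: " + reachableSlots(n, 0)
                + " slot(s) reachable from 0, power of two: " + isPowerOfTwo(n));
        }
    }
}
```

With 1024 buckets the mask is 1023 and the probe cycles through every slot. With 127 buckets the mask is 126, whose lowest bit is clear, so a probe starting at slot 0 never leaves it. If non-`2^n` values are meant to be accepted, a check along the lines of `isPowerOfTwo` (or rounding the value up to a power of two) would be one possible guard.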
---