cloud-fan commented on code in PR #38722:
URL: https://github.com/apache/spark/pull/38722#discussion_r1027662257
##########
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java:
##########
@@ -812,9 +812,7 @@ public boolean append(Object kbase, long koff, int klen,
Object vbase, long voff
// If the map has reached its growth threshold, try to grow it.
if (numKeys >= growthThreshold) {
- // We use two array entries per key, so the array size is twice the
capacity.
- // We should compare the current capacity of the array, instead of
its size.
- if (longArray.size() / 2 < MAX_CAPACITY) {
+ if (longArray.size() < MAX_CAPACITY) {
Review Comment:
I think we need to read the code of this entire class to understand the
capacity. According to the code comment
```
/**
* The maximum number of keys that BytesToBytesMap supports. The hash
table has to be
* power-of-2-sized and its backing Java array can contain at most (1
<< 30) elements,
* since that's the largest power-of-2 that's less than Integer.MAX_VALUE.
We need two long array
* entries per key, giving us a maximum capacity of (1 << 29).
*/
public static final int MAX_CAPACITY = (1 << 29);
```
This is to restrict the number of keys, and `longArray.size() / 2` is the
number of keys. The code here should be correct.
I also saw code like this
```
private void allocate(int capacity) {
assert (capacity >= 0);
capacity = Math.max((int) Math.min(MAX_CAPACITY,
ByteArrayMethods.nextPowerOf2(capacity)), 64);
assert (capacity <= MAX_CAPACITY);
longArray = allocateArray(capacity * 2L);
longArray.zeroOut();
this.growthThreshold = (int) (capacity * loadFactor);
this.mask = capacity - 1;
}
```
`longArray` is 2x capacity.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]