richardstartin commented on a change in pull request #8189:
URL: https://github.com/apache/pinot/pull/8189#discussion_r804687447
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/operator/query/DictionaryBasedAggregationOperator.java
##########
@@ -152,6 +123,50 @@ private double toDouble(Comparable value) {
}
}
+  private Set getDistinctValueSet(Dictionary dictionary) {
+    int dictionarySize = dictionary.length();
+    switch (dictionary.getValueType()) {
+      case INT:
+        IntOpenHashSet intSet = new IntOpenHashSet(dictionarySize);
+        for (int dictId = 0; dictId < dictionarySize; dictId++) {
+          intSet.add(dictionary.getIntValue(dictId));
+        }
+        return intSet;
+      case LONG:
+        LongOpenHashSet longSet = new LongOpenHashSet(dictionarySize);
+        for (int dictId = 0; dictId < dictionarySize; dictId++) {
+          longSet.add(dictionary.getLongValue(dictId));
+        }
+        return longSet;
+      case FLOAT:
+        FloatOpenHashSet floatSet = new FloatOpenHashSet(dictionarySize);
+        for (int dictId = 0; dictId < dictionarySize; dictId++) {
+          floatSet.add(dictionary.getFloatValue(dictId));
+        }
+        return floatSet;
+      case DOUBLE:
+        DoubleOpenHashSet doubleSet = new DoubleOpenHashSet(dictionarySize);
+        for (int dictId = 0; dictId < dictionarySize; dictId++) {
+          doubleSet.add(dictionary.getDoubleValue(dictId));
+        }
+        return doubleSet;
Review comment:
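Storing the float bits in a `RoaringBitmap` rather than a `FloatOpenHashSet` appears to be beneficial for memory. With `1 << 20` random floats, JOL reports roughly 2MB for the bitmap against 7MB for the set: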
```java
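import it.unimi.dsi.fastutil.floats.FloatOpenHashSet;
import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jol.info.GraphLayout;
import org.roaringbitmap.RoaringBitmap;

RoaringBitmap bitmap = new RoaringBitmap();
FloatOpenHashSet set = new FloatOpenHashSet();
// Baseline footprint of the empty structures, subtracted from the totals below
long bitmapBefore = GraphLayout.parseInstance(bitmap).totalSize();
long setBefore = GraphLayout.parseInstance(set).totalSize();
for (int i = 0; i < 1 << 20; i++) {
  float f = ThreadLocalRandom.current().nextFloat()
      * ThreadLocalRandom.current().nextLong();
  // The bitmap stores the raw IEEE 754 bits; the set stores the float itself
  bitmap.add(Float.floatToIntBits(f));
  set.add(f);
}
// >>> 20 converts the byte delta to whole megabytes
System.err.println("bitmap: "
    + ((GraphLayout.parseInstance(bitmap).totalSize() - bitmapBefore) >>> 20) + "MB");
System.err.println(GraphLayout.parseInstance(bitmap).toFootprint());
System.err.println("set: "
    + ((GraphLayout.parseInstance(set).totalSize() - setBefore) >>> 20) + "MB");
System.err.println(GraphLayout.parseInstance(set).toFootprint());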
```
```
bitmap: 2MB
org.roaringbitmap.RoaringBitmap@36a6bea6d footprint:
     COUNT       AVG       SUM   DESCRIPTION
      3618       706   2555408   [C
         1     15008     15008   [Lorg.roaringbitmap.Container;
      3617        24     86808   org.roaringbitmap.ArrayContainer
         1        24        24   org.roaringbitmap.RoaringArray
         1        16        16   org.roaringbitmap.RoaringBitmap
      7238             2657264   (total)
set: 7MB
it.unimi.dsi.fastutil.floats.FloatOpenHashSet@a62c7cdd footprint:
     COUNT       AVG       SUM   DESCRIPTION
         1   8388632   8388632   [F
         1        48        48   it.unimi.dsi.fastutil.floats.FloatOpenHashSet
         2             8388680   (total)
```
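The measurement above relies on `Float.floatToIntBits` mapping each distinct float to a distinct `int`, so that a set of bit patterns can stand in for a set of floats. A minimal, dependency-free sketch of that round trip follows. The class `FloatBitsSketch` and its helpers are hypothetical names used only for illustration, and a `TreeSet<Integer>` stands in for the `RoaringBitmap` so that the example runs with the JDK alone; it is not the code proposed in the PR.

```java
import java.util.Set;
import java.util.TreeSet;

public class FloatBitsSketch {
  // Hypothetical helper: collects the IEEE 754 bit pattern of each value.
  // The TreeSet<Integer> is a stand-in for the RoaringBitmap in the benchmark.
  public static Set<Integer> distinctBits(float[] values) {
    Set<Integer> bits = new TreeSet<>();
    for (float value : values) {
      bits.add(Float.floatToIntBits(value));
    }
    return bits;
  }

  // Hypothetical helper: decodes the stored bit patterns back into floats.
  public static float[] toFloats(Set<Integer> bits) {
    float[] values = new float[bits.size()];
    int i = 0;
    for (int b : bits) {
      values[i++] = Float.intBitsToFloat(b);
    }
    return values;
  }

  public static void main(String[] args) {
    float[] input = {1.5f, -2.25f, 1.5f, 0.0f, -0.0f, Float.NaN, Float.NaN};
    Set<Integer> bits = distinctBits(input);
    // Duplicates of 1.5f and NaN collapse; 0.0f and -0.0f stay distinct
    // because their bit patterns differ.
    System.out.println("distinct: " + bits.size());
    for (float f : toFloats(bits)) {
      System.out.println(f);
    }
  }
}
```

Two edge cases are worth checking against whatever set semantics the query expects: `Float.floatToIntBits` collapses every NaN to one canonical pattern, and it keeps `0.0f` and `-0.0f` as separate entries.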
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]