gianm commented on a change in pull request #8194: HllSketch Merge/Build
BufferAggregators: Speed up init with prebuilt sketch.
URL: https://github.com/apache/incubator-druid/pull/8194#discussion_r308984030
##########
File path:
extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/hll/HllSketchMergeBufferAggregator.java
##########
@@ -60,17 +69,25 @@ public HllSketchMergeBufferAggregator(
this.lgK = lgK;
this.tgtHllType = tgtHllType;
this.size = size;
+ this.emptyUnion = new byte[size];
+
+ //noinspection ResultOfObjectAllocationIgnored (Union writes to
"emptyUnion" as a side effect of construction)
+ new Union(lgK, WritableMemory.wrap(emptyUnion));
}
- @SuppressWarnings("ResultOfObjectAllocationIgnored")
@Override
public void init(final ByteBuffer buf, final int position)
{
- final WritableMemory mem = WritableMemory.wrap(buf,
ByteOrder.LITTLE_ENDIAN).writableRegion(position, size);
- // Not necessary to keep the constructed object since it is cheap to
reconstruct by wrapping the memory.
- // The objects are not cached as in BuildBufferAggregator since they never
exceed the max size and never move.
Review comment:
But after this patch, they aren't constructed in init any longer. So I
thought including this comment in "init" would be confusing to a reader, and
"aggregate" was the next-best place (since that's now where Union objects are
created and thrown away after being used):
```java
try {
// Not necessary to keep the constructed object since it is cheap to
reconstruct by wrapping the memory.
// The objects are not cached as in HllSketchBuildBufferAggregator,
since they never exceed the max size and
// never move.
final Union union = Union.writableWrap(mem);
union.update(sketch);
}
```
I suppose the comment could go into the constructor instead, which is where
the "new Union" code from "init" migrated. But the comment, as currently
written, is a non-sequitur there, since there would be no reason to cache the
Union object in the constructor anyway (it's only used to generate the
emptyUnion bytes). It would need to be rewritten somehow.
I think things make sense the way they are in the latest patch, but if you
have a better suggestion I am all ears.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]