cryptoe commented on code in PR #12998:
URL: https://github.com/apache/druid/pull/12998#discussion_r972071372
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/statistics/QuantilesSketchKeyCollectorFactory.java:
##########
@@ -38,9 +38,9 @@
public class QuantilesSketchKeyCollectorFactory
implements KeyCollectorFactory<QuantilesSketchKeyCollector,
QuantilesSketchKeyCollectorSnapshot>
{
- // smallest value with normalized rank error < 0.1%; retain up to ~86k elements
+ // smallest value with normalized rank error < 0.01%; retain up to ~430k elements
@VisibleForTesting
- static final int SKETCH_INITIAL_K = 1 << 12;
+ static final int SKETCH_INITIAL_K = 1 << 15;
Review Comment:
This change affects how much memory we use. Can we document the impact
in the PR description?
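To make the memory question concrete, here is a rough back-of-the-envelope sketch. It uses only the retained-element counts quoted in the diff comments (~86k and ~430k); the average key size is a purely hypothetical assumption, and the class and method names are illustrative and not part of Druid.

```java
public class SketchMemoryEstimate {
    // Sketch sizes from the diff.
    static final int OLD_K = 1 << 12; // 4096
    static final int NEW_K = 1 << 15; // 32768

    // Approximate retained-element counts quoted in the diff comments.
    static final long RETAINED_AT_OLD_K = 86_000L;
    static final long RETAINED_AT_NEW_K = 430_000L;

    // Rough footprint: retained elements times an assumed average key size.
    // This ignores per-object overhead, so treat it as a lower bound.
    static long estimateBytes(long retainedItems, long avgKeyBytes) {
        return Math.multiplyExact(retainedItems, avgKeyBytes);
    }

    public static void main(String[] args) {
        long avgKeyBytes = 100L; // hypothetical average key size, not from the PR

        long before = estimateBytes(RETAINED_AT_OLD_K, avgKeyBytes);
        long after = estimateBytes(RETAINED_AT_NEW_K, avgKeyBytes);

        System.out.println("K: " + OLD_K + " -> " + NEW_K);
        System.out.println("Estimated bytes before: " + before);
        System.out.println("Estimated bytes after:  " + after);
        System.out.println("Growth factor: " + (after / before) + "x");
    }
}
```

Under these assumptions the retained-element ceiling grows about fivefold while K grows eightfold; the real footprint depends on actual key sizes and JVM object overhead.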
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/statistics/DistinctKeyCollector.java:
##########
@@ -43,8 +43,8 @@
*/
public class DistinctKeyCollector implements KeyCollector<DistinctKeyCollector>
{
- static final int INITIAL_MAX_KEYS = 2 << 15 /* 65,536 */;
- static final int SMALLEST_MAX_KEYS = 16;
+ static final int INITIAL_MAX_BYTES = 5_120_000;
Review Comment:
Should this be 10 MB?
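For reference, a small sketch comparing the constant in the diff against the two common readings of "10 MB". Only the value 5_120_000 comes from the PR; the class name and the other constants are illustrative assumptions.

```java
public class MaxBytesCheck {
    // Value from the diff.
    static final long INITIAL_MAX_BYTES = 5_120_000L;

    // Two common interpretations of "10 MB" (assumed here for comparison).
    static final long TEN_MB_DECIMAL = 10L * 1000 * 1000; // 10,000,000 bytes
    static final long TEN_MIB_BINARY = 10L * 1024 * 1024; // 10,485,760 bytes

    // Convert a byte count to mebibytes.
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println("INITIAL_MAX_BYTES in MiB: " + toMiB(INITIAL_MAX_BYTES));
        System.out.println("Is 5000 KiB: " + (INITIAL_MAX_BYTES == 5000L * 1024));
        System.out.println("10 MB (decimal): " + TEN_MB_DECIMAL);
        System.out.println("10 MiB (binary): " + TEN_MIB_BINARY);
    }
}
```

The current value equals 5000 * 1024 bytes, a little under 5 MiB, so it is roughly half of either reading of 10 MB.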
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.