Hi Jisoo, the initial version of groupBy v2
On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim <[email protected]> wrote:

> Hi all,
>
> I am currently working on a project that uses Druid's QueryRunner and other
> druid-processing classes. It uses Druid's own classes to calculate query
> results. I have been testing large GroupBy queries (using v2), and it seems
> like parallel combining threads for GroupBy queries are only enabled on the
> historical level. I think it is only getting called by
> GroupByStrategyV2.mergeRunners()
> <https://github.com/apache/incubator-druid/blob/druid-0.12.1/processing/src/main/java/io/druid/query/groupby/strategy/GroupByStrategyV2.java#L335>
> which is only called by GroupByQueryRunnerFactory.mergeRunners() on
> historicals.
>
> Are GroupByMergingQueryRunnerV2 and parallel combining threads meant for
> computing and merging per-segment results only, or can they also be used on
> the broker level? I changed the logic of my project from calling
> queryToolChest.mergeResults() on MergeSequence (created by providing a list
> of per-segment/per-server sequences) to calling
> queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners() (where
> each runner returns a deserialized result sequence), and that seemed to
> have reduced really heavy groupby query computation time or failures by
> quite a lot. Or is this just a coincidence and there shouldn't be a
> performance difference in merging groupby query results, and the only
> difference could've been by parallelizing the deserialization of result
> sequences from sub-queries?
>
> Thanks,
> Jisoo
