Re: Question on GroupBy query results merging process

Jihoon Son Thu, 19 Jul 2018 14:58:58 -0700

Hi Jisoo,

the initial version of groupBy v2


On Thu, Jul 19, 2018 at 2:42 PM Jisoo Kim <[email protected]>
wrote:

> Hi all,
>
> I am currently working on a project that uses Druid's QueryRunner and other
> druid-processing classes. It uses Druid's own classes to calculate query
> results. I have been testing large GroupBy queries (using v2), and it seems
> that the parallel combining threads for GroupBy queries are only enabled at
> the historical level. As far as I can tell, parallel combining is only
> invoked from GroupByStrategyV2.mergeRunners()
> <https://github.com/apache/incubator-druid/blob/druid-0.12.1/processing/src/main/java/io/druid/query/groupby/strategy/GroupByStrategyV2.java#L335>,
> which is only called by GroupByQueryRunnerFactory.mergeRunners() on
> historicals.
>
> Are GroupByMergingQueryRunnerV2 and the parallel combining threads meant for
> computing and merging per-segment results only, or can they also be used at
> the broker level? I changed the logic of my project from calling
> queryToolChest.mergeResults() on a MergeSequence (created from a list of
> per-segment/per-server sequences) to calling
> queryToolChest.mergeResults() on queryRunnerFactory.mergeRunners() (where
> each runner returns a deserialized result sequence), and that seems to have
> substantially reduced both the computation time and the number of failures
> for very heavy GroupBy queries. Or is this just a coincidence, so that there
> should be no performance difference in merging GroupBy query results, and
> the only difference comes from parallelizing the deserialization of result
> sequences from sub-queries?
>
> Thanks,
> Jisoo
>
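
For readers following the thread, the two merging approaches described in the
question can be illustrated with a minimal, self-contained sketch. It does not
use any Druid classes and is not how Druid implements either path: the class
GroupByMergeSketch and the methods mergeSequential and mergeParallel are
hypothetical names, and each partial result is assumed to be a plain map from
a grouping key to a partial sum. It only shows that folding partials one by
one and combining them pairwise on a thread pool give the same grouped result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in Druid.
public class GroupByMergeSketch {

    // Single-threaded merge: fold every partial result into one sorted map.
    static Map<String, Long> mergeSequential(List<Map<String, Long>> partials) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, value) -> merged.merge(key, value, Long::sum));
        }
        return merged;
    }

    // Parallel combine: merge partials pairwise on a thread pool, level by
    // level, until a single combined map remains.
    static Map<String, Long> mergeParallel(List<Map<String, Long>> partials, int threads)
            throws InterruptedException, ExecutionException {
        if (partials.isEmpty()) {
            return new TreeMap<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Map<String, Long>> level = partials;
            while (level.size() > 1) {
                List<Future<Map<String, Long>>> futures = new ArrayList<>();
                for (int i = 0; i < level.size(); i += 2) {
                    final Map<String, Long> left = level.get(i);
                    final Map<String, Long> right =
                            i + 1 < level.size() ? level.get(i + 1) : Collections.emptyMap();
                    Callable<Map<String, Long>> task =
                            () -> mergeSequential(Arrays.asList(left, right));
                    futures.add(pool.submit(task));
                }
                List<Map<String, Long>> next = new ArrayList<>();
                for (Future<Map<String, Long>> future : futures) {
                    next.add(future.get()); // wait for this pair to finish
                }
                level = next;
            }
            return new TreeMap<>(level.get(0));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, Long>> partials = Arrays.asList(
                Map.of("a", 1L, "b", 2L),
                Map.of("a", 3L, "c", 4L),
                Map.of("b", 5L));
        System.out.println(mergeSequential(partials));  // {a=4, b=7, c=4}
        System.out.println(mergeParallel(partials, 2)); // {a=4, b=7, c=4}
    }
}
```

The sketch says nothing about which path is faster in Druid itself; it only
makes the structural difference between the two call patterns concrete.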
