@a2l007 thanks, but I still don't see how the lazy evaluation in `MergeSequence` 
prevents the query execution from being parallelized. `AsyncQueryRunner` 
starts executing the given query immediately, and the `LazySequence` it returns 
only waits for that execution to complete when `toYielder()` is called in 
`MergeSequence`. Even though the `baseRunner` in `UnionQuery` is essentially 
`CachingClusteredClient`, which blocks until it has merged all intermediate 
results from the historicals, `AsyncQueryRunner` looks like a feasible way to 
parallelize the query execution in `UnionQuery`.
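
To make the "start eagerly, wait lazily" argument concrete, here is a minimal sketch using only `java.util.concurrent`. It is not Druid code: `EagerStartLazyWait`, `runAsync`, `slowQuery`, and `union` are hypothetical names, the `Supplier` handle only stands in for the role `LazySequence` plays, and the sleep stands in for a blocking `baseRunner`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical illustration only; these are not Druid classes.
class EagerStartLazyWait {

  // Submits the query to the executor right away and returns a handle
  // that blocks only when its result is actually requested.
  static <T> Supplier<T> runAsync(ExecutorService exec, Supplier<T> query) {
    CompletableFuture<T> future = CompletableFuture.supplyAsync(query, exec);
    return future::join;
  }

  // Simulates a sub-query whose runner blocks until its result is ready.
  static String slowQuery(String dataSource, long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return "result-of-" + dataSource;
  }

  // Starts one sub-query per data source, then consumes the handles in order.
  // Assumes a non-empty list of data sources.
  static List<String> union(List<String> dataSources, long millisPerQuery) {
    ExecutorService exec = Executors.newFixedThreadPool(dataSources.size());
    try {
      List<Supplier<String>> pending = new ArrayList<>();
      for (String ds : dataSources) {
        // Every sub-query is already running once this loop finishes.
        pending.add(runAsync(exec, () -> slowQuery(ds, millisPerQuery)));
      }
      List<String> merged = new ArrayList<>();
      for (Supplier<String> handle : pending) {
        // Waiting on the first handle does not delay the other sub-queries.
        merged.add(handle.get());
      }
      return merged;
    } finally {
      exec.shutdown();
    }
  }

  public static void main(String[] args) {
    long start = System.nanoTime();
    List<String> results = union(List.of("ds1", "ds2", "ds3"), 200);
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println(results + " in about " + elapsedMs + " ms");
  }
}
```

With three simulated sub-queries of 200 ms each, the elapsed time should be close to one sub-query rather than three, because the waiting happens only in the consuming loop.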

[The way the broker works is similar](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/org/apache/druid/client/CachingClusteredClient.java#L280-L287).

```java
      return new LazySequence<>(() -> {
        List<Sequence<T>> sequencesByInterval =
            new ArrayList<>(alreadyCachedResults.size() + segmentsByServer.size());
        addSequencesFromCache(sequencesByInterval, alreadyCachedResults);
        addSequencesFromServer(sequencesByInterval, segmentsByServer);
        return Sequences
            .simple(sequencesByInterval)
            .flatMerge(seq -> seq, query.getResultOrdering());
      });
```

It creates query runners that fetch results asynchronously from the historicals 
(`DirectDruidClient`) and from the cache, and then merges the returned 
sequences using `MergeSequence`.
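
As a rough picture of that pattern, the sketch below merges several already-started, individually sorted result sets in order. It is only loosely analogous to the broker: `AsyncSortedMerge`, `Cursor`, and `merge` are hypothetical names, plain lists and `CompletableFuture` stand in for Druid's sequences, and the heap-based merge is an assumption about the general technique, not a copy of `MergeSequence`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only; these are not Druid classes.
class AsyncSortedMerge {

  // Tracks the current head element of one source plus the rest of it.
  private static final class Cursor<T> {
    T head;
    final Iterator<T> rest;

    Cursor(Iterator<T> nonEmpty) {
      this.rest = nonEmpty;
      this.head = nonEmpty.next();
    }
  }

  // Ordered k-way merge over sources that were started before this call.
  // Each source must already be sorted according to `ordering`.
  static <T> List<T> merge(List<CompletableFuture<List<T>>> sources, Comparator<T> ordering) {
    PriorityQueue<Cursor<T>> heap =
        new PriorityQueue<>((a, b) -> ordering.compare(a.head, b.head));
    for (CompletableFuture<List<T>> source : sources) {
      // join() waits for this source only; the others keep running meanwhile.
      Iterator<T> it = source.join().iterator();
      if (it.hasNext()) {
        heap.add(new Cursor<>(it));
      }
    }
    List<T> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      Cursor<T> smallest = heap.poll();
      out.add(smallest.head);
      if (smallest.rest.hasNext()) {
        smallest.head = smallest.rest.next();
        heap.add(smallest);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<CompletableFuture<List<Integer>>> sources = new ArrayList<>();
    // All three "fetches" are submitted before the merge starts waiting.
    sources.add(CompletableFuture.supplyAsync(() -> List.of(1, 4, 7)));
    sources.add(CompletableFuture.supplyAsync(() -> List.of(2, 5, 8)));
    sources.add(CompletableFuture.supplyAsync(() -> List.of(3, 6, 9)));
    System.out.println(merge(sources, Comparator.naturalOrder()));
    // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
  }
}
```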

Am I missing something?

But even if this is correct, I'm not sure it's a good idea to use 
`AsyncQueryRunner` in `UnionQuery`. It would mean having two layers of 
parallelization for union queries, one in `UnionQuery` and one in 
`CachingClusteredClient`. It might be better to improve 
`CachingClusteredClient` so that it handles all the parallelization logic, 
including for union queries. Is this perhaps similar to your idea of improving 
`BrokerServerView`? I'm not sure exactly what that would involve, though.

[ Full content available at: 
https://github.com/apache/incubator-druid/issues/6057 ]