Re: Alternate concurrent DrillSideways approach?

Greg Miller Tue, 23 Feb 2021 19:52:19 -0800

Thanks Mike! I'll follow up with results once I have an opportunity to test
out an alternate approach.


DrillSidewaysQuery certainly looks set up to avoid lots of duplicate work,
and it was a little surprising to find such a different approach in the
concurrent version. That said, the code is certainly much simpler to run a
bunch of DrillSidewaysQueries in parallel!

Cheers,
-Greg

On Tue, Feb 23, 2021 at 1:32 PM Michael McCandless <
[email protected]> wrote:

> Hi Greg,
>
> As far as I know, nobody has experimented any further with a concurrent
> implementation for drill sideways.  Patches welcome!
>
> I would be curious to know how those two concurrent solutions we support
> today compare with the serial performance of DrillSidewaysQuery.  The
> redundant work is indeed frustrating and was the original motivation for
> creating DrillSidewaysQuery in the first place.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Mon, Feb 15, 2021 at 12:05 PM Greg Miller <[email protected]> wrote:
>
>> Hi folks-
>>
>> I'm reaching out to understand if there's been any past exploration into
>> alternative concurrent DrillSideways execution approaches. My understanding
>> of the current approach is that we're achieving some concurrency by using a
>> CollectorManager with IndexSearcher (allowing parallel execution across the
>> shards) but also collecting the different facet results by executing N
>> separate drill down queries, where N is the number of drill downs applied,
>> each with one of the drill down restrictions removed. This approach seems
>> like it would do a large amount of duplicate computational work when
>> executing these queries (e.g., just think of the base query component of
>> each drill down query being executed N times).
>>
>> Michael McCandless brought up
>> <https://issues.apache.org/jira/browse/LUCENE-7588> an alternate
>> approach of sticking with the existing "doc at a time" methodology (rather
>> than implementing this "query at a time" approach), but it's not clear to
>> me if it was explored further. It seems to me like the latency regression
>> of "doc at a time" would likely be fairly small but the overall computation
>> for these searches may drop significantly. Is there any more history on
>> this approach that folks are aware of, or any thoughts on whether or not it
>> would be valuable to explore a "doc at a time" approach (essentially create
>> a single DrillSidewaysQuery and hand that off to IndexSearcher with the
>> CollectorManager instead of scheduling N IndexSearcher searches as is done
>> today)?
>>
>> Thanks in advance for any thoughts/info/discussion!
>>
>> Cheers,
>> -Greg
>>
>
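
To make the trade-off discussed in this thread concrete, here is a toy model of the two strategies. It is a sketch only: it is plain Python, not Lucene code, and the documents, the `base_matches` helper, and both function names are hypothetical. It assumes every document has exactly one value per facet dimension, and it counts the full drill-down query as one extra pass in the "query at a time" variant. The "doc at a time" variant evaluates the base query once per document and routes each document by how many drill-down constraints it fails: none (a hit, counted for every dimension) or exactly one (a near miss, counted only for the failed dimension).

```python
from collections import Counter

# Toy documents: free text plus one value per facet dimension (assumed data).
DOCS = [
    {"text": "red cotton shirt", "color": "red", "size": "M"},
    {"text": "blue cotton shirt", "color": "blue", "size": "M"},
    {"text": "red wool shirt", "color": "red", "size": "L"},
    {"text": "blue silk scarf", "color": "blue", "size": "L"},
    {"text": "green cotton shirt", "color": "green", "size": "S"},
]


def base_matches(doc, term):
    """Stand-in for the base query: does the text contain the term?"""
    return term in doc["text"].split()


def query_at_a_time(docs, term, drill_downs):
    """Run the drill-down query, then one extra query per dimension with
    that dimension's restriction removed. Returns (hits, sideways, base_evals)."""
    base_evals = 0
    hits = []
    for i, doc in enumerate(docs):
        base_evals += 1
        if base_matches(doc, term) and all(
            doc[d] == v for d, v in drill_downs.items()
        ):
            hits.append(i)

    sideways = {}
    for dim in drill_downs:
        counts = Counter()
        for doc in docs:
            base_evals += 1  # the base query is re-evaluated for every dimension
            if base_matches(doc, term) and all(
                doc[d] == v for d, v in drill_downs.items() if d != dim
            ):
                counts[doc[dim]] += 1
        sideways[dim] = counts
    return hits, sideways, base_evals


def doc_at_a_time(docs, term, drill_downs):
    """Single pass: evaluate the base query once per document, then route the
    document based on which drill-down constraints it fails."""
    base_evals = 0
    hits = []
    sideways = {dim: Counter() for dim in drill_downs}
    for i, doc in enumerate(docs):
        base_evals += 1
        if not base_matches(doc, term):
            continue
        failed = [d for d, v in drill_downs.items() if doc[d] != v]
        if not failed:
            # Matches everything: a hit, and it counts for every dimension.
            hits.append(i)
            for dim in drill_downs:
                sideways[dim][doc[dim]] += 1
        elif len(failed) == 1:
            # Near miss: counts only for the one dimension it failed.
            sideways[failed[0]][doc[failed[0]]] += 1
        # Two or more failures: contributes to no sideways count.
    return hits, sideways, base_evals


if __name__ == "__main__":
    drill = {"color": "red", "size": "M"}
    print(query_at_a_time(DOCS, "shirt", drill))
    print(doc_at_a_time(DOCS, "shirt", drill))
```

Both functions return the same hits and sideways counts; only the number of base-query evaluations differs, which is the duplicate work the thread is concerned with. The toy says nothing about real latency or about how either strategy parallelizes across segments.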
