Hi Luca,
Thanks for the suggestion. Let me explore it further and I will get back to
you. I agree that making the slices non-final would require consumers to
ensure the slices are not mutated after construction time, so it is not the
preferred option.

Thanks,
Sorabh



On Tue, May 23, 2023 at 3:14 AM Luca Cavanna <java...@apache.org> wrote:

> Hi Sorabh,
> thanks for explaining. I see what you mean, it does get awkward to
> customize how slices are created. If the plan is to make SliceExecutor
> public and extensible, would it make sense to figure out what its public
> methods should be, and include the slice creation in there so it is
> detached from the IndexSearcher? I think that in practice there is a
> correlation between how slices are created and how they are executed (e.g.
> you may want to limit the number of slices created and execute each one on
> a separate thread, or possibly have more slices than threads, and reuse the
> same thread to execute multiple slices). Moving the slice creation to the
> component responsible for their execution would give users the desired
> flexibility to customize their logic without having to fiddle with
> IndexSearcher constructors. What do you think? I personally prefer this
> option over adding a function argument to the existing searcher
> constructor, or making the slices non-final, which affects the
> inter-segment concurrency design (slices should not be mutable?).
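>
> Just to sketch what I have in mind (a rough sketch only; the names and
> signatures below are made up and not a concrete proposal):
>
>     public abstract class SliceExecutor {
>         // slice creation moves here, next to the code that executes the
>         // slices, so both can be customized together
>         protected abstract LeafSlice[] slices(List<LeafReaderContext> leaves);
>
>         // run one task per slice, applying whatever backpressure or
>         // caller-thread policy the implementation wants
>         protected abstract void invokeAll(Collection<? extends Runnable> tasks);
>     }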
>
> Cheers
> Luca
>
> On Fri, May 19, 2023 at 9:02 AM SorabhApache <sor...@apache.org> wrote:
>
>> Hi Luca,
>> Thanks for your reply. Sharing an example below for clarity.
>>
>> public class CustomIndexSearcher extends IndexSearcher {
>>     public CustomIndexSearcher(IndexReader reader, Executor executor,
>>             int maxSliceCount) {
>>         super(reader, executor);
>>     }
>>
>>     @Override
>>     protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
>>         // cannot use maxSliceCount here to control the custom logic, as
>>         // this is called from the constructor of the super class
>>         // I want to use parameter[s] from the constructor input to control
>>         // this slice computation
>>         return super.slices(leaves); // placeholder
>>     }
>> }
>>
>> Yes, the SliceExecutor class will become public. We will also need a
>> constructor in IndexSearcher which can accept a SliceExecutor
>> implementation from extensions.
>>
>> Thanks,
>> Sorabh
>>
>>
>> On Thu, May 18, 2023 at 11:40 PM Luca Cavanna <l...@elastic.co.invalid>
>> wrote:
>>
>>> Hi Sorabh,
>>> You'll want to override the protected slices method to include your
>>> custom logic for creating the leaf slices. Your IndexSearcher extension can
>>> also retrieve the slices through the getSlices public method. I don't
>>> understand what makes the additional constructor necessary, could you
>>> clarify that for me?
>>>
>>> One thing that may make sense is making the SliceExecutor extensible.
>>> Currently it is package private, and I can see how users may want to
>>> provide their own implementation when it comes to handling rejections or
>>> executing on the caller thread in certain scenarios. Possibly even the
>>> task creation and the coordination of their execution could be
>>> moved to the SliceExecutor too.
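>>>
>>> To give an idea of the kind of customization I mean, here is a rough
>>> sketch, assuming SliceExecutor became public and exposed an overridable
>>> per-task hook (it does not today, so the names below are hypothetical):
>>>
>>>     class CallerRunsOnRejectionSliceExecutor extends SliceExecutor {
>>>         private final Executor executor;
>>>
>>>         CallerRunsOnRejectionSliceExecutor(Executor executor) {
>>>             super(executor);
>>>             this.executor = executor;
>>>         }
>>>
>>>         @Override
>>>         protected void processTask(Runnable task) {
>>>             try {
>>>                 executor.execute(task);
>>>             } catch (RejectedExecutionException e) {
>>>                 // instead of failing the search, run the rejected slice
>>>                 // on the caller thread
>>>                 task.run();
>>>             }
>>>         }
>>>     }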
>>>
>>> Cheers
>>> Luca
>>>
>>> On Fri, May 19, 2023, 03:27 SorabhApache <sor...@apache.org> wrote:
>>>
>>>> Hi All,
>>>>
>>>> For concurrent segment search, Lucene uses the *slices* method to
>>>> compute the work units (leaf slices) which can be processed concurrently.
>>>>
>>>> a) It calculates *slices* in the constructor of *IndexSearcher*
>>>> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L239>
>>>> with default thresholds for document count and segment counts.
>>>> b) Provides an implementation of *SliceExecutor* (i.e.
>>>> QueueSizeBasedExecutor)
>>>> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L1008>
>>>> based on the executor type, which applies backpressure in concurrent
>>>> execution using a limiting factor of 1.5 times the passed-in threadpool's
>>>> maxPoolSize.
>>>>
>>>> In OpenSearch, we have a search threadpool which serves search requests
>>>> for all the Lucene indices (or OpenSearch shards) assigned to a node.
>>>> Each node can get requests for some or all of the indices on that node.
>>>> I am exploring a mechanism to dynamically control the max slices for each
>>>> Lucene index search request. For example, search requests to some indices
>>>> on a node would get a max of 4 slices each and others 2 slices each. The
>>>> threadpool shared to execute these slices would then not apply any
>>>> limiting factor. In this model the top-level search threadpool limits the
>>>> number of active search requests, which in turn limits the number of work
>>>> units in the SliceExecutor threadpool.
>>>>
>>>> For this, the derived implementation of IndexSearcher could take an input
>>>> value in its constructor to control the slice count computation. However,
>>>> even though the slices method is protected, it is called from the
>>>> constructor of the base IndexSearcher class, which prevents the derived
>>>> class from using the passed-in input.
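>>>>
>>>> For example, today an override can only rely on hard-coded values,
>>>> roughly like this (just a sketch; the threshold values and the use of the
>>>> static slices helper are only for illustration):
>>>>
>>>>     @Override
>>>>     protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
>>>>         // the thresholds have to be constants here; they cannot come
>>>>         // from a field set in the derived class constructor, because
>>>>         // this method runs while the base IndexSearcher constructor is
>>>>         // still executing
>>>>         return IndexSearcher.slices(leaves, 250_000, 5);
>>>>     }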
>>>>
>>>> To achieve this I can think of the following ways (in order of
>>>> preference) and would like to submit a pull request. But first I wanted
>>>> to get some feedback on whether option 1 looks fine or whether I should
>>>> take some other approach.
>>>>
>>>> 1. Provide another constructor in IndexSearcher which takes 4 input
>>>> parameters (a usage sketch follows below, after option 2):
>>>>
>>>>     protected IndexSearcher(IndexReaderContext context, Executor executor,
>>>>         SliceExecutor sliceExecutor,
>>>>         Function<List<LeafReaderContext>, LeafSlice[]> sliceProvider)
>>>>
>>>> 2. Make the `leafSlices` member protected and non-final. After it is
>>>> initialized by IndexSearcher (using the default mechanism in Lucene), the
>>>> derived implementation can update it again if need be (e.g. based on some
>>>> input parameter to its own constructor). Also make the constructor with
>>>> the SliceExecutor input protected, so that a derived implementation can
>>>> provide its own SliceExecutor. This mechanism will involve redundant
>>>> computation of leafSlices.
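>>>>
>>>> Going back to option 1, a derived searcher could then pass its own slice
>>>> computation to the base class, for example (sketch only; the proposed
>>>> constructor, CustomIndexSearcher and computeSlices are placeholder names):
>>>>
>>>>     public class CustomIndexSearcher extends IndexSearcher {
>>>>         public CustomIndexSearcher(IndexReaderContext ctx, Executor executor,
>>>>                 SliceExecutor sliceExecutor, int maxSliceCount) {
>>>>             // computeSlices must be static: 'this' cannot be referenced
>>>>             // in the arguments passed to super()
>>>>             super(ctx, executor, sliceExecutor,
>>>>                     leaves -> computeSlices(leaves, maxSliceCount));
>>>>         }
>>>>
>>>>         private static LeafSlice[] computeSlices(
>>>>                 List<LeafReaderContext> leaves, int maxSliceCount) {
>>>>             // custom logic bounded by maxSliceCount would go here;
>>>>             // delegating to the static helper with placeholder thresholds
>>>>             // just to keep the sketch complete
>>>>             return IndexSearcher.slices(leaves, 250_000, 5);
>>>>         }
>>>>     }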
>>>>
>>>>
>>>> Thanks,
>>>> Sorabh
>>>>
>>>
