Hi Luca, Thanks for the suggestion. Let me explore more on it and I will get back. But I agree making the slices non-final will require consumers to ensure slices are not mutated after the construction time and is not preferred.
Thanks, Sorabh On Tue, May 23, 2023 at 3:14 AM Luca Cavanna <java...@apache.org> wrote: > Hi Sorabh, > thanks for explaining. I see what you mean, it does get awkward to > customize how slices are created. If the plan is to make SliceExecutor > public and extensible, would it make sense to figure out what its public > methods should be, and include the slice creation in there so it is > detached from the IndexSearcher? I think that in practice there is a > correlation between how slices are created and how they are executed (e.g. > you may want to limit the number of slices created and execute each one on > a separate thread, or possibly have more slices than threads, and reuse the > same thread to execute multiple slices). Moving the slice creation to the > component responsible for their execution would give the desired > flexibility for users to customize their logic without having to poke with > IndexSearcher constructors? What do you think? I personally prefer this > option over adding a function argument to the existing searcher > constructor, or making the slices non-final which affects the inter-segment > concurrency design (slices should not be mutable?). > > Cheers > Luca > > On Fri, May 19, 2023 at 9:02 AM SorabhApache <sor...@apache.org> wrote: > >> Hi Luca, >> Thanks for your reply. Sharing an example below for clarity. >> >> >> >> >> >> >> >> >> *public class CustomIndexSearcher extends IndexSearcher { public >> CustomIndexSearcher(IndexReader reader, Executor executor, int >> maxSliceCount) { super(reader, executor); } >> @Override protected LeafSlice[] slices(List<LeafReaderContext> >> leaves) {* >> * // cannot use maxSliceCount here to control custom >> logic as this is called from constructor of super class* >> * // I want to use parameter[s] in the constructor input >> to control this slice computation* >> * }* >> *}* >> >> Yes, the SliceExecutor class will become public. Will also need a >> constructor in IndexSearcher which can take an implementation from the >> extensions. >> >> Thanks, >> Sorabh >> >> >> On Thu, May 18, 2023 at 11:40 PM Luca Cavanna <l...@elastic.co.invalid> >> wrote: >> >>> Hi Sorabh, >>> You'll want to override the protected slices method to include your >>> custom logic for creating the leaf slices. Your IndexSearcher extension can >>> also retrieve the slices through the getSlices public method. I don't >>> understand what makes the additional constructor necessary, could you >>> clarify that for me? >>> >>> One thing that may make sense to do is making the SliceExecutor >>> extensible. Currently it is package private, and I can see how users may >>> want to provide their own implementation when it comes to handling >>> rejections, executing on the caller thread in certain scenarios. Possibly >>> even the task creation, and the coordination of their execution could be >>> moved to the SliceExecutor too. >>> >>> Cheers >>> Luca >>> >>> On Fri, May 19, 2023, 03:27 SorabhApache <sor...@apache.org> wrote: >>> >>>> Hi All, >>>> >>>> For concurrent segment search, lucene uses the *slices* method to >>>> compute the number of work units which can be processed concurrently. >>>> >>>> a) It calculates *slices* in the constructor of *IndexSearcher* >>>> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L239> >>>> with default thresholds for document count and segment counts. >>>> b) Provides an implementation of *SliceExecutor* (i.e. >>>> QueueSizeBasedExecutor) >>>> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L1008> >>>> based on executor type which applies the backpressure in concurrent >>>> execution based on a limiting factor of 1.5 times the passed in threadpool >>>> maxPoolSize. >>>> >>>> In OpenSearch, we have a search threadpool which serves the search >>>> request to all the lucene indices (or OpenSearch shards) assigned to a >>>> node. Each node can get the requests to some or all the indices on that >>>> node. >>>> I am exploring a mechanism such that I can dynamically control the max >>>> slices for each lucene index search request. For example: search requests >>>> to some indices on that node to have max 4 slices each and others to have 2 >>>> slices each. Then the threadpool shared to execute these slices does not >>>> have any limiting factor. In this model the top level search threadpool >>>> will limit the number of active search requests which will limit the number >>>> of work units in the SliceExecutor threadpool. >>>> >>>> For this the derived implementation of IndexSearcher can get an input >>>> value in the constructor to control the slice count computation. Even >>>> though the slice method is protected it gets called from the constructor of >>>> base IndexSearcher class which prevents the derived class from using the >>>> passed in input. >>>> >>>> To achieve this I can think of the following ways (in order of >>>> preference) and would like to submit a pull request for it. But I wanted to >>>> get some feedback if option 1 looks fine or take some other approach. >>>> >>>> 1. Provide another constructor in IndexSearcher which takes in 4 input >>>> parameters: >>>> protected IndexSearcher(IndexReaderContext context, Executor >>>> executor, SliceExecutor sliceExecutor, Function<List<LeafReaderContext>, >>>> LeafSlice[]> sliceProvider) >>>> >>>> 2. Make the `leafSlices` member protected and non final. After it is >>>> initialized by the IndexSearcher (using default mechanism in lucene), the >>>> derived implementation can again update it if need be (like based on some >>>> input parameter to its own constructor). Also make the constructor with >>>> SliceExecutor input protected such that derived implementation can provide >>>> its own implementation of SliceExecutor. This mechanism will have redundant >>>> computation of leafSlices. >>>> >>>> >>>> Thanks, >>>> Sorabh >>>> >>>