If you want an exact number of segments, create 64 separate indexes and forceMerge each one down to a single segment. After that, use MultiReader to create a view over all of the separate indexes. A MultiReader's contents are always flattened into a list of those 64 single-segment indexes.
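Since the document count is known before indexing starts, the split into 64 indexes can be planned up front. The following is only a minimal sketch of that planning step, in plain Java without any Lucene calls. The class `ShardPlan` and its method `boundaries` are hypothetical names made up for illustration, and the contiguous-range split is just one possible assignment, not something Lucene prescribes.

```java
/** Hypothetical helper (not part of Lucene): plans how a known doc count is split over N indexes. */
final class ShardPlan {

    /**
     * Returns shardCount + 1 boundaries. Shard k receives the document
     * ordinals in the half-open range [b[k], b[k + 1]). Shard sizes differ
     * by at most one document.
     */
    static long[] boundaries(long totalDocs, int shardCount) {
        if (totalDocs < 0 || shardCount <= 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0 and shardCount > 0");
        }
        long base = totalDocs / shardCount;      // minimum size of every shard
        long remainder = totalDocs % shardCount; // the first 'remainder' shards get one extra doc
        long[] b = new long[shardCount + 1];
        for (int shard = 0; shard < shardCount; shard++) {
            long size = base + (shard < remainder ? 1 : 0);
            b[shard + 1] = b[shard] + size;
        }
        return b;
    }

    public static void main(String[] args) {
        // Assumed numbers from the question: ~1 billion docs, 64 search threads.
        long[] b = boundaries(1_000_000_000L, 64);
        for (int shard = 0; shard < b.length - 1; shard++) {
            // Each range would be indexed by its own writer into its own directory.
            System.out.printf("shard %d: docs [%d, %d)%n", shard, b[shard], b[shard + 1]);
        }
    }
}
```

Under this plan, each range would go to its own IndexWriter on its own directory, each index would be forceMerged to one segment, and the 64 resulting readers would be wrapped in a single MultiReader for searching.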
But keep in mind that this should only ever be done with *static* indexes. As soon as you have updates, both forceMerge in general and splitting the index like this are a bad idea. Parallelization should normally come from multiple queries running in parallel; you shouldn't force Lucene to run a single query over so many indexes.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Alex K <aklib...@gmail.com>
> Sent: Monday, July 5, 2021 4:04 AM
> To: java-user@lucene.apache.org
> Subject: Control the number of segments without using forceMerge.
>
> Hi all,
>
> I'm trying to figure out if there is a way to control the number of
> segments in an index without explicitly calling forceMerge.
>
> My use-case looks like this: I need to index a static dataset of ~1
> billion documents. I know the exact number of docs before indexing starts.
> I know the VM where this index is searched has 64 threads. I'd like to end
> up with exactly 64 segments, so I can search them in a parallelized fashion.
>
> I know that I could call forceMerge(64), but this takes an extremely long
> time.
>
> Is there a straightforward way to ensure that I end up with 64 segments
> without force-merging after adding all of the documents?
>
> Thanks in advance for any tips
>
> Alex Klibisz