This is a good idea, because sometimes it is useful to change the MergePolicy on the fly without reopening the IndexWriter. One example is https://issues.apache.org/jira/browse/LUCENE-5526

In my case, I would like to open an IndexWriter, set its merge policy to UpgradeIndexMergePolicy, force a merge to upgrade all segments, and then proceed with normal indexing. Currently you have to close the IndexWriter to do this, which is bad in multithreaded environments: for example, you start an index upgrade after installing a new version of your favourite Solr/ES/... server, but you still need to index documents in parallel (real-time system), so you can afford only very little downtime. The proposal in the above issue is to allow passing a MergePolicy to forceMerge().
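
To make the "live setting" idea concrete, here is a minimal sketch in plain Java. It is not Lucene code: ToyMergePolicy, ToyLiveConfig and ToyWriter are hypothetical stand-ins invented for illustration, and the "merge" just collapses a segment counter. The only point it shows is a writer that pulls the policy from its config on every use instead of holding on to it, so the policy can be swapped while the writer stays open.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a merge policy: decides whether to merge now. */
interface ToyMergePolicy {
    boolean shouldMerge(int segmentCount);
}

/** Hypothetical "live" config: the policy may be replaced at any time. */
final class ToyLiveConfig {
    private final AtomicReference<ToyMergePolicy> mergePolicy;

    ToyLiveConfig(ToyMergePolicy initial) {
        this.mergePolicy = new AtomicReference<>(Objects.requireNonNull(initial));
    }

    ToyMergePolicy getMergePolicy() {
        return mergePolicy.get();
    }

    void setMergePolicy(ToyMergePolicy mp) {
        mergePolicy.set(Objects.requireNonNull(mp));
    }
}

/** Hypothetical writer that never caches the policy in a field of its own. */
final class ToyWriter {
    private final ToyLiveConfig config;
    private int segments;

    ToyWriter(ToyLiveConfig config) {
        this.config = config;
    }

    /** Adds one segment, then asks the *current* policy whether to merge. */
    synchronized void flushSegment() {
        segments++;
        if (config.getMergePolicy().shouldMerge(segments)) {
            segments = 1; // toy merge: everything collapses into one segment
        }
    }

    synchronized int segmentCount() {
        return segments;
    }
}

final class LiveMergePolicyDemo {
    public static void main(String[] args) {
        // Batch phase: a policy that never merges.
        ToyLiveConfig config = new ToyLiveConfig(count -> false);
        ToyWriter writer = new ToyWriter(config);
        for (int i = 0; i < 5; i++) {
            writer.flushSegment();
        }
        System.out.println(writer.segmentCount()); // prints 5

        // Switch policy without closing the writer.
        config.setMergePolicy(count -> count >= 3);
        writer.flushSegment();
        System.out.println(writer.segmentCount()); // prints 1
    }
}
```

Whether the real IndexWriter can be changed this way depends on how many places hold a reference to the policy, so treat this only as an illustration of the shape of the change.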

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Shai Erera [mailto:ser...@gmail.com]
> Sent: Thursday, August 07, 2014 4:11 PM
> To: java-user@lucene.apache.org
> Subject: Re: improve indexing speed with nomergepolicy
>
> Yes, currently an MP isn't a "live" setting on IndexWriter, meaning you
> pass it at construction time and don't change it afterwards. I wonder if
> after LUCENE-5711 we can move MergePolicy to LiveIndexWriterConfig and fix
> IndexWriter to not hold on to it, but rather pull it from the config.
>
> Not sure what others think about it.
>
> Shai
>
>
> On Thu, Aug 7, 2014 at 5:05 PM, Jon Stewart
> <j...@lightboxtechnologies.com> wrote:
>
> > Related, how does one change the MergePolicy on an IndexWriter (e.g.,
> > use NoMergePolicy during batch indexing, then change to something
> > better once finished with batch)? It looks like the MergePolicy is set
> > through IndexWriterConfig but I don't see a way to update an IWC on an
> > IW.
> >
> > Thanks,
> >
> > Jon
> >
> >
> > On Thu, Aug 7, 2014 at 7:37 AM, Shai Erera <ser...@gmail.com> wrote:
> > > Using NoMergePolicy for online indexes is usually not recommended.
> > > You want to use NoMP in case where you build an index in a batch
> > > job, then in the end before the index is "published" you run a
> > > forceMerge or maybeMerge (with a real MergePolicy).
> > >
> > > For online indexes, i.e. indexes that are being searched while they
> > > are updated, if you use NoMP you will accumulate many segments in
> > > the index. This means higher resources consumption overall: file
> > > handles, RAM, potentially disk space, and usually results in slower
> > > searches.
> > >
> > > You may want to tweak the default MP's settings though, to not kick
> > > off a merge unless there are a large number of segments in the
> > > index. E.g. the default MP merges segments when there are 10 at the
> > > same level (i.e. roughly the same size). You can increase that.
> > >
> > > Also, do you use NRTCachingDirectory? It's usually recommended for
> > > NRT, even with default MP, since the tiny segments are merged
> > > in-memory, and your NRT reopens don't result in flushing new
> > > segments to disk.
> > >
> > > Shai
> > >
> > >
> > > On Thu, Aug 7, 2014 at 1:14 PM, Sascha Janz <sascha.j...@gmx.net>
> > > wrote:
> > >
> > >> hi,
> > >>
> > >> i try to speed up our indexing process. we use SearcherManager with
> > >> applydeletes to get near real time Reader.
> > >>
> > >> we have not really "much" incoming documents, but the documents
> > >> must be updated from time to time and the amount of documents to be
> > >> updated could be quite large.
> > >>
> > >> i tried some tests with NoMergePolicy and the indexing process was
> > >> 25 % faster.
> > >>
> > >> so i think of a change in our code, to use NoMergePolicy for a
> > >> specific time interval, when users are active and do a
> > >> forceMerge(20) every night, which last about 2 - 5 minutes.
> > >>
> > >> is this a good idea? or will i perhaps get into trouble?
> > >>
> > >> Sascha
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >>
> >
> > --
> > Jon Stewart, Principal
> > (646) 719-0317 | j...@lightboxtechnologies.com | Arlington, VA