> I think the doc is correct

Wait, one of the docs is wrong. Going by what you write, I guess it's FlushPolicy, since a new segment is not flushed because of this setting? Or perhaps the docs should clarify that the deletes are "flushed" in the sense of being applied to existing segments?

I disabled reader pooling and I still don't see .del files. But I think that's explained by the fact that there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment because of delTerms?

Shai

On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> First off, it's bad that you don't see .del files when
> conf.setMaxBufferedDeleteTerms is 1.
>
> But, it could be that newIndexWriterConfig turned on readerPooling
> which would mean the deletes are held in the SegmentReader and not
> flushed to disk. Can you make sure that's off?
>
> Second off, I think the doc is correct: a segment will not be flushed;
> rather, new .del files should appear against older segments.
>
> And yes, if RAM usage of the buffered delete Terms/Queries is too high,
> then a segment is flushed along with the deletes being applied
> (creating the .del files).
>
> I think buffered delete Queries are not counted towards
> setMaxBufferedDeleteTerms; so they are only flushed by RAM usage
> (rough rough estimate) or by other ops (merging, NRT reopen, commit,
> etc.).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera <ser...@gmail.com> wrote:
> > Hi
> >
> > I'm a little confused about FlushPolicy and
> > IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
> > jdocs say:
> >
> >  * Segments are traditionally flushed by:
> >  * <ul>
> >  * <li>RAM consumption - configured via
> >  ...
> >  * <li>Number of buffered delete terms/queries - configured via
> >  * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li>
> >  * </ul>
> >
> > Yet IWC.setMaxBufDelTerm says:
> >
> >  NOTE: This setting won't trigger a segment flush.
> >
> > And FlushByRamOrCountPolicy says:
> >
> >  * <li>{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes
> >  * based on the global number of buffered delete terms iff
> >  * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li>
> >
> > Confused, I wrote a short unit test:
> >
> > public void testMaxBufDelTerm() throws Exception {
> >   Directory dir = new RAMDirectory();
> >   IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
> >       new MockAnalyzer(random()));
> >   // Apply deletes after every buffered delete term; flush docs by count only.
> >   conf.setMaxBufferedDeleteTerms(1);
> >   conf.setMaxBufferedDocs(10);
> >   conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
> >   conf.setInfoStream(new PrintStreamInfoStream(System.out));
> >   IndexWriter writer = new IndexWriter(dir, conf);
> >   int numDocs = 4; // below maxBufferedDocs, so no segment is flushed
> >   for (int i = 0; i < numDocs; i++) {
> >     Document doc = new Document();
> >     doc.add(new StringField("id", "doc-" + i, Store.NO));
> >     writer.addDocument(doc);
> >   }
> >
> >   System.out.println("before delete");
> >   for (String f : dir.listAll()) System.out.println(f);
> >
> >   writer.deleteDocuments(new Term("id", "doc-0"));
> >   writer.deleteDocuments(new Term("id", "doc-1"));
> >
> >   System.out.println("\nafter delete");
> >   for (String f : dir.listAll()) System.out.println(f);
> >
> >   writer.close();
> >   dir.close();
> > }
> >
> > When InfoStream is turned on, I can see messages regarding terms flushing
> > (vs if I comment out the .setMaxBufDelTerm line), so I know this setting
> > takes effect.
> > Yet both before and after the delete operations, dir.listAll() returns
> > only the fdx and fdt files.
> >
> > So is this a bug that a segment isn't flushed? If not (and I'm ok with
> > that), is it a documentation inconsistency?
> > Strangely, I think, if the delTerms RAM accounting exhausts the
> > max-RAM-buffer size, a new segment will be flushed?
> >
> > Slightly unrelated to FlushPolicy, but do I understand correctly that
> > maxBufDelTerm does not apply to delete-by-query operations?
> > BufferedDeletes doesn't increment any counter on addQuery(), so is it
> > correct to assume that if I only delete-by-query, this setting has no
> > effect?
> > And the delete queries are buffered until the next segment is flushed
> > due to other operations (constraints, commit, NRT-reopen)?
> >
> > Shai
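
For readers following the thread, the behaviour described above can be sketched as a toy model in plain Java. This is not Lucene code: the class `ToyDeleteBuffer` and all of its methods are hypothetical, and the model only encodes what the thread says or assumes, namely that delete terms count towards the threshold, delete queries do not, and reaching the threshold applies deletes to already-flushed segments without flushing a new one. It also assumes every existing segment gets one .del file per apply, and it does not track which documents a delete matches, so deletes against documents still in the RAM buffer are not modeled.

```java
/** Toy model of the behaviour discussed in this thread. NOT Lucene code. */
public class ToyDeleteBuffer {
    private final int maxBufferedDeleteTerms; // stand-in for setMaxBufferedDeleteTerms
    private int bufferedDocs;          // documents still in the RAM buffer
    private int segments;              // segments already flushed
    private int bufferedDeleteTerms;   // counted towards the threshold
    private int bufferedDeleteQueries; // assumed NOT counted towards the threshold
    private int applyCount;            // how many times buffered deletes were applied
    private int delFilesWritten;       // .del files written against flushed segments

    public ToyDeleteBuffer(int maxBufferedDeleteTerms) {
        this.maxBufferedDeleteTerms = maxBufferedDeleteTerms;
    }

    public void addDocument() {
        bufferedDocs++;
    }

    /** Buffers a delete-by-term; applies deletes once the threshold is reached. */
    public void deleteByTerm() {
        bufferedDeleteTerms++;
        if (bufferedDeleteTerms >= maxBufferedDeleteTerms) {
            applyDeletes(); // note: no segment is flushed here
        }
    }

    /** Buffers a delete-by-query; never triggers anything by itself in this model. */
    public void deleteByQuery() {
        bufferedDeleteQueries++;
    }

    /** Stand-in for "other ops" (commit etc.): flushes docs, then applies deletes. */
    public void commit() {
        if (bufferedDocs > 0) {
            segments++;
            bufferedDocs = 0;
        }
        if (bufferedDeleteTerms + bufferedDeleteQueries > 0) {
            applyDeletes();
        }
    }

    /** Writes one .del file per existing segment (simplifying assumption). */
    private void applyDeletes() {
        applyCount++;
        delFilesWritten += segments; // zero segments means nothing is written
        bufferedDeleteTerms = 0;
        bufferedDeleteQueries = 0;
    }

    public int getSegments() { return segments; }
    public int getApplyCount() { return applyCount; }
    public int getDelFilesWritten() { return delFilesWritten; }

    public static void main(String[] args) {
        // Mirrors the unit test in the thread: 4 docs, threshold 1, 2 term deletes.
        ToyDeleteBuffer buf = new ToyDeleteBuffer(1);
        for (int i = 0; i < 4; i++) {
            buf.addDocument();
        }
        buf.deleteByTerm();
        buf.deleteByTerm();
        System.out.println("applies=" + buf.getApplyCount()
            + " segments=" + buf.getSegments()
            + " delFiles=" + buf.getDelFilesWritten());
    }
}
```

In this model the unit test's scenario triggers the delete-apply path twice but finds no flushed segment to write against, which would be consistent with seeing InfoStream messages and no .del files. Calling `commit()` before the deletes gives the apply step a segment to write against.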