[ https://issues.apache.org/jira/browse/LUCENE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874373#action_12874373 ]
Trejkaz commented on LUCENE-2348:
---------------------------------

I attempted to write a test, but it fails by matching 0 documents instead of the 2 I would have expected. Here is the code:

{code:java}
    @Test
    public void testDuplicateFilterAcrossSegments() throws Exception {
        // Two separate single-document indexes, each holding a doc with the same "id".
        RAMDirectory index1Dir = new RAMDirectory();
        addDoc(index1Dir);
        RAMDirectory index2Dir = new RAMDirectory();
        addDoc(index2Dir);

        // Combine them so the searcher sees one sub-reader per index.
        IndexReader reader1 = IndexReader.open(index1Dir, true);
        IndexReader reader2 = IndexReader.open(index2Dir, true);
        IndexReader multi = new MultiReader(new IndexReader[] { reader1, reader2 });
        IndexSearcher searcher = new IndexSearcher(multi);

        TopDocs docs;

        docs = searcher.search(new MatchAllDocsQuery(), null, 10);
        assertEquals("Should only be two hits without the filter (just checking)",
                     2, docs.totalHits);

        docs = searcher.search(new MatchAllDocsQuery(), new DuplicateFilter("id"), 10);
        assertEquals("Should only be one hit because the second was a duplicate",
                     1, docs.totalHits);
    }

    private void addDoc(Directory dir) throws IOException {
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true,
                                             IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            Document doc = new Document();
            doc.add(new Field("id", "1", Field.Store.YES, Field.Index.NO));
            writer.addDocument(doc);
            writer.commit();
        } finally {
            writer.close();
        }
    }
{code}

> DuplicateFilter incorrectly handles multiple calls to getDocIdSet for segment readers
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2348
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2348
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/*
>    Affects Versions: 2.9.2
>            Reporter: Trejkaz
>
> DuplicateFilter currently works by building a single doc ID set, without
> taking into account that getDocIdSet() will be called once per segment and
> only with each segment's local reader.
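
For anyone trying to picture the per-segment problem described in the issue, here is a rough plain-Java sketch that does not use Lucene at all. Everything in it is hypothetical (the class name SegmentDedupSketch and the methods perSegmentOnly, acrossSegments and countHits are made up for illustration), and each segment is modelled simply as an array of key values indexed by segment-local doc ID. It only contrasts deduplicating inside each segment in isolation with deduplicating while remembering keys already seen in earlier segments; it is not how DuplicateFilter is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Lucene code: a "segment" is an array of key values,
// and the array index plays the role of the segment-local doc ID.
public class SegmentDedupSketch {

    // Keeps the first occurrence of each key within ONE segment.
    // This is all a filter can do if it only ever looks at the local reader.
    static BitSet perSegmentOnly(String[] segmentKeys) {
        BitSet keep = new BitSet(segmentKeys.length);
        Set<String> seen = new HashSet<>();
        for (int doc = 0; doc < segmentKeys.length; doc++) {
            if (seen.add(segmentKeys[doc])) {
                keep.set(doc);
            }
        }
        return keep;
    }

    // Keeps the first occurrence of each key across ALL segments, but still
    // returns one bit set per segment, expressed in segment-local doc IDs.
    static List<BitSet> acrossSegments(List<String[]> segments) {
        List<BitSet> result = new ArrayList<>();
        Set<String> seen = new HashSet<>(); // shared between segments
        for (String[] keys : segments) {
            BitSet keep = new BitSet(keys.length);
            for (int doc = 0; doc < keys.length; doc++) {
                if (seen.add(keys[doc])) {
                    keep.set(doc);
                }
            }
            result.add(keep);
        }
        return result;
    }

    // Total number of documents that pass the filter over all segments.
    static int countHits(List<BitSet> perSegmentSets) {
        int hits = 0;
        for (BitSet set : perSegmentSets) {
            hits += set.cardinality();
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two segments, each with one document whose key is "1".
        List<String[]> segments =
                Arrays.asList(new String[] { "1" }, new String[] { "1" });

        List<BitSet> isolated = new ArrayList<>();
        for (String[] keys : segments) {
            isolated.add(perSegmentOnly(keys));
        }

        System.out.println("per-segment only: " + countHits(isolated));
        System.out.println("across segments:  " + countHits(acrossSegments(segments)));
    }
}
```

With the two one-document segments above, the isolated version lets both documents through, while the version that shares its "seen" set lets only the first one through. That is the difference between the 2 hits I expected from the current behaviour and the 1 hit the test asserts.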