Does IndexReader#docFreq should be aware of deleted documents, if the index has not been optimized ?
If I have two documents with same term T: I call docFreq(T) and it returns 2.
I delete the first document.
I call docFreq(T) again and it returns 2.
In our cases, indexes are very big and it costs to optimize them.
Here is a code snippet pointing out the problem :
---------------------------------------
public class Test { public static void main(String[] args) {
String tmp = System.getProperty("java.io.tmpdir")+File.separator+"tst"; try {
IndexWriter wri = new IndexWriter(tmp, new WhitespaceAnalyzer(),true);
Document doc =new Document();
doc.add(Field.Text("field1","value"));
doc.add(Field.Text("field2","value2"));
wri.addDocument(doc);
doc = new Document();
doc.add(Field.Text("field1","value"));
doc.add(Field.Text("field2","value3"));
wri.addDocument(doc);
wri.optimize();
wri.close(); IndexReader reader = IndexReader.open(tmp);
System.out.println(reader.docFreq(new Term("field1","value")));
reader.delete(0);
reader.close();
reader = IndexReader.open(tmp);
System.out.println(reader.docFreq(new Term("field1","value")));
}
catch (IOException e) {
e.printStackTrace();
}} } ---------------------------------------
Thanks. Pascal.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
