[ 
https://issues.apache.org/jira/browse/LUCENE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549495#comment-13549495
 ] 

Adrien Grand commented on LUCENE-4669:
--------------------------------------

Hi Miguel,

bq. One more question: what's the best way to iterate over all documents in an 
index?

Retrieving stored fields for all documents in an index is something Lucene is 
bad at (it deliberately does not optimize for this use case), so you should 
try to avoid doing it. If you really need to, iterate over all doc IDs from 0 
up to (but not including) ir.maxDoc(), skip deleted documents (those for which 
liveDocs != null && !liveDocs.get(docID)) and call 
IndexReader.document(docID) for the rest.
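
The loop described above can be sketched in plain Java without Lucene on the 
classpath. Everything here is a stand-in chosen for illustration, not Lucene 
API: the class LiveDocsSketch and the method livePaths are hypothetical, a 
List<String> plays the role of the stored documents (its size stands in for 
ir.maxDoc()), and a java.util.BitSet plays the role of liveDocs, with null 
meaning "no deletions".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class LiveDocsSketch {

    // Hypothetical helper: returns the stored value of every live document.
    // storedPaths.size() stands in for ir.maxDoc(); liveDocs may be null,
    // which (as in the comment above) means nothing has been deleted.
    static List<String> livePaths(List<String> storedPaths, BitSet liveDocs) {
        int maxDoc = storedPaths.size();
        List<String> result = new ArrayList<String>();
        for (int docID = 0; docID < maxDoc; docID++) {
            // Skip deleted documents.
            if (liveDocs != null && !liveDocs.get(docID)) {
                continue;
            }
            // Stand-in for IndexReader.document(docID).
            result.add(storedPaths.get(docID));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("a", "b", "c");
        BitSet live = new BitSet(3);
        live.set(1); // "b" is live
        live.set(2); // "c" is live; doc ID 0 ("a") is deleted
        System.out.println(livePaths(stored, live)); // prints [b, c]
    }
}
```

The upper bound is the point of the sketch: doc IDs of deleted documents stay 
allocated, so a loop bounded by the number of live documents (2 in this 
example) would visit doc IDs 0 and 1 only and never reach doc ID 2.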

Please ask questions on the user mailing-list instead of JIRA in the future.
                
> Document wrongly deleted from index
> -----------------------------------
>
>                 Key: LUCENE-4669
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4669
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0
>         Environment: OS = Mac OS X 10.7.5
> Java = JVM 1.6
>            Reporter: Miguel Ferreira
>
> I'm trying to implement document deletion from an index.
> If I create an index with three documents (A, B and C) and then try to delete 
> A, A gets marked as deleted but C is removed from the index. I've tried this 
> with different numbers of documents and saw that it is always the last 
> document that is removed.
> When I run the example unit test code below I get this output:
> {code}
> Before delete
> Found 3 documents
> Document at = 0; isDeleted = false; path = a; 
> Document at = 1; isDeleted = false; path = b; 
> Document at = 2; isDeleted = false; path = c; 
> After delete
> Found 2 documents
> Document at = 0; isDeleted = true; path = a; 
> Document at = 1; isDeleted = false; path = b; 
> {code}
> Example unit test:
> {code:title=ExampleUnitTest.java}
>     @Test
>     public void delete() throws Exception {
>         File indexDir = FileUtils.createTempDir();
>         IndexWriter writer = new IndexWriter(new NIOFSDirectory(indexDir),
>             new IndexWriterConfig(Version.LUCENE_40,
>                 new StandardAnalyzer(Version.LUCENE_40)));
>         Document doc = new Document();
>         String fieldName = "path";
>         doc.add(new StringField(fieldName, "a", Store.YES));
>         writer.addDocument(doc);
>         doc = new Document();
>         doc.add(new StringField(fieldName, "b", Store.YES));
>         writer.addDocument(doc);
>         doc = new Document();
>         doc.add(new StringField(fieldName, "c", Store.YES));
>         writer.addDocument(doc);
>         writer.commit();
>         System.out.println("Before delete");
>         print(indexDir);
>         writer.deleteDocuments(new Term(fieldName, "a"));
>         writer.commit();
>         System.out.println("After delete");
>         print(indexDir);
>     }
>     public static void print(File indexDirectory) throws IOException {
>         DirectoryReader reader = DirectoryReader.open(
>             new NIOFSDirectory(indexDirectory));
>         Bits liveDocs = MultiFields.getLiveDocs(reader);
>         int numDocs = reader.numDocs();
>         System.out.println("Found " + numDocs + " documents");
>         for (int i = 0; i < numDocs; i++) {
>             Document document = reader.document(i);
>             StringBuffer sb = new StringBuffer();
>             sb.append("Document at = ").append(i);
>             sb.append("; isDeleted = ")
>                 .append(liveDocs != null ? !liveDocs.get(i) : false)
>                 .append("; ");
>             for (IndexableField field : document.getFields()) {
>                 String fieldName = field.name();
>                 for (String value : document.getValues(fieldName)) {
>                     sb.append(fieldName).append(" = ")
>                         .append(value).append("; ");
>                 }
>             }
>             System.out.println(sb.toString());
>         }
>     }
> {code}