Hi -
Using Lucene 2.9.3, I'm indexing the metadata in image files. For each image
("document" in Lucene), I have 2 additional special fields: "FILE-PATH"
(containing the full path of the file) and "DIR-PATH" (containing the full path
of the directory the file is in).
The FILE-PATH Field is created only once like:
    private final Field m_fieldFilePath = new Field(
        "FILE-PATH", "INIT", Field.Store.YES, Field.Index.NOT_ANALYZED
    );
and reused; the DIR-PATH Field is created once per document like:
    new Field(
        "DIR-PATH", file.getParentFile().getAbsolutePath(),
        Field.Store.NO, Field.Index.NOT_ANALYZED
    )
(The DIR-PATH Field is created once per document because it's part of
indexing the rest of the image metadata and isn't a special case like
FILE-PATH. I don't believe this is relevant to the problem at hand, however.)
If an image file (or an entire directory of image files) gets deleted, I need
to delete it (them) from the index. When deleting a single image, I could do:
    Term fileTerm = new Term( "FILE-PATH", file.getAbsolutePath() );
    writer.deleteDocuments( new TermQuery( fileTerm ) );
When deleting an entire directory of images, I could do:
    Term dirTerm = new Term( "DIR-PATH", file.getAbsolutePath() );
    writer.deleteDocuments( new TermQuery( dirTerm ) );
However, at the time of deletion, I don't know whether "file" refers to a
single image file or to a directory of image files. I can't call file.isFile()
or file.isDirectory() because "file" no longer exists (it was deleted). So to
cover both cases, I do:
    Query[] queries = new Query[]{
        new TermQuery( fileTerm ),
        new TermQuery( dirTerm )
    };
    writer.deleteDocuments( queries );
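
The reason neither check helps can be reproduced with java.io.File alone. The
sketch below is only an illustration, not part of the indexing code:
DeletedPathProbe, describe() and deletedTempFile() are hypothetical names. It
shows that once a path has been deleted, both isFile() and isDirectory()
return false, while getAbsolutePath() still returns the path string needed to
build either Term.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only.
final class DeletedPathProbe {

    /** Classifies a path using only what the filesystem reports right now. */
    static String describe(File file) {
        if (file.isFile()) return "file";
        if (file.isDirectory()) return "directory";
        return "unknown"; // deleted (or never existed): both checks are false
    }

    /** Creates a temporary file, deletes it, and returns the stale File object. */
    static File deletedTempFile() {
        try {
            File f = File.createTempFile("image", ".jpg");
            if (!f.delete()) {
                throw new IOException("could not delete " + f);
            }
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File gone = deletedTempFile();
        // The path can no longer be classified as a file or a directory...
        System.out.println(describe(gone));
        // ...but the absolute path string is still available for a Term.
        System.out.println(gone.getAbsolutePath());
    }
}
```

Since the path string is all that survives the deletion, trying it against
both FILE-PATH and DIR-PATH is the only option left.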
I have non-Lucene code that monitors the filesystem for changes. On Mac OS X,
I can only get directory-level change notifications, so if a file is deleted
from a directory, I get a notification that the directory has changed. I then
delete all the documents in that directory and re-add them.
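
The intended effect of deleting by either path field can be modeled in plain
Java, without Lucene. This is only a sketch under stated assumptions:
ExactTermModel and its methods are hypothetical names, the paths are made up,
and the model relies on the fact that a NOT_ANALYZED field is indexed as a
single term, so a TermQuery matches only when the stored string and the query
string are identical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the index; not Lucene code.
final class ExactTermModel {

    /** One indexed image, with both path values kept as whole strings. */
    private static final class Doc {
        final String filePath;
        final String dirPath;

        Doc(String filePath, String dirPath) {
            this.filePath = filePath;
            this.dirPath = dirPath;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    void add(String filePath, String dirPath) {
        docs.add(new Doc(filePath, dirPath));
    }

    /**
     * Removes every document whose FILE-PATH or DIR-PATH equals the given
     * path exactly, and returns how many documents were removed.
     */
    int deleteByEitherTerm(String path) {
        int before = docs.size();
        docs.removeIf(d -> d.filePath.equals(path) || d.dirPath.equals(path));
        return before - docs.size();
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ExactTermModel index = new ExactTermModel();
        index.add("/photos/trip/a.jpg", "/photos/trip");
        index.add("/photos/trip/b.jpg", "/photos/trip");
        index.add("/photos/home/c.jpg", "/photos/home");

        // Trailing slash: a different string, so nothing matches.
        System.out.println(index.deleteByEitherTerm("/photos/trip/"));      // prints 0
        // Directory path: matches DIR-PATH of both images in that directory.
        System.out.println(index.deleteByEitherTerm("/photos/trip"));       // prints 2
        // File path: matches FILE-PATH of a single image.
        System.out.println(index.deleteByEitherTerm("/photos/home/c.jpg")); // prints 1
    }
}
```

In this model a delete removes documents only when the path passed at delete
time is character-for-character the same as the one stored at index time.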
However (and here's the problem), the deletes never happen. If I delete a file
from a directory, the directory appears to be unindexed and reindexed, but a
query for that image file still returns a result. It's as if the delete
never happened.
Why not?
Additional information: I create and close a new IndexWriter for the delete.
Even if I quit the application, relaunch, and run the query, the result still
shows up (so the problem isn't that the current reader fails to see the
deletion).
- Paul