: I'm using pylucene to index documents and I'm interested in checking if a
: given document from the list A (that is going to be indexed) is already
: indexed. Can I do it?
FYI: PyLucene is not an Apache project; it has its own mailing lists
and documentation that you may want to consult...
http://pylucene.osafoundation.org/
I personally don't know anything about PyLucene, and I have no idea what
features it may add -- but that said, assuming it is a simple wrapper
around the Lucene-Java APIs and adds no extra functionality, you'll
need some way to identify a document to determine if it is in the index or
not ... essentially: you search for each document you have by some criteria
to see if it's there.
If your document space allows for a "unique key" on each document, just
make sure it is indexed ... if not, then compute something appropriate
(e.g. an MD5 sum of the content) and use that.
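
To make the "compute a key, then look it up" idea concrete, here is a rough
sketch in plain Python. It does not use any PyLucene calls: KeyIndex is a
hypothetical stand-in for the real index, and doc_key and
index_new_documents are names made up for illustration. With a real index I
would expect the contains() check to become a search on the key field (in
Lucene-Java terms, something like a TermQuery against an untokenized field),
but treat that as an assumption and check the PyLucene docs.

```python
import hashlib


def doc_key(text):
    """Return a hex MD5 digest of the document text, used as its unique key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class KeyIndex:
    """Hypothetical stand-in for the real index: it only remembers keys."""

    def __init__(self):
        self._keys = set()

    def contains(self, key):
        # In a real index this would be a lookup on the stored key field.
        return key in self._keys

    def add(self, key):
        self._keys.add(key)


def index_new_documents(index, documents):
    """Add only documents whose key is not yet indexed; return those added."""
    added = []
    for text in documents:
        key = doc_key(text)
        if index.contains(key):
            continue  # already indexed, skip it
        index.add(key)
        added.append(text)
    return added
```

Note that an MD5 of the content only catches exact duplicates: if a document
changes by even one byte it gets a new key and will be treated as new.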
-Hoss