I'd like to 'update' a single Document in a Lucene index. In practice, this
'update' is actually just a removal of a single TermPosition for a given Term
for a given doc Id.
I don't think this is currently possible, but would it be easy to change Lucene
to support this type of usage?
The reason for this is to optimise my index usage. I'm using Lucene to index
arbitrary data sets. In some data sets, however, each Document is indexed once
for each user who has an interest in the document. For example, with mail data,
a mail item (with a single recipient) is stored as two Documents, once with the
'user' field set to the sender's user Id and again with the 'user' field set to
the recipient's user Id. Searches just filter mail for a given user by the
'user' field.
When one of those users deletes the mail, the Document carrying that user's
'user' field is simply deleted. One of the original reasons for doing this was
to enable horizontal partitioning of the index. This works nicely, but of
course the index is bigger than necessary and the number of term positions is
at least double what is necessary.
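
To make the current scheme concrete, here is a toy sketch in plain Java. It is
not Lucene code: the class PerUserIndex and its methods are hypothetical
stand-ins, and the body is simply split on whitespace. It only models one
copy of the mail per interested user, a search filtered by user, and a delete
that removes a single user's copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy in-memory model (NOT the Lucene API): one stored copy per (mail, user).
final class PerUserIndex {
    // Hypothetical stand-in for a Document with a 'user' field and a body.
    private static final class Doc {
        final String mailId;
        final String user;
        final String[] tokens;

        Doc(String mailId, String user, String[] tokens) {
            this.mailId = mailId;
            this.user = user;
            this.tokens = tokens;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    // Index the same mail once for every user who has an interest in it.
    void add(String mailId, String body, String... users) {
        String[] tokens = body.split("\\s+");
        for (String user : users) {
            docs.add(new Doc(mailId, user, tokens));
        }
    }

    // A user deleting the mail removes only that user's copy.
    void delete(String mailId, String user) {
        docs.removeIf(d -> d.mailId.equals(mailId) && d.user.equals(user));
    }

    // Search is a term match filtered by the 'user' field.
    List<String> search(String user, String term) {
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.user.equals(user) && Arrays.asList(d.tokens).contains(term)) {
                hits.add(d.mailId);
            }
        }
        return hits;
    }

    // Body term positions held by the index, counted over every copy.
    int termPositions() {
        int total = 0;
        for (Doc d : docs) {
            total += d.tokens.length;
        }
        return total;
    }
}
```

In this model a three-word mail indexed for a sender and one recipient holds
six term positions, and deleting one user's copy drops it back to three without
touching the other copy.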
I had originally thought to index the data once, with the 'user' field set to
both the sender's and the recipient's user Id, but when the sender or recipient
deletes the mail from their mailbox, searching becomes more complicated, as the
index does not reflect the external database state unless the mail is
reindexed.
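
The single-Document alternative can be sketched the same way. Again this is a
toy model in plain Java, not Lucene code: SharedDocIndex, removeUser and the
tokensAnalyzed counter are hypothetical names. It assumes that removing one
user means dropping the Document and adding it again with the remaining users,
which needs the body to be fetched from the external database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy in-memory model (NOT the Lucene API): one stored copy per mail,
// with a multi-valued 'user' field.
final class SharedDocIndex {
    private static final class Doc {
        final Set<String> users;
        final String[] tokens;

        Doc(Set<String> users, String[] tokens) {
            this.users = users;
            this.tokens = tokens;
        }
    }

    private final Map<String, Doc> docs = new TreeMap<>();

    // Running count of tokens processed, as a rough measure of indexing work.
    int tokensAnalyzed = 0;

    void add(String mailId, String body, Set<String> users) {
        String[] tokens = body.split("\\s+");
        tokensAnalyzed += tokens.length;
        docs.put(mailId, new Doc(new HashSet<>(users), tokens));
    }

    // Assumption: no in-place edit, so the whole Document is dropped and
    // re-added with the remaining users, using the body from the database.
    void removeUser(String mailId, String user, String bodyFromDatabase) {
        Doc old = docs.remove(mailId);
        if (old == null) {
            return;
        }
        Set<String> remaining = new HashSet<>(old.users);
        remaining.remove(user);
        if (!remaining.isEmpty()) {
            add(mailId, bodyFromDatabase, remaining);
        }
    }

    List<String> search(String user, String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Doc> e : docs.entrySet()) {
            Doc d = e.getValue();
            if (d.users.contains(user) && Arrays.asList(d.tokens).contains(term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    int termPositions() {
        int total = 0;
        for (Doc d : docs.values()) {
            total += d.tokens.length;
        }
        return total;
    }
}
```

Here the index holds each body only once, but every mailbox deletion that
leaves another user interested costs a full re-add of the Document.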
Is this something others have wanted, or are there other solutions to this
problem?
Thanks
Antony