Otis Gospodnetic wrote:
Is your user field stored?  If so, you could find the target Document, get the
user field value, modify it, and re-add it to the Document (or something
close to this -- I am doing this with one of the indices on simpy.com and
it's working well).

No, it's not stored. I'm not sure I understand how you 'modify it', as it's not possible to modify an existing Document. Or do you mean that you fetch all the stored fields from the existing Document, delete the existing Document, and then add it back with the modified field?

I have ~20 fields per Document and most are not stored.
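
To make the concern concrete, here is a minimal toy model in plain Java. It is not the Lucene API: the class ToyStoredFields and all of its methods are hypothetical names, and terms are whole field values rather than analysed tokens. It only sketches the fetch, delete and re-add sequence, under the assumption that a fetched document gives back its stored fields and nothing else.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model (NOT Lucene): every field contributes a searchable term,
// but only the fields named as "stored" can be read back later.
final class ToyStoredFields {
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    private final Map<String, Set<Integer>> terms = new HashMap<>();
    private int nextId = 0;

    // Index all fields; keep the values of the stored ones only.
    int add(Map<String, String> fields, Set<String> storedNames) {
        int id = nextId++;
        Map<String, String> kept = new HashMap<>();
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String term = f.getKey() + ":" + f.getValue();
            terms.computeIfAbsent(term, k -> new HashSet<>()).add(id);
            if (storedNames.contains(f.getKey())) {
                kept.put(f.getKey(), f.getValue());
            }
        }
        stored.put(id, kept);
        return id;
    }

    // Remove the document and every term entry that points at it.
    void delete(int id) {
        stored.remove(id);
        for (Set<Integer> ids : terms.values()) {
            ids.remove(id);
        }
    }

    // True if any live document has exactly this value in this field.
    boolean matches(String field, String value) {
        Set<Integer> ids = terms.get(field + ":" + value);
        return ids != null && !ids.isEmpty();
    }

    // Fetch the stored fields, change one, delete the old document, re-add the copy.
    int updateViaStoredFields(int id, String field, String newValue) {
        Map<String, String> copy = new HashMap<>(stored.get(id));
        Set<String> storedNames = new HashSet<>(copy.keySet());
        copy.put(field, newValue);
        storedNames.add(field);
        delete(id);
        return add(copy, storedNames);
    }

    // Hypothetical mail item: subject and user are stored, body is not.
    static ToyStoredFields demo() {
        ToyStoredFields idx = new ToyStoredFields();
        Map<String, String> mail = new HashMap<>();
        mail.put("subject", "hello");
        mail.put("body", "secret");
        mail.put("user", "alice");
        int id = idx.add(mail, new HashSet<>(Arrays.asList("subject", "user")));
        idx.updateViaStoredFields(id, "user", "bob");
        return idx;
    }

    public static void main(String[] args) {
        ToyStoredFields idx = demo();
        System.out.println("user:bob      " + idx.matches("user", "bob"));
        System.out.println("subject:hello " + idx.matches("subject", "hello"));
        System.out.println("body:secret   " + idx.matches("body", "secret"));
    }
}
```

In this sketch the re-added document is searchable by its new user and by its stored subject, but the unstored body is lost. With most of ~20 fields unstored, the same sequence would need the original source data to rebuild the document.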

Antony



Otis

-- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Antony Bowesman <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, January 8, 2008 12:47:05 AM
Subject: Deleting a single TermPosition <doc, frequency, position> for a Document

I'd like to 'update' a single Document in a Lucene index.  In practice, this
 'update' is actually just a removal of a single TermPosition for a given
Term for a given doc Id.

I don't think this is currently possible, but would it be easy to change
Lucene to support this type of usage?

The reason for this is to optimise my index usage.  I'm using Lucene to index
arbitrary data sets.  In some data sets, however, each Document is indexed
once for each user who has an interest in the document.  For example, with
mail data, a mail item (with a single recipient) is stored as two Documents,
once with the 'user' field set to the sender's user Id and again with the
'user' field set to the recipient's user Id.  Searches just filter mail for a
given user by the 'user' field.

When one of those users deletes the mail, the Document whose 'user' field
holds that user's Id is simply deleted.  One of the original reasons for doing
this was to enable horizontal partitioning of the index.  This works nicely,
but of course the index is bigger than necessary and the number of term
positions is at least double what is necessary.

I had originally thought to index the data once, with the user field set to
both the sender's and the recipient's user Id, but when the sender or recipient
deletes the mail from their mailbox, searching becomes more complicated, as the
index does not reflect the external database state unless the mail is reindexed.
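
The size difference between the two layouts can be sketched with a toy inverted index in plain Java. This is not Lucene: the class ToyPostings, its methods and the sample mail text are hypothetical, and tokens are simply split on whitespace. It only counts <doc, position> postings for one mail item indexed once per user versus once with both user Ids in the user field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy inverted index (NOT Lucene): term -> list of {docId, position} postings.
final class ToyPostings {
    private final Map<String, List<int[]>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Index one document; each field value is split on whitespace.
    int addDocument(Map<String, String> fields) {
        int docId = nextDocId++;
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String[] tokens = f.getValue().trim().split("\\s+");
            for (int pos = 0; pos < tokens.length; pos++) {
                String term = f.getKey() + ":" + tokens[pos];
                postings.computeIfAbsent(term, k -> new ArrayList<>())
                        .add(new int[] {docId, pos});
            }
        }
        return docId;
    }

    // Total number of postings across all terms.
    int totalPostings() {
        int n = 0;
        for (List<int[]> list : postings.values()) {
            n += list.size();
        }
        return n;
    }

    // Layout A: one Document per interested user (body is indexed repeatedly).
    static int perUserLayout(String body, String... users) {
        ToyPostings idx = new ToyPostings();
        for (String user : users) {
            Map<String, String> doc = new HashMap<>();
            doc.put("body", body);
            doc.put("user", user);
            idx.addDocument(doc);
        }
        return idx.totalPostings();
    }

    // Layout B: one Document whose user field lists every interested user.
    static int sharedLayout(String body, String... users) {
        ToyPostings idx = new ToyPostings();
        Map<String, String> doc = new HashMap<>();
        doc.put("body", body);
        doc.put("user", String.join(" ", users));
        idx.addDocument(doc);
        return idx.totalPostings();
    }

    public static void main(String[] args) {
        String body = "meeting moved to friday";
        System.out.println(perUserLayout(body, "alice", "bob")); // prints 10
        System.out.println(sharedLayout(body, "alice", "bob"));  // prints 6
    }
}
```

The shared layout is smaller, but removing one user from it means removing a single posting from the user field of an existing document, which is exactly the operation asked about here.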

Is this something others have wanted, or are there other solutions to this
problem?

Thanks,
Antony




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
