The actual time it takes to delete or update the document is unlikely to make a difference to you.

What might make a difference to you is the time it takes to actually finalize the commit, the time it takes to re-warm your indexes after a commit, and especially the time it takes to run any warming queries you have set in newSearcher. Most of these probably won't differ between a delete and an update, but any of them could be a problem either way. The one way to find out is to try it and measure it.
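If it helps, here is a rough sketch of how the measuring could be set up. It builds the Solr XML update messages for the two options in the original question, plus the "fq" for the filter approach, and times whatever function you use to send them. The `send` callable is hypothetical (something that POSTs the message to your update handler), and I'm assuming "id" is your unique key field; adjust both to your setup.

```python
import time
from urllib.parse import urlencode
from xml.sax.saxutils import escape, quoteattr


def delete_message(doc_id):
    """XML update message that deletes one document by its unique key."""
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


def add_message(fields):
    """XML update message that re-adds a document (same key replaces the old copy)."""
    body = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in fields.items()
    )
    return "<add><doc>%s</doc></add>" % body


# The commit is sent as its own message, so it can be timed separately.
COMMIT_MESSAGE = "<commit/>"


def filtered_query_string(user_query, max_down_votes=2):
    """Query string for the filter approach: down_vote <= max_down_votes."""
    fq = "down_vote:[* TO %d]" % max_down_votes
    return urlencode({"q": user_query, "fq": fq})


def timed(send, message):
    """Return the wall-clock seconds that send(message) takes.

    `send` is a hypothetical callable supplied by you, e.g. an HTTP POST
    to your Solr update handler that returns once Solr has responded.
    """
    start = time.perf_counter()
    send(message)
    return time.perf_counter() - start
```

Time the delete or add message and the commit message separately, e.g. `timed(send, delete_message(42))` and then `timed(send, COMMIT_MESSAGE)`. I'd expect the commit number to dominate, but that is exactly the thing to check on your own index.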

Whether you do a delete or an update, if you're planning on making changes to your index more often than, oh, every 10 or 20 minutes, you may run into trouble. Solr isn't so good at frequent changes to the index like that. I haven't looked at them myself, but the Solr patches that get called "near real-time" seem like they're intended to deal with this, among other things, and allow frequent commits without killing performance or RAM usage.

I am not sure how/if other people are effectively dealing with user-generated content that needs to be included in the index for filtering and searching against. Would be very curious if anyone has any successful strategies to share. Another example would be user-generated tagging.

Erick Erickson wrote:
Just deleting a document is faster because all that really happens
is the document is marked as deleted. An update is really
a delete followed by an add of the same document, so by definition
an update will be slower...

But... does it really make a difference? How often do you expect this to
happen? Peter Karich added a note while I was typing this, and he
makes some cogent points.

I'm starting to think that I don't care about better unless and until my
users notice (or I have a reasonable expectation that they #will# notice).
I'm far more interested in simpler code that I can maintain than I am
shaving off another 4 milliseconds from the response time. That gives
me more chance to put in cool new features that the user will notice...

Best
Erick

On Mon, Nov 1, 2010 at 5:04 PM, Andy <angelf...@yahoo.com> wrote:

My documents have a "down_vote" field. Every time a user votes down a
document, I increment the "down_vote" field in my database and also re-index
the document to Solr to reflect the new down_vote value.
During searches, I want to restrict the results to only documents with, say,
fewer than 3 down votes. There are 2 ways to implement that:
1) When a user down votes a document, check to see if total down votes have
reached 3. If they have, delete the document from the Solr index.
2) When a user down votes a document, update the document in the Solr index to
reflect the new down_vote value, even if total down votes have reached 3 or
more. During query, add a "fq" to restrict results to documents with
fewer than 3 down votes.
Which approach is better? Is it faster to delete a document from index or
to update the document to reflect the new down_vote value?
Thanks.
Andy



