Hi,

I agree with Kranti on this...

There are many reasons an index can be "bad". One common case is an index
that builds without any functional problems (no errors, no exceptions) but
with a low document count. This happens when there is a glitch in the
document pipelines feeding the index. The index is "correctly" built, but
it holds only 50% or less of the expected documents.

For a 24x7 system such as mine, which serves more than 500 million
queries/day, I'd rather keep serving an older index than replicate a fresh
index (say, on a Saturday night) that has 50% or less of the expected
documents. This:
a) gives customers a usable experience (though with somewhat stale data)
b) gives my Ops team an opportunity to fix the issue during business hours,
or at the next convenient time. We can trigger alarms to Ops when this
occurs.
Bottom line: customers don't get severely impacted, and we can fix it at
the next available time.
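
To make the idea concrete, here is a rough sketch of the kind of check I
have in mind. It is Python rather than Solr code, and nothing in it is an
existing Solr API or solrconfig.xml setting: check_doc_count,
ReplicationDecision, and min_ratio are names made up for illustration, and
the expected count is assumed to come from the index currently being served.

```python
from dataclasses import dataclass


@dataclass
class ReplicationDecision:
    """Outcome of the hypothetical pre-replication threshold check."""
    replicate: bool   # True: pull the fresh index; False: keep the old one
    alert_ops: bool   # True: raise an alarm so Ops can look at the pipeline
    reason: str


def check_doc_count(new_count: int,
                    expected_count: int,
                    min_ratio: float = 0.5) -> ReplicationDecision:
    """Decide whether a freshly built index is safe to replicate.

    new_count      -- document count of the freshly built index
    expected_count -- assumed baseline, e.g. the count of the live index
    min_ratio      -- hypothetical threshold; a fresh index holding this
                      fraction or less of the baseline is rejected
    """
    if expected_count <= 0:
        # No usable baseline (e.g. first build), so there is nothing to
        # compare against; let replication proceed.
        return ReplicationDecision(True, False, "no baseline to compare")

    ratio = new_count / expected_count
    if ratio <= min_ratio:
        # Index built "correctly" but is missing too many documents:
        # keep serving the older index and notify Ops.
        return ReplicationDecision(
            False, True,
            f"fresh index has {ratio:.0%} of expected documents")

    return ReplicationDecision(
        True, False, f"fresh index has {ratio:.0%} of expected documents")


if __name__ == "__main__":
    # A Saturday-night build that lost most of its feed.
    decision = check_doc_count(new_count=400, expected_count=1000)
    print(decision)
```

The default of 0.5 only mirrors the "50% or less" case described above; in
practice the threshold would have to be configurable per deployment.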

A fresh index is built every X hours, so in a 24x7x365 system with
multiple/dynamic data sources this problem is guaranteed to happen
eventually. Solr needs a safeguard against it.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Threshold-Checks-for-Replication-in-solrconfig-xml-tp4082458p4164856.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
