Varun Thacker created SOLR-12816:
------------------------------------
Summary: Don't allow RunUpdateProcessorFactory to be set before
DistributedUpdateProcessorFactory
Key: SOLR-12816
URL: https://issues.apache.org/jira/browse/SOLR-12816
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Varun Thacker
Here's the problem that came up on a customer call this morning: "My
documents are not getting replicated to the replicas and the doc counts don't
match up"
It was a 3-node cluster. The collection was 1 shard x 3 replicas.
This is a scary situation to be in. We started down the path of debugging
replica types, auto-commits, checking whether the {{_version_}} and {{id}}
fields were defined correctly, etc.
The problem was that the user had defined a custom update processor chain with
RunUpdateProcessorFactory placed before DistributedUpdateProcessorFactory:
{code:xml}
<updateRequestProcessorChain ...
....
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
<processor class="solr.DistributedUpdateProcessorFactory"/>
</updateRequestProcessorChain>{code}
With this update chain, whichever node receives the document is the only one
that indexes it. The update is never forwarded to the other nodes.
So if you index against a node hosting a non-leader replica, the leader will
never get the document.
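For comparison, here is a minimal sketch of the conventional ordering, with
DistributedUpdateProcessorFactory placed before RunUpdateProcessorFactory. The
chain name {{example-chain}} is a hypothetical placeholder for illustration and
is not taken from the customer's config.
{code:xml}
<!-- "example-chain" is a hypothetical name used only for illustration -->
<updateRequestProcessorChain name="example-chain">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- distributes the update to the other nodes hosting the shard -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- applies the update to the local index; kept last in the chain -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>{code}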
Is there any use-case where having RunUpdateProcessor before
DistributedUpdateProcessor is needed?
Perhaps we could borrow the idea from time routed aliases (TRA), or make these
two update processors default and remove them from the default configs?
{quote}
When processing an update for a TRA, Solr initializes its
UpdateRequestProcessor chain as usual, but when DistributedUpdateProcessor
(DUP) initializes, it detects that the update targets a TRA and injects
TimeRoutedUpdateProcessor (TRUP) in front of itself.
{quote}