[ https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642597#comment-16642597 ]
Alexandre Rafalovitch commented on SOLR-12816:
----------------------------------------------

I did find the passage in the documentation about [URPs that may want to run after DistributedUpdateProcessor|https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomic-update-processor-factory]. Basically, if they need to operate on the full document even when only an atomic update was sent, they need to be in the chain after the DistributedUpdateProcessor (which reconstructs the document). I think this must be what I was trying to remember.

> Don't allow RunUpdateProcessorFactory to be set before
> DistributedUpdateProcessorFactory
> ----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12816
>                 URL: https://issues.apache.org/jira/browse/SOLR-12816
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Varun Thacker
>            Priority: Major
>
> Here's the problem that came up on a customer call this morning: "My
> documents are not getting replicated to the replicas and the doc counts don't
> match up."
> It was a 3-node cluster. The collection was 1 shard x 3 replicas.
> This is a scary situation to be in. We started down the path of debugging
> replica types, auto-commits, checking whether the {{_version_}} and {{id}}
> fields were defined correctly, etc.
>
> The problem was that the user had defined a custom update processor chain
> with RunUpdateProcessorFactory defined before DistributedUpdateProcessorFactory:
> {code:java}
> <updateRequestProcessorChain ...
> ....
> <processor class="solr.LogUpdateProcessorFactory"/>
> <processor class="solr.RunUpdateProcessorFactory"/>
> <processor class="solr.DistributedUpdateProcessorFactory"/>
> </updateRequestProcessorChain>{code}
>
> With this update chain, whichever node you index the document against will be
> the only one indexing the document. It will never forward to the other nodes.
> So you can index against a node hosting a replica and the leader will never
> get this document.
> Is there any use case where having RunUpdateProcessor before
> DistributedUpdateProcessor is needed?
>
> Perhaps we could borrow the idea from TRA, or make these two update
> processors default and remove them from the default configs?
> {code:java}
> When processing an update for a TRA, Solr initializes its
> UpdateRequestProcessor chain as usual, but when DistributedUpdateProcessor
> (DUP) initializes, it detects that the update targets a TRA and injects
> TimeRoutedUpdateProcessor (TRUP) in front of itself.{code}
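
For comparison, here is a minimal sketch of the chain from the issue with the last two processors swapped, so that DistributedUpdateProcessorFactory comes before RunUpdateProcessorFactory. The chain name "example-chain" is a hypothetical placeholder, not taken from the customer's config, and whatever the elided part of the original chain contained is left out rather than guessed at:

{code:xml}
<!-- Sketch only: the chain name is a hypothetical placeholder. -->
<updateRequestProcessorChain name="example-chain">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- Handles distribution of the update to the leader and replicas. -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- Applies the update to the local index; conventionally the last processor. -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>{code}

With this ordering, an update sent to any node should pass through the distributed step before the local write, which is what the misordered chain in the issue was skipping.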