Re: AtomicUpdate on SolrCloud is not working

Shawn Heisey Sat, 18 Jul 2020 14:29:45 -0700

On 7/17/2020 1:32 AM, yo tomi wrote:

When I did AtomicUpdate on SolrCloud by the following setting, it does
not work properly.

As Jörn Franke already mentioned, you haven't said exactly what "does not work properly" actually means in your situation. Without that information, it will be very difficult to provide any real help.

Atomic update functionality is currently implemented in DistributedUpdateProcessorFactory.

---
<updateRequestProcessorChain name="skip-empty">
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
When changed as follows and made it work, it became as expected.
---
<updateRequestProcessorChain name="skip-empty">
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---

The effective result difference between these configurations is that atomic updates will happen first with the first config, and in the second, atomic updates will happen second to last -- just before RunUpdateProcessorFactory.
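
To make that concrete, the second config should behave roughly as if it had been written like the untested sketch below. This assumes that Solr adds DistributedUpdateProcessorFactory on its own, immediately before RunUpdateProcessorFactory, when the chain does not list it explicitly.

---
<updateRequestProcessorChain name="skip-empty">
  <!-- "pre" processors: these run before the update is distributed -->
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- assumed implicit position: atomic updates are applied here -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---

Anything listed above the distributed processor is a "pre" processor, and anything between it and RunUpdateProcessorFactory would be a "post" processor.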

Also, with the first config, most of the update processors are going to be executed on the machine with the shard leader (after the update is distributed) and if there is more than one NRT replica, they will be executed multiple times. With the second config, most of the processors will be executed on the machine that actually receives the update request. For the purposes of that discussion, remember that when a TLOG replica is elected leader, it is effectively an NRT replica. (PULL replicas cannot become leader.)


Does that information help you determine why it doesn't do what you expect?

The latter setting and the way of using post-processor could make the
same result, I thought,
but using post-processor, bug of SOLR-8030 makes me not feel like using it.
Even with the latter setting, is there any possibility of SOLR-8030
occurring?

See this part of the reference guide for a bunch of gory details about DistributedUpdateProcessorFactory:


https://cwiki.apache.org/confluence/display/SOLR/UpdateRequestProcessor#UpdateRequestProcessor-DistributedUpdates

In SOLR-8030, the general consensus among committers is that you should configure almost all update processors as "pre" processors -- placed before DistributedUpdateProcessorFactory in the config. When done this way, updates are usually faster and less likely to yield inconsistent results.

There may be situations where having them as "post" processors is correct, but that won't happen very often. The second config above does implicitly use "pre" for most of the processors.


Thanks,
Shawn
