On 7/17/2020 1:32 AM, yo tomi wrote:
When I run an atomic update on SolrCloud with the following setting, it does
not work properly.
As Jörn Franke already mentioned, you haven't said exactly what "does
not work properly" actually means in your situation. Without that
information, it will be very difficult to provide any real help.
Atomic update functionality is currently implemented in
DistributedUpdateProcessorFactory.
---
<updateRequestProcessorChain name="skip-empty">
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
When I changed it as follows, it worked as expected.
---
<updateRequestProcessorChain name="skip-empty">
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
---
The effective result difference between these configurations is that
atomic updates will happen first with the first config, and in the
second, atomic updates will happen second to last -- just before
RunUpdateProcessorFactory.
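As a rough sketch of what that means: assuming Solr's documented behavior
of inserting DistributedUpdateProcessorFactory immediately before
RunUpdateProcessorFactory when the chain does not list it, the second
config should behave approximately as if it had been written like this.
The DistributedUpdateProcessorFactory line is not in the original config
and is shown only to mark where the implicit insertion happens.

```xml
<updateRequestProcessorChain name="skip-empty">
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- Illustration only: Solr adds this implicitly when it is omitted,
       so atomic updates are resolved here, second to last. -->
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```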
Also, with the first config, most of the update processors are going to
be executed on the machine with the shard leader (after the update is
distributed) and if there is more than one NRT replica, they will be
executed multiple times. With the second config, most of the processors
will be executed on the machine that actually receives the update
request. For the purposes of that discussion, remember that when a TLOG
replica is elected leader, it is effectively an NRT replica. (PULL
replicas are never eligible to become leader.)
Does that information help you determine why it doesn't do what you expect?
I thought the latter setting and using a post-processor would give the
same result, but the SOLR-8030 bug makes me reluctant to use a
post-processor.
Even with the latter setting, is there any possibility of running into
SOLR-8030?
See this part of the reference guide for a bunch of gory details about
DistributedUpdateProcessorFactory:
https://cwiki.apache.org/confluence/display/SOLR/UpdateRequestProcessor#UpdateRequestProcessor-DistributedUpdates
In SOLR-8030, the general consensus among committers is that you should
configure almost all update processors as "pre" processors -- placed
before DistributedUpdateProcessorFactory in the config. When done this
way, updates are usually faster and less likely to yield inconsistent
results.
There may be situations where having them as "post" processors is
correct, but that won't happen very often. The second config above does
implicitly use "pre" for most of the processors.
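For illustration only, a chain that spells out the "pre" and "post"
positions explicitly might look something like the sketch below. It is
not a tested config. It reuses the processors from the configs above,
and placing LogUpdateProcessorFactory after the distributed processor is
purely an assumption made to show what a "post" position looks like.

```xml
<updateRequestProcessorChain name="skip-empty">
  <!-- "pre" processors: listed before the distributed processor, so they
       run on the node that receives the update request -->
  <processor class="TrimFieldUpdateProcessorFactory" />
  <processor class="RemoveBlankFieldUpdateProcessorFactory" />

  <!-- Listed explicitly here; atomic updates are handled at this point -->
  <processor class="solr.DistributedUpdateProcessorFactory" />

  <!-- "post" position (assumed placement, for illustration): anything
       here runs after the update has been distributed -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```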
Thanks,
Shawn