[
https://issues.apache.org/jira/browse/SOLR-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680981#comment-16680981
]
Shawn Heisey commented on SOLR-12974:
-------------------------------------
One possible workaround is to upgrade to 7.x and use the new TLOG or PULL
replica types. The downside is that this requires upgrading to a new major
version. If you have a test environment, that may not be a major problem.
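
As a rough illustration of that workaround, the sketch below builds a
Collections API CREATE request that asks for TLOG replicas. The host,
collection name, and configset name are placeholders, and the class and
method names are hypothetical. The parameter names (tlogReplicas,
pullReplicas) are the ones I recall from the 7.x Collections API; please
check them against the reference guide for your exact version.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class TlogCollectionUrl {
    // Builds the URL for a Collections API CREATE call (hypothetical helper).
    // baseUrl, collection and configName are placeholders supplied by the caller.
    static String createUrl(String baseUrl, String collection, String configName,
                            int numShards, int tlogReplicas, int pullReplicas) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", collection);
        params.put("collection.configName", configName);
        params.put("numShards", String.valueOf(numShards));
        // Replica type counts (assumed 7.x parameter names).
        params.put("tlogReplicas", String.valueOf(tlogReplicas));
        params.put("pullReplicas", String.valueOf(pullReplicas));

        // URL-encode each value and join the pairs with '&'.
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return baseUrl + "/admin/collections?" + query;
    }

    public static void main(String[] args) {
        // One shard, two TLOG replicas, no PULL replicas.
        System.out.println(createUrl("http://localhost:8983/solr", "mycollection", "myconfig", 1, 2, 0));
    }
}
```

The resulting URL could be issued with curl or a browser against a test
cluster before trying it anywhere that matters.
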
I suspect that it would be very difficult to guarantee the same index version
when using NRT replicas, which was the only type before 7.x. I could be wrong
about that.
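
To make the dependence on index version concrete, here is a simplified
sketch of a pseudo-random ordering seeded from the field name plus an index
version. This is not the RandomSortField implementation; the class, the
methods, and the version numbers are all hypothetical, and it uses
java.util.Random with Collections.shuffle purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SeedSketch {
    // Hypothetical seed: combines the field name with the index version,
    // mirroring the behaviour described in the schema.xml comment.
    static long seed(String fieldName, long indexVersion) {
        return fieldName.hashCode() + indexVersion;
    }

    // Returns doc ids 0..numDocs-1 in a pseudo-random order determined by the seed.
    static List<Integer> ordering(String fieldName, long indexVersion, int numDocs) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < numDocs; i++) {
            docs.add(i);
        }
        Collections.shuffle(docs, new Random(seed(fieldName, indexVersion)));
        return docs;
    }

    public static void main(String[] args) {
        String field = "random_SW84gaDAf3RynhOyGQDZlgAAAYc1";
        long replicaA = 1541600000000L; // made-up index version on replica A
        long replicaB = 1541600000123L; // made-up index version on replica B

        // Same field name and same version: the ordering is repeatable.
        System.out.println(ordering(field, replicaA, 10).equals(ordering(field, replicaA, 10)));
        // Same field name, different version: the seeds differ, so the
        // ordering will almost certainly differ as well.
        System.out.println(seed(field, replicaA) != seed(field, replicaB));
    }
}
```

With NRT replicas each replica builds its own index, so the version fed into
the seed can differ per replica even when the live documents match.
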
> RandomSort not consistent in SolrCloud Mode
> -------------------------------------------
>
> Key: SOLR-12974
> URL: https://issues.apache.org/jira/browse/SOLR-12974
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 6.5.1
> Reporter: Shrey Shivam
> Priority: Minor
>
> Expected behaviour of RandomSort is that, given the same random field name
> (random_<seed>), which acts as a seed, the sort order remains consistent
> for the same version of the Solr index.
> From schema.xml:
> {{~<!-- The "RandomSortField" is not used to store or search any data. You
> can declare fields of this type it in your schema to generate pseudo-random
> orderings of your docs for sorting or function purposes. The ordering is
> generated based on the field name and the version of the index. As long as
> the index version remains unchanged, and the same field name is reused, the
> ordering of the docs will be consistent. If you want different psuedo-random
> orderings of documents, for the same version of the index, use a dynamicField
> and change the field name in the request. -->~}}
>
> In master-slave mode, replication is driven by the index version. If a
> slave's index version differs from the master's, the slave replicates the
> index and its index version is updated to match the master's.
> In SolrCloud mode, however, the observation has been that replicas of the
> same shard do not maintain the same index version at all times, even though
> the documents are the same and consistent.
> This has been previously discussed on the
> [mailing list|https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201508.mbox/%3ccae3utzmggprv-p6juwjwm2yyyxfw893xayq7+2hav7mmobm...@mail.gmail.com%3E]
> as well.
> {quote}SolrCloud works very differently than the old master-slave replication.
> The index is NOT copied from the leader to the other replicas, except
> in extreme recovery circumstances.
> Each replica builds its own copy of the index independently from the
> others. Due to slight timing differences in the indexing operations,
> and possible actions related to transaction log replay on node restart,
> each replica may end up with a different index layout. There also could
> be differences in the number of deleted documents. Unless something
> goes really wrong, all replicas should contain the same live documents.
> {quote}
>
> When a query is sent to a shard that has 2 or more replicas, any replica may
> be chosen to respond. If the replicas do not all have the same index
> version, RandomSort will generate a different random hash seed for the
> same random_<seed> field name.
> In the source code of the
> [RandomSort|https://github.com/apache/lucene-solr/blob/branch_6_5/solr/core/src/java/org/apache/solr/schema/RandomSortField.java]
> class, line 86 mentions the use of the index version (of the shard) to
> create the random hash seed.
> Hence, when querying a Solr collection, the same query returns different
> results depending on the version mismatch between replicas and on which
> replica serves each request.
>
> Example of Solr Query where random field is being used:
> {code:java}
> https://solr-stage.mydomain.com:8983/solr/mycollection/select?wt=json&q=*:*&defType=edismax&fl=id&boost=if(query({!v='documentDate:[2018-11-07 TO *]'}),sum(div(scale(random_SW84gaDAf3RynhOyGQDZlgAAAYc1,0,1),1),sub(1,div(1,1))),if(or(exists(query({!v='documentType:sponsored'})),exists(query({!v='documentType:featured'}))),sum(div(scale(random_SW84gaDAf3RynhOyGQDZlgAAAYc1,0,1),4),sub(1,div(1,4))),
>
> if(or(exists(query({!v='documentType:listing'})),exists(query({!v='documentType:promotional'}))),sum(div(scale(random_SW84gaDAf3RynhOyGQDZlgAAAYc1,0,1),2),sub(1,div(1,2))),scale(random_SW84gaDAf3RynhOyGQDZlgAAAYc1,0,1))))
> {code}
>