[ 
https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507393#comment-14507393
 ] 

Timothy Potter commented on SOLR-7332:
--------------------------------------

I've been running some larger-scale perf tests in EC2 with this, using the same
basic setup as described here:
https://lucidworks.com/blog/introducing-the-solr-scale-toolkit/

Previously, I indexed 130M docs into a 10x2 (10 shards, rf=2) collection on 
10 r3.2xlarge instances at an avg. rate of 34,881 docs/sec with Solr 4.8.1. 
With branch5x and the latest patches for SOLR-7332 and SOLR-7333 applied, the 
same test reached 74,713 docs/sec, which is better than a 2x improvement. The 
results were repeatable across several runs :-)

Next, I increased the number of reducers to see how hard I could push Solr. 
Unfortunately, I ended up with 2 shards whose replicas were out of sync with 
their leader. I'm digging into what may have caused that (it's proving hard to 
reproduce now) ... [~yo...@apache.org] can you think of a case where docs could 
be dropped with this new version bucket seeding stuff? My test is all new adds 
into an empty collection: no deletes, no updates. At first I thought it might 
be due to seeding highest with the new clock value from VersionInfo when the 
index is empty:
{code}
+      long maxVersion = Math.max(maxVersionFromIndex, maxVersionFromRecent);
+      if (maxVersion == 0L) {
+        maxVersion = versions.getNewClock();
+        log.warn("Could not find max version in index or recent updates, using new clock {}", maxVersion);
+      }
{code}

But I can't see how that would cause an issue with this logic in 
DistributedUpdateProcessor's versionAdd method (which is the only code I see 
that drops requests on a replica):

{code}
            if (bucketVersion != 0 && bucketVersion < versionOnUpdate) {
              // we're OK... this update has a version higher than anything we've seen
              // in this bucket so far, so we know that no reordering has yet occurred.
              bucket.updateHighest(versionOnUpdate);
            } else {
              // there have been updates higher than the current update.  we need to check
              // the specific version for this id.
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              if (lastVersion != null && Math.abs(lastVersion) >= versionOnUpdate) {
                // This update is a repeat, or was reordered.  We need to drop this update.
                log.debug("Dropping add update due to version {}", idBytes.utf8ToString());
                return true;
              }

              // also need to re-apply newer deleteByQuery commands
              checkDeleteByQueries = true;
            }
{code}

It seems to me that if the leader's and replica's clocks are out of sync, then 
for a new add either the replica's highest is too low, so the if block applies, 
or it is too high, so the else block applies. In the else case the doc doesn't 
exist, so lastVersion == null and the add is still not dropped. I'll know more 
once I reproduce it again, but I wanted to let you know the current status and 
see whether anything jumps out at you as a possible cause of the replica being 
out of sync with the leader.
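
To sanity-check that reasoning outside of Solr, here is a minimal sketch of the 
branch above in plain Java. The class and method names (VersionAddSketch, 
wouldDrop) and the sample version numbers are made up for illustration; 
lastVersion stands in for the result of vinfo.lookupVersion(...), and the 
deleteByQuery re-check is left out. It is only a model of the quoted snippet, 
not the actual DistributedUpdateProcessor code.

```java
// Hypothetical, simplified model of the quoted versionAdd branch; not Solr code.
final class VersionAddSketch {

    /**
     * Returns true when the modeled replica would drop the add.
     *
     * @param bucketHighest   highest version recorded for the bucket (0 = not seeded)
     * @param versionOnUpdate version assigned to this update by the leader
     * @param lastVersion     stand-in for vinfo.lookupVersion(id); null = id not found
     */
    static boolean wouldDrop(long bucketHighest, long versionOnUpdate, Long lastVersion) {
        if (bucketHighest != 0 && bucketHighest < versionOnUpdate) {
            // Fast path: the update is newer than anything seen in this bucket.
            return false;
        }
        // Slow path: compare against the version stored for this specific id.
        return lastVersion != null && Math.abs(lastVersion) >= versionOnUpdate;
    }

    public static void main(String[] args) {
        long versionOnUpdate = 1_000L; // arbitrary sample value

        // Replica's highest is lower than the update's version: if block, kept.
        System.out.println(wouldDrop(500L, versionOnUpdate, null));    // false
        // Replica's highest is higher (e.g. seeded from a clock running ahead):
        // else block, but the id is new so lastVersion is null, kept.
        System.out.println(wouldDrop(2_000L, versionOnUpdate, null));  // false
        // Bucket not seeded at all: else block, id is new, kept.
        System.out.println(wouldDrop(0L, versionOnUpdate, null));      // false
        // Only an id that already has an equal or higher version is dropped.
        System.out.println(wouldDrop(2_000L, versionOnUpdate, 1_500L)); // true
    }
}
```

In this model a brand-new id is never dropped regardless of how highest was 
seeded, which matches the reasoning above; it says nothing about other code 
paths that might lose a doc.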

> Seed version buckets with max version from index
> ------------------------------------------------
>
>                 Key: SOLR-7332
>                 URL: https://issues.apache.org/jira/browse/SOLR-7332
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Timothy Potter
>            Assignee: Timothy Potter
>         Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch, 
> SOLR-7332.patch, SOLR-7332.patch
>
>
> See the full discussion between Yonik and me in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each 
> version bucket to the MAX value of the {{_version_}} field in the index as 
> early as possible, such as after the first soft or hard commit. This will 
> ensure that bulk adds of docs that don't yet exist avoid an unnecessary 
> lookup for a non-existent document in the index.


