[ 
https://issues.apache.org/jira/browse/SOLR-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240321#comment-15240321
 ] 

Steve Rowe commented on SOLR-8662:
----------------------------------

Lots of good changes in your patch, Varun!  I ran all Solr+Solrj+contrib tests 
and everything passes.

I found a few issues though.

This change is good, but there remains a pre-existing problem:

{code:java|title:SchemaManager.java}
-    int timeout = req.getParams().getInt(BaseSolrResource.UPDATE_TIMEOUT_SECS, 
-1);
-    long startTime = System.nanoTime();
-    long endTime = timeout > 0 ? System.nanoTime() + (timeout * 1000 * 1000) : 
Long.MAX_VALUE;
+    //The default timeout is 60s when no BaseSolrResource.UPDATE_TIMEOUT_SECS 
is specified
+    int timeout = req.getParams().getInt(BaseSolrResource.UPDATE_TIMEOUT_SECS, 
60);
+
+    //If BaseSolrResource.UPDATE_TIMEOUT_SECS=0 or -1 then end time is 
unlimited.
+    long endTime = timeout > 0 ? TimeUnit.NANOSECONDS.convert(timeout, 
TimeUnit.SECONDS) + System.nanoTime() : Long.MAX_VALUE;
{code}

The problem is that {{System.nanoTime()}} can return *anything* in the range 
\[{{Long.MIN_VALUE}}, {{Long.MAX_VALUE}}\] and it intentionally overflows, so 
making {{endTime}} be {{Long.MAX_VALUE}} is effectively choosing a random time 
in the future; this is *not* the same as "unlimited".  This same situation is 
dealt with in the {{QueryTimeout}} implementations - used by 
{{ExitableDirectoryReader}} - by adding a large value to the current value of 
{{nanoTime()}} to arrive at an effectively unlimited end time.

----

Your changes in {{ZkSolrResourceLoader.openResource()}} introduce the 
possibility of retrying 10 times to fetch a resource from ZK when it's not 
there (according to {{zkController.pathExists(file)}) - I think in this case 
the retry loop should be immediately exited, so that the classpath search can 
start immediately.  (I'm assuming that the intent here is *not* to wait around 
for another thread/process to finish uploading a resource - if that's the case 
then the branch for when the path doesn't exist should have a timeout, which it 
doesn't in your patch.)

----

bq. Without the changes to SchemaManager the test would fail.

Indeed, when I revert the SchemaManager changes, 
{{TestManagedSchemaAPI.testAddFieldAndDocument()}} fails.  But 
{{testReloadAndAddSimple()}} always succeeds, regardless of the SchemaManager 
changes, so I'm not sure why it's there, since it doesn't have anything to do 
with new field visibility - can it be removed?

> SchemaManager doesn't wait correctly for replicas to update
> -----------------------------------------------------------
>
>                 Key: SOLR-8662
>                 URL: https://issues.apache.org/jira/browse/SOLR-8662
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Varun Thacker
>            Assignee: Varun Thacker
>         Attachments: SOLR-8662.patch
>
>
> Currently in SchemaManager, waitForOtherReplicasToUpdate doesn't execute 
> since it gets passed the timeout value instead of the end time 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to