muse-dev[bot] commented on a change in pull request #2285:
URL: https://github.com/apache/lucene-solr/pull/2285#discussion_r568077584



##########
File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java
##########
@@ -1058,10 +1056,29 @@ public ZkStateReader getZkStateReader() {
   }
 
   public void offerStateUpdate(byte[] data) throws KeeperException, 
InterruptedException {
+    // When cluster state change is distributed, the Overseer cluster state 
update queue should only ever receive only QUIT messages.
+    // These go to sendQuitToOverseer for execution path clarity.
+    if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+      final ZkNodeProps message = ZkNodeProps.load(data);

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`Overseer.offerStateUpdate(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##########
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##########
@@ -463,7 +468,11 @@ public boolean isClosed() {
 
     init();

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Read/Write race. Non-private method 
`ZkController(...)` indirectly reads with synchronization from 
`noggit.JSONParser.devNull.buf`. Potentially races with unsynchronized write in 
method `ZkController.preClose()`.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##########
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##########
@@ -2660,17 +2679,26 @@ public boolean 
checkIfCoreNodeNameAlreadyExists(CoreDescriptor dcore) {
    */
   public void publishNodeAsDown(String nodeName) {
     log.info("Publish node={} as DOWN", nodeName);
-    ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION, 
OverseerAction.DOWNNODE.toLower(),
-        ZkStateReader.NODE_NAME_PROP, nodeName);
-    try {
-      overseer.getStateUpdateQueue().offer(Utils.toJSON(m));
-    } catch (AlreadyClosedException e) {
-      log.info("Not publishing node as DOWN because a resource required to do 
so is already closed.");
-    } catch (InterruptedException e) {
-      Thread.currentThread().interrupt();
-      log.debug("Publish node as down was interrupted.");
-    } catch (KeeperException e) {
-      log.warn("Could not publish node as down: ", e);
+    if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+      // Note that with the current implementation, when distributed cluster 
state updates are enabled, we mark the node
+      // down synchronously from this thread, whereas the Overseer cluster 
state update frees this thread right away and
+      // the Overseer will async mark the node down but updating all affected 
collections.
+      // If this is an issue (i.e. takes too long), then the call below should 
be executed from another thread so that
+      // the calling thread can immediately return.
+      distributedClusterChangeUpdater.executeNodeDownStateChange(nodeName, 
zkStateReader);

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`ZkController.publishNodeAsDown(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##########
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##########
@@ -1662,22 +1680,20 @@ public void unregister(String coreName, CoreDescriptor 
cd, boolean removeCoreFro
     }
     CloudDescriptor cloudDescriptor = cd.getCloudDescriptor();
     if (removeCoreFromZk) {
-      ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION,
-          OverseerAction.DELETECORE.toLower(), ZkStateReader.CORE_NAME_PROP, 
coreName,
+      ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION, 
OverseerAction.DELETECORE.toLower(),
+          ZkStateReader.CORE_NAME_PROP, coreName,
           ZkStateReader.NODE_NAME_PROP, getNodeName(),
           ZkStateReader.COLLECTION_PROP, cloudDescriptor.getCollectionName(),
           ZkStateReader.CORE_NODE_NAME_PROP, coreNodeName);
-      overseerJobQueue.offer(Utils.toJSON(m));
+      if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+        
distributedClusterChangeUpdater.doSingleStateUpdate(DistributedClusterChangeUpdater.MutatingCommand.SliceRemoveReplica,
 m,
+            getSolrCloudManager(), zkStateReader);

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Read/Write race. Non-private method 
`ZkController.unregister(...)` indirectly reads without synchronization from 
`this.cloudManager`. Potentially races with write in method 
`ZkController.getSolrCloudManager()`.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##########
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##########
@@ -1662,22 +1680,20 @@ public void unregister(String coreName, CoreDescriptor 
cd, boolean removeCoreFro
     }
     CloudDescriptor cloudDescriptor = cd.getCloudDescriptor();
     if (removeCoreFromZk) {
-      ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION,
-          OverseerAction.DELETECORE.toLower(), ZkStateReader.CORE_NAME_PROP, 
coreName,
+      ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION, 
OverseerAction.DELETECORE.toLower(),
+          ZkStateReader.CORE_NAME_PROP, coreName,
           ZkStateReader.NODE_NAME_PROP, getNodeName(),
           ZkStateReader.COLLECTION_PROP, cloudDescriptor.getCollectionName(),
           ZkStateReader.CORE_NODE_NAME_PROP, coreNodeName);
-      overseerJobQueue.offer(Utils.toJSON(m));
+      if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+        
distributedClusterChangeUpdater.doSingleStateUpdate(DistributedClusterChangeUpdater.MutatingCommand.SliceRemoveReplica,
 m,

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`ZkController.unregister(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##########
File path: solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java
##########
@@ -1446,9 +1446,9 @@ public void refreshAndWatch() {
     }
   }
 
-  public static DocCollection getCollectionLive(ZkStateReader zkStateReader, 
String coll) {
+  public DocCollection getCollectionLive(String coll) {
     try {
-      return zkStateReader.fetchCollectionState(coll, null);
+      return fetchCollectionState(coll, null);

Review comment:
       *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`ZkStateReader.getCollectionLive(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
    Reporting because another access to the same memory occurs on a background 
thread, although this access may not.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to