Github user dragonsinth commented on a diff in the pull request: https://github.com/apache/lucene-solr/pull/32#discussion_r61644721 --- Diff: solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java --- @@ -491,19 +493,28 @@ private void refreshLegacyClusterState(Watcher watcher) final Stat stat = new Stat(); final byte[] data = zkClient.getData(CLUSTER_STATE, watcher, stat, true); final ClusterState loadedData = ClusterState.load(stat.getVersion(), data, emptySet(), CLUSTER_STATE); + final Set<String> liveNodes = new HashSet<>(this.liveNodes); synchronized (getUpdateLock()) { if (this.legacyClusterStateVersion >= stat.getVersion()) { // Nothing to do, someone else updated same or newer. return; } - this.legacyCollectionStates = loadedData.getCollectionStates(); - this.legacyClusterStateVersion = stat.getVersion(); - for (Map.Entry<String, ClusterState.CollectionRef> entry : this.legacyCollectionStates.entrySet()) { - if (entry.getValue().isLazilyLoaded() == false) { - // a watched collection - trigger notifications - notifyStateWatchers(entry.getKey(), entry.getValue().get()); + LOG.info("Updating legacy cluster state - {} entries in legacyCollectionStates", legacyCollectionStates.size()); + for (Map.Entry<String, CollectionWatch> watchEntry : this.collectionWatches.entrySet()) { + String coll = watchEntry.getKey(); + CollectionWatch collWatch = watchEntry.getValue(); + ClusterState.CollectionRef ref = this.legacyCollectionStates.get(coll); + if (ref == null) + continue; + // watched collection, so this will always be local + DocCollection newState = ref.get(); + if (!collWatch.stateWatchers.isEmpty() + && !Objects.equals(loadedData.getCollectionStates().get(coll).get(), newState)) { + notifyStateWatchers(liveNodes, coll, newState); --- End diff -- I just realized you don't want to call user code while holding the update lock. I think you're going to need to move this out of the synchronized block. In fact.... this is really nasty now that I think about it. In general, you're going to want to defer calling any user code until the current constuctState() operation finishes. Otherwise, the user code is potentially going to see a stale copy of the state that you haven't finished updating yet. I think we're going to have to build a queue of outstanding state watchers to notify and always call them later, probably in an executor. I know that sounds like a bit of work, but I'm not sure I can see how it would be safe otherwise. @markrmiller any thoughts?
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org