EdColeman opened a new issue, #2689: URL: https://github.com/apache/accumulo/issues/2689
From a PR comment on the ZooPropStore.get() method. This can be reviewed as a follow-on to the single node prop store changes.

@keith-turner said:

Tracing through the code, I think the following sequence of events could happen. Does this seem correct? I want to make sure I am following the code correctly.

1. Thread 1 calls checkZkConnection() and does not block because the connection is currently ok.
2. Thread 2 sees a connection loss event in the watcher, clears the ready state, and queues up an event to clear the cache.
3. Thread 3 clears the cache (running in an executor on behalf of the event queued by thread 2).
4. Thread 1 executes cache.get(). Because the cache was cleared, a loader is executed in a thread pool.
5. Thread 4 executes the code to load data from ZK. The connection is still lost, so what happens here? This thread is executing on behalf of the cache.get() initiated by thread 1.

I am not sure exactly what happens in step 5 above. Does the load from ZK retry when it sees a connection lost exception? If it does, then the checkZkConnection() is redundant. If it does not, then the checkZkConnection() does not always prevent problems in the case of connection loss, and maybe it should retry in order to eliminate the race condition.
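To make the retry option raised in the question above concrete, here is a minimal sketch of a loader that retries on connection loss. It is not taken from Accumulo: `RetryingLoadSketch`, `ConnectionLostException`, `loadWithRetry`, and the simulated read are all hypothetical stand-ins, and the fixed pause between attempts is an assumption chosen for brevity. It says nothing about what the actual ZooPropStore loader does in step 5.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: none of these names come from Accumulo or ZooKeeper. */
class RetryingLoadSketch {

  /** Hypothetical stand-in for a ZooKeeper connection-loss error. */
  static class ConnectionLostException extends Exception {
    ConnectionLostException(String msg) {
      super(msg);
    }
  }

  /**
   * Runs the loader and retries, after a fixed pause, only when the
   * connection was lost. Any other exception propagates immediately.
   */
  static <T> T loadWithRetry(Callable<T> loader, int maxAttempts, long pauseMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    ConnectionLostException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return loader.call();
      } catch (ConnectionLostException e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(pauseMillis);
        }
      }
    }
    // All attempts failed with connection loss; surface the last failure.
    throw last;
  }

  /** Simulates step 5: the first two reads see a lost connection, the third succeeds. */
  static String demo() {
    AtomicInteger calls = new AtomicInteger();
    Callable<String> flakyRead = () -> {
      if (calls.incrementAndGet() <= 2) {
        throw new ConnectionLostException("simulated connection loss");
      }
      return "props-v1";
    };
    try {
      String value = loadWithRetry(flakyRead, 5, 10L);
      return value + " after " + calls.get() + " attempts";
    } catch (Exception e) {
      throw new IllegalStateException("load failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

If the loader behaved like this sketch, a check such as checkZkConnection() before cache.get() would add little, because the load itself would ride out the connection loss. If the loader does not retry, the up-front check alone cannot close the window described in steps 1 through 5.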
