Heh! So, apparently the ClientMembership <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)> class [0] is the (only) way to register a ClientMembershipListener, for both clients and servers. (Gotta love those statics; foolish me for thinking there was a more OO way to do that, i.e. via a [ClientCache|Pool]Factory before the ClientCache/Pool is constructed and initialized, or even after the fact, for that matter, with either the ClientCache or the Pool API itself. ;)
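
For what it's worth, here is a rough sketch of how the test could block until a memberJoined callback arrives. It uses only java.util.concurrent so it stands alone; ConnectionGate, onMemberJoined and awaitConnected are names I made up for illustration, not GemFire API. The assumption is that onMemberJoined() would be called from memberJoined(event) on a listener registered through ClientMembership.registerClientMembershipListener(..) before the client attempts to connect.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of GemFire/Geode): lets a test block until
// something signals that the client has connected to a server.
final class ConnectionGate {

  private final CountDownLatch connected = new CountDownLatch(1);

  // Assumption: invoked from ClientMembershipListener.memberJoined(event) on a
  // listener registered via ClientMembership.registerClientMembershipListener(..).
  void onMemberJoined() {
    connected.countDown();
  }

  // Returns true if a join was signaled within the timeout, false if the
  // timeout elapsed or the waiting thread was interrupted.
  boolean awaitConnected(long timeout, TimeUnit unit) {
    try {
      return connected.await(timeout, unit);
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      return false;
    }
  }
}
```

The test would then call something like gate.awaitConnected(20, TimeUnit.SECONDS) before its first cache operation and fail fast if it returns false. A latch is used rather than a polling loop so the waiting thread wakes as soon as the callback fires.
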

This led me to wonder how the (memberJoined) event actually gets fired when a client successfully connects to a server, and this appears to be my answer <https://github.com/apache/incubator-geode/blob/develop/gemfire-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/InternalClientMembership.java#L369-L393> [1].

-j

[0] - http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)
[1] - https://github.com/apache/incubator-geode/blob/develop/gemfire-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/InternalClientMembership.java#L369-L393

On Wed, Jan 20, 2016 at 6:59 PM, John Blum <[email protected]> wrote:

> Hi Barry-
>
> Thank you for the quick response.
>
> In my test class, I only start one (standalone) server (i.e. no locators,
> no other servers, etc) during setup, and the client (test) connects
> directly to that server. Unfortunately, without explicit coordination, it
> is entirely possible that the client will start first and attempt to
> connect while performing a cache operation before the server is fully
> started/initialized (and technically listening for/accepting client
> connections).
>
> So, unless there is retry logic in the Pool to try and connect until X
> number of attempts at Y intervals has been made, I am not sure how the
> ClientMembershipListener will work in this case, especially since the
> client has not connected (to anything) yet.
>
>
> *1. How do you register a ClientMembershipListener on the client?*
>
> I see that it can be registered with the ClientMembership
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)>
> [0]
> class, but that appears to be a management component used on the server.
> However, based on the *Javadoc* description (for memberJoined(event)
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)>
> [1])...
>
> *"Invoked when a client has connected to this process or when this process
> has connected to a CacheServer."*
>
> It does imply the ClientMembershipListener can be registered and used on
> the client, assuming the 2nd reference to "this process" actually refers to
> the client.
>
>
> 2. Then, I was thinking, if a ClientMembershipListener can somehow be
> registered on the client (where?), that I could potentially implement
> memberJoined(event)
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)>
> to
> block any client cache operations (implemented in the test) until the
> client has actually connected. I assume this is what you mean by...
>
> *"You could possibly use a ClientMembershipListener. If you install one in
> your client, the memberJoined callback will tell you when the client
> connects to the server."*
>
> However, I do not see where to register the ClientMembershipListener;
> neither the ClientCacheFactory nor the PoolFactory has any such API to
> perform the registration, and presumably, this would need to be done prior
> to connecting in order to receive the event when the client does, finally,
> successfully connect.
>
>
> 3. Or, I was also thinking, based on the "*How the Pool Connects to a
> Server*" section in the GemFire User Guide here
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/topologies_and_comm/topology_concepts/how_the_pool_manages_connections.html>,
> that it may also be feasible to use a combination of freeConnectionTimeout
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setFreeConnectionTimeout(int)>
> [2] with
> pingIntervals
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setPingInterval(long)>
> [3]
> (and perhaps, readTimeout
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)>
> [4])
> to delay the cache operation until the server becomes available. Though, my
> thinking here may be off base, and this approach seems less reliable given
> the time-dependent, race condition nature of it.
>
>
> Thanks,
> John
>
> [0] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)
> [1] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)
> [2] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setFreeConnectionTimeout(int)
> [3] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setPingInterval(long)
> [4] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)
>
>
> On Wed, Jan 20, 2016 at 4:11 PM, Barry Oglesby <[email protected]>
> wrote:
>
>> John,
>>
>> CacheServer isRunning is not a reliable way to determine whether the
>> CacheServer acceptor is actually listening for connections.
>> BridgeServerImpl isRunning (the implementation) asks if the Acceptor is
>> non-null and isRunning, which in turn just asks whether it (the Acceptor)
>> is not shutdown. The CacheServer isRunning could be true before the
>> Acceptor is listening for connections.
>>
>> You could possibly use a ClientMembershipListener. If you install one in
>> your client, the memberJoined callback will tell you when the client
>> connects to the server. This will more-or-less do what your custom socket
>> code is doing now. It doesn't necessarily tell you all the servers though -
>> only the ones that the client has connected to.
>>
>> There is another option to see all the servers. It only works if you have
>> a locator, and it uses some java public (but not Geode public) API. This
>> API can be used by the client to determine how many servers there are and
>> their locations. I can point you to that if you're interested.
>>
>>
>> Barry Oglesby
>> GemFire Advanced Customer Engineering (ACE)
>> For immediate support please contact Pivotal Support at
>> http://support.pivotal.io/
>>
>>
>> On Wed, Jan 20, 2016 at 3:28 PM, John Blum <[email protected]> wrote:
>>
>>> Is there a recommended, (more) reliable means to determine whether a
>>> CacheServer (listening for cache clients) has successfully started in a
>>> GemFire server from the client-side?
>>>
>>> Currently, I am employing a form of inter-process communication (e.g.
>>> control file) to coordinate the successful startup and general readiness of
>>> a server before a client cache attempts to connect inside an integration
>>> test.
>>>
>>> In this case, the test acts as the cache client and connects to the
>>> server, but not before forking a GemFire server process during setup, and
>>> ideally not before the server is ready (and specifically, not until
>>> ServerSocket is "accepting" connections).
>>>
>>> For the most part, this works fairly consistently, except there exist
>>> potential timing issues in the test for server readiness (and specifically,
>>> CacheServer listening for connections), particularly on the server
>>> before writing the control file. For example, I have included this code
>>> block...
>>>
>>> assertThat(waitOnCondition(new Condition() {
>>>   @Override public boolean evaluate() {
>>>     return gemfireCacheServer.isRunning();
>>>   }
>>> }), is(true));
>>>
>>> writeProcessControlFile(WORKING_DIRECTORY);
>>>
>>> The client (i.e. test) then checks for the presence of this control file
>>> before executing the tests.
>>>
>>> The waitOnCondition(:Condition) method (see below) functions properly,
>>> waiting on the condition for a specified duration (defaults to 20 seconds),
>>> checking every 500 ms. However, it would seem CacheServer.isRunning()
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/server/CacheServer.html#isRunning()>
>>> [0] can
>>> potentially return *true* before the ServerSocket listening for client
>>> connections is actually "accepting" connections. It is less than clear
>>> from the Javadoc (and thus, the user's POV) what
>>> CacheServer.isRunning() actually does (without having to dig into code).
>>>
>>> So, I thought, perhaps a more reliable means to determine whether the
>>> server is actually ready, listening for and accepting connections, would be
>>> to just open a Socket connection on the client. If I can connect, then
>>> the server is presumably ready. So, I coded...
>>>
>>> boolean waitForCacheServerToStart(final String host,
>>>     final int port, long duration) {
>>>   return waitOnCondition(new Condition() {
>>>     AtomicBoolean connected = new AtomicBoolean(false);
>>>
>>>     public boolean evaluate() {
>>>       Socket socket = null;
>>>
>>>       try {
>>>         // NOTE: the following code is not meant to be an atomic,
>>>         // compound action (a possible race condition);
>>>         // opening another connection (at the expense of using
>>>         // system resources) after connectivity has already been
>>>         // established is not detrimental in this use case
>>>         if (!connected.get()) {
>>>           socket = new Socket(host, port);
>>>           connected.set(true);
>>>         }
>>>       }
>>>       catch (IOException ignore) {
>>>       }
>>>       finally {
>>>         GemFireUtils.close(socket);
>>>       }
>>>
>>>       return connected.get();
>>>     }
>>>   }, duration);
>>> }
>>>
>>> This seems to work OK, though, since I turn around and close the
>>> connection right away, before completing the "handshake", Geode throws...
>>>
>>> [warn 2016/01/20 14:12:42.599 PST <Handshaker localhost/127.0.0.1:12480 Thread 0> tid=0x22] Bridge server: failed accepting client connection {0}
>>> java.io.EOFException
>>>   at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.handleNewClientConnection(AcceptorImpl.java:1508)
>>>   at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl$5.run(AcceptorImpl.java:1391)
>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>   at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> There really does not appear to be a better way using the Geode API, and
>>> in particular, the PoolFactory
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html>
>>> [1],
>>> to set, say, a *retryConnectionTimeout* property along with a
>>> *retryConnectionAttempts* property when populating the pool with
>>> connections, at least initially during startup, or even when adding more
>>> connections to the pool (up to the "max") during heavier loads, unlike
>>> similar properties for read/requests operations... setReadTimeout(:int)
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)>
>>> [2]
>>> and setRetryAttempts(:int)
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setRetryAttempts(int)>
>>> [3].
>>>
>>> Am I missing anything? Other ideas/recommendations?
>>>
>>> Thanks,
>>> -John
>>>
>>> [0] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/server/CacheServer.html#isRunning()
>>> [1] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html
>>> [2] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)
>>> [3] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setRetryAttempts(int)
>>>
>>>
>>> P.S. code for waitOnCondition(..) for the curious minded, ;-)
>>>
>>> static final long DEFAULT_WAIT_DURATION = TimeUnit.SECONDS.toMillis(20);
>>> static final long DEFAULT_WAIT_INTERVAL = 500L;
>>>
>>> @SuppressWarnings("unused")
>>> boolean waitOnCondition(Condition condition) {
>>>   return waitOnCondition(condition, DEFAULT_WAIT_DURATION);
>>> }
>>>
>>> @SuppressWarnings("all")
>>> boolean waitOnCondition(Condition condition, long duration) {
>>>   final long timeout = (System.currentTimeMillis() + duration);
>>>
>>>   try {
>>>     while (!condition.evaluate() && System.currentTimeMillis() < timeout) {
>>>       synchronized (condition) {
>>>         TimeUnit.MILLISECONDS.timedWait(condition, DEFAULT_WAIT_INTERVAL);
>>>       }
>>>     }
>>>   }
>>>   catch (InterruptedException e) {
>>>     Thread.currentThread().interrupt();
>>>   }
>>>
>>>   return condition.evaluate();
>>> }
>>>
>>>
>>
>
>
> --
> -John
> 503-504-8657
> john.blum10101 (skype)
>

--
-John
503-504-8657
john.blum10101 (skype)
