Hi Dharam- Regarding...
> But the point for *spring data geode* is,
>
> >> *How to deal with such scenario where node is no more a part of distributed system but app is running?*
>
> Usually one would like to take some automated corrective action on such alerts.

Let me ask you a question: what do you think would happen if you ran the exact same Geode server, but configured using only the Apache Geode public API?

I agree with your assessment above. But, it also requires underlying support from Apache Geode, like the ability to register a listener and capture events (e.g. on Authentication failure) with enough details to be able to provide a means to deal with such error conditions.

Minimally, Apache Geode ought to fail fast and bubble Exceptions out, not eat them, so that SDG could perhaps apply appropriate Exception translation and/or error handling/recovery, etc.

There are several instances where Apache Geode, even when it throws an Exception, does not provide enough details in the Exception and the error message to ascertain the problem. Case in point, something I ran into recently was an incompatibility in the cache DEFAULT Pool with the configuration of (supposedly) the "DEFAULT" Pool when using o.a.g.cache.client.ClientCacheFactory. The Exception being...

Caused by: java.lang.IllegalStateException: Existing cache's default pool was not compatible
    at org.apache.geode.internal.cache.GemFireCacheImpl.determineDefaultPool(GemFireCacheImpl.java:2965) [1]
    at org.apache.geode.cache.client.ClientCacheFactory.basicCreate(ClientCacheFactory.java:253)
    at org.apache.geode.cache.client.ClientCacheFactory.create(ClientCacheFactory.java:214)
    at org.springframework.data.gemfire.client.ClientCacheFactoryBean.createCache(ClientCacheFactoryBean.java:407)
    ....

Of course, when you look into PoolImpl.isCompatible(:Pool) [2], you will find no assistance in what is "deemed incompatible". You literally need to debug this.

So, what can SDG do?

When I look into your *Spring (Data Geode)* XML configuration file, I am 100% certain that it is the CacheServer preventing your SDG configured Geode server from shutting down anyway. The CacheServer Socket listening for client connections is handled by a non-daemon Thread, and in fact, is the only reason your Geode server does not fall through and exit on startup. Remove the CacheServer configuration from your config and your server would have no reason to exist even though it connects to your cluster with the "locators" property set and most likely gets its Region configuration from Cluster Config (since I see you have enabled that as well). The server will literally exit.

I'd argue that if the Geode server gets disconnected, enters a reconnecting state, and fails after a number of (perhaps configured) attempts, or fails because of an Authentication failure, then Geode should shut down the CacheServer and the JVM should exit.

For now, your best bet would be to query and use the cache.getDistributedSystem().isConnected() state, or similar operations on the DistributedSystem API (you can even stop the reconnecting operation). Additionally, get a Thread dump and be certain what Threads are preventing the JVM exit.

Regards,
John

[1] https://github.com/apache/geode/blob/rel/v1.4.0/geode-core/src/main/java/org/apache/geode/internal/cache/GemFireCacheImpl.java#L2965
[2] https://github.com/apache/geode/blob/rel/v1.4.0/geode-core/src/main/java/org/apache/geode/cache/client/internal/PoolImpl.java#L278-L300

On Mon, Apr 16, 2018 at 1:57 AM, Thacker, Dharam <[email protected]> wrote:

> Hi John,
>
> *1. 
> **What type of application are you building to configure your (embedded) peer cache? E.g. is this a Web application with an embedded peer cache instance? Or, are you simply using Spring to configure a peer cache, Geode server node, that is itself not an actual application?*
>
> It is client/server topology where both client & server nodes are configured using spring data geode.
>
> I am using spring data geode to simply configure geode server node [Attached actual config]
>
> *2. *I have enabled enable-auto-reconnect="true" as well.
>
> *3. *As per pulse and gfsh shell, node was thrown out of the distributed system as it does not show there.
>
> *4. *security-udp-dhalgo is coming as “******” on node restart where I am assuming that it might be setting some value by default. I am not sure as it’s masked.
>
> *Ok. I would request geode team to validate on issue part.*
>
> But the point for *spring data geode* is,
>
> >> *How to deal with such scenario where node is no more a part of distributed system but app is running?*
>
> Usually one would like to take some automated corrective action on such alerts.
>
> Thanks & Regards,
> Dharam
>
> *From:* John Blum [[email protected]]
> *Sent:* Monday, April 16, 2018 1:03 PM
> *To:* Thacker, Dharam
> *Cc:* [email protected]; Bruce Schuchardt
> *Subject:* Re: AuthenticationRequiredException on force disconnection
>
> Dharam-
>
> There is nothing *Spring* can do if Apache Geode is not respecting the setting for the Apache Geode property, "*security-udp-dhalgo*". That is all Apache Geode.
>
> Also, I suspect that the AuthenticationRequiredException is not bubbling out because somewhere Geode is eating the Exception in the chain of calls and simply logging it (at log level, "warning", no less, as in it is considered "fatal"). Though, I would point out that even if the Exception was propagated out that it is not necessarily going to cause the JVM to exit either. It all depends on the existing Thread types (e.g. non/daemon) and their state (e.g. BLOCKED).
>
> *What type of application are you building to configure your (embedded) peer cache? E.g. is this a Web application with an embedded peer cache instance? Or, are you simply using Spring to configure a peer cache, Geode server node, that is itself not an actual application?*
>
> A quick Thread dump on the JVM when this Exception occurs will quickly reveal what "non-daemon" Threads are blocking the JVM from shutting down, whether that is perhaps a Web application Thread (e.g. embedded Web server/container request processing Thread, etc) or a Geode Thread, holding up the JVM.
>
> Also keep in mind, that I deliberately disabled [1] the reconnecting state when using SDG, since I would assume that a user is using SDG to build an application using Apache Geode (whether as a client cache app, the common scenario, or even when building embedded peer cache applications). As such, a user would need to explicitly enable this Geode property, for instance, when the user is only using Spring to configure a peer cache, Geode server. To enable this "reconnectable" state in an application is futile since, if a Geode node gets "disconnected", then all Geode object references (from the Cache to Regions, to anything else) are now "stale". Therefore, if you have injected references to Geode objects into your application components (e.g. like a Region into a DAO/Repo), then you are going to have problems.
>
> As far as reporting the state, you can determine whether your peer node (application) is still connected by querying... cache.getDistributedSystem() [2].isConnected() [3], or even isReconnecting() [4]. As far as trying to determine whether an AuthenticationRequiredException has happened, great question!
>
> If I think of anything, I will let you know.
>
> Regards,
> John
>
> [1] https://github.com/spring-projects/spring-data-geode/blob/2.0.6.RELEASE/src/main/java/org/springframework/data/gemfire/CacheFactoryBean.java#L204-L205
> [2] http://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/GemFireCache.html#getDistributedSystem--
> [3] http://geode.apache.org/releases/latest/javadoc/org/apache/geode/distributed/DistributedSystem.html#isConnected--
> [4] http://geode.apache.org/releases/latest/javadoc/org/apache/geode/distributed/DistributedSystem.html#isReconnecting--
>
> On Sun, Apr 15, 2018 at 11:08 PM, Thacker, Dharam <[email protected]> wrote:
>
> Thanks John/Bruce!
>
> But it does not work as expected.
>
> I tried setting (*security-udp-dhalgo=*) in properties file in both locators and servers.
>
> I also confirmed the same by verifying config level logs and using locator with following command which explicitly mentions that *“security-udp-dhalgo” is empty (gemfire.sys.security-udp-dhalgo =)*
>
> >> describe config --member=locator1
> >> describe config --member=server1
>
> But even after that, I see following exception which is same as before.
>
> More, it looks like that once GEODE server member reboot itself after force disconnection, it does not respect *security-udp-dhalgo* override in properties file (*My assumption based on logs*)
>
> I see *security-udp-dhalgo=********* in startup configuration *after member’s attempt to connect to distributed system*.
>
> [warning 2018/04/15 21:23:57.095 EDT event-server-1<ReconnectThread> tid=0x215] Exception occurred while trying to connect the system during reconnect
>
> org.apache.geode.security.AuthenticationRequiredException: Failed to find credentials from [host-001(event-server-1:32054)<ec>:1025]
>     at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.attemptToJoin(GMSJoinLeave.java:424)
>     at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.join(GMSJoinLeave.java:318)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:656)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:745)
>     at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:181)
>     at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:102)
>     at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:89)
>     at org.apache.geode.distributed.internal.DistributionManager.<init>(DistributionManager.java:1112)
>     at org.apache.geode.distributed.internal.DistributionManager.<init>(DistributionManager.java:1160)
>     at org.apache.geode.distributed.internal.DistributionManager.create(DistributionManager.java:531)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:687)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:299)
>     at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:202)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2675)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2508)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:983)
>     at org.apache.geode.distributed.internal.DistributionManager$MyListener.membershipFailure(DistributionManager.java:4307)
>
> John,
>
> This does not kill GEODE application. It still runs as it is. This makes APM tool to assume that application is healthy and is not facing any issues.
>
> What do you suggest to rectify this?
>
> Is there any example if I can report state of GEODE server as “UNHEALTHY”/”DISCONNECTED”?
>
> Is there any example if I can listen to these notifications and come up with some health check service?
>
> Thanks & Regards,
> Dharam
>
> *From:* John Blum [[email protected]]
> *Sent:* Friday, April 13, 2018 2:59 AM
> *To:* [email protected]
> *Subject:* Re: AuthenticationRequiredException on force disconnection
>
> Regarding *Spring*, not really too differently actually, see here [1] (XML) and here [2] (*JavaConfig*) (followed by this [3] and this [4]).
>
> There is even an Annotation-based approach [5] for the curious onlooker.
>
> [1] https://github.com/jxblum/contacts-application/blob/master/configuration-example/src/main/resources/spring-server-cache.xml#L24-L33
> [2] https://github.com/jxblum/contacts-application/blob/master/configuration-example/src/main/java/example/app/spring/java/server/JavaConfiguredGeodeServerApplication.java#L66-L84
> [3] https://github.com/jxblum/contacts-application/blob/master/configuration-example/src/main/java/example/app/spring/java/server/JavaConfiguredGeodeServerApplication.java#L91
> [4] https://github.com/jxblum/contacts-application/blob/master/configuration-example/src/main/java/example/app/spring/java/server/JavaConfiguredGeodeServerApplication.java#L96
> [5] https://github.com/jxblum/contacts-application/blob/master/configuration-example/src/main/java/example/app/spring/annotation/server/AnnotationConfiguredGeodeServerApplication.java
>
> On Thu, Apr 12, 2018 at 2:17 PM, Bruce Schuchardt <[email protected]> wrote:
>
> The setting merely causes Geode to encrypt packets sent over UDP.
>
> On 4/11/18 10:29 AM, Thacker, Dharam wrote:
>
> Would there be any negative impact on disabling 'security-udp-dhalgo' on peer to peer members or pulse or jmx notifications ?
>
> Thanks,
> Dharam
>
> --
> -John
> john.blum10101 (skype)
>
> This message is confidential and subject to terms at: http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.
>
> ---------- Forwarded message ----------
> From: Bruce Schuchardt <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Bcc:
> Date: Thu, 12 Apr 2018 21:15:21 +0000
> Subject: Re: AuthenticationRequiredException on force disconnection
>
> I'm not sure what your development context is so it's hard to answer that question. If you're programmatically creating a cache then set the cache property ConfigurationProperties.SECURITY_UDP_DHALGO to an empty string. If you're using a properties file set it to blank.
>
> security-udp-dhalgo=
>
> -or-
>
> cachefactory.set(SECURITY_UDP_DHALGO, "")
>
> I don't recall how you set properties for the cache under Spring.
>
> On 4/11/18 11:44 PM, Thacker, Dharam wrote:
>
> Hello Bruce,
>
> I have not manually specified this property to enable udp encryption using "security-udp-dhalgo" anywhere. I am using TCP mode only.
>
> Is it by default enabled? If yes, how can I disable it?
>
> I could not find any documentation on it.
>
> Thanks,
> Dharam
>
> Sent with BlackBerry Work (www.blackberry.com)
> ------------------------------
> *From: *"Thacker, Dharam" <[email protected]>
> *Sent: *Apr 11, 2018 10:59 PM
> *To: *[email protected]
> *Subject: *RE: AuthenticationRequiredException on force disconnection
>
> Thank you Bruce!
>
> I will surely open a JIRA soon.
>
> "Geode sends membership information, alerts and on rare occasions a PDX registration message over UDP"
>
> Would there be any negative impact on disabling 'security-udp-dhalgo' on peer to peer members or pulse or jmx notifications ?
>
> Thanks,
> Dharam
>
> Sent with BlackBerry Work (www.blackberry.com)
> ------------------------------
> *From: *Bruce Schuchardt <[email protected]>
> *Sent: *Apr 11, 2018 8:45 PM
> *To: *[email protected]
> *Subject: *Re: AuthenticationRequiredException on force disconnection
>
> That looks like a bug in UDP encryption. Can you open a JIRA ticket to track this? Set the component to "membership". Looking at the unit test suite I don't think there is any coverage for auto-reconnect with security-udp-dhalgo enabled.
>
> As a workaround you could, if you're comfortable doing so, disable security-udp-dhalgo until this is fixed. There are other known issues with this fairly new setting that people have been working on recently.
>
> Geode sends membership information, alerts and on rare occasions a PDX registration message over UDP. No client/server messages are sent over UDP so its use is confined to your server cluster. No messages containing application objects (keys, values, callback args etc) are sent over UDP unless you set disable-tcp=true to disable use of tcp/ip stream sockets.
>
> On 4/11/18 4:38 AM, Thacker, Dharam wrote:
>
> [warning 2018/04/10 02:40:59.541 EDT event-server-1 <ReconnectThread> tid=0x217] Exception occurred while trying to connect the system during reconnect
>
> org.apache.geode.security.AuthenticationRequiredException: Failed to find credentials from [host001(event-server-1:3525)<ec>:1026]
>     at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.attemptToJoin(GMSJoinLeave.java:424)
>     at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.join(GMSJoinLeave.java:318)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.join(GMSMembershipManager.java:656)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.joinDistributedSystem(GMSMembershipManager.java:745)
>     at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:181)
>     at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:102)
>     at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:89)
>     at org.apache.geode.distributed.internal.DistributionManager.<init>(DistributionManager.java:1112)
>     at org.apache.geode.distributed.internal.DistributionManager.<init>(DistributionManager.java:1160)
>     at org.apache.geode.distributed.internal.DistributionManager.create(DistributionManager.java:531)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:687)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:299)
>     at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:202)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2675)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2508)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:983)
>     at org.apache.geode.distributed.internal.DistributionManager$MyListener.membershipFailure(DistributionManager.java:4307)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1530)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.lambda$forceDisconnect$0(GMSMembershipManager.java:2550)
>     at java.lang.Thread.run(Thread.java:745)
>
> --
> -John
> john.blum10101 (skype)
>

--
-John
john.blum10101 (skype)
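
A minimal sketch of the health check discussed in this thread, not something shipped with Apache Geode or Spring Data Geode. The class name `MembershipWatchdog`, its parameters and the failure threshold are all hypothetical. It polls a connection probe (in a real server this is assumed to be `() -> cache.getDistributedSystem().isConnected()`) and runs a corrective action once the probe has failed a configured number of times in a row. It also lists the live non-daemon Threads, which is the part of a Thread dump that tells you what keeps the JVM from exiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Apache Geode or SDG.
class MembershipWatchdog implements AutoCloseable {

    // Assumed probe: () -> cache.getDistributedSystem().isConnected()
    private final BooleanSupplier connected;
    private final int maxFailures;
    private final Runnable onUnhealthy;
    private final ScheduledExecutorService scheduler;
    private int consecutiveFailures;
    private boolean alerted;

    MembershipWatchdog(BooleanSupplier connected, int maxFailures, Runnable onUnhealthy) {
        this.connected = connected;
        this.maxFailures = maxFailures;
        this.onUnhealthy = onUnhealthy;
        // Daemon thread, so the watchdog itself never keeps the JVM alive.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(task -> {
            Thread thread = new Thread(task, "membership-watchdog");
            thread.setDaemon(true);
            return thread;
        });
    }

    // Schedule the probe at a fixed period.
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::checkOnce, period, period, unit);
    }

    // Run one probe; returns true when the member reports itself connected.
    synchronized boolean checkOnce() {
        boolean up;
        try {
            up = connected.getAsBoolean();
        } catch (RuntimeException e) {
            up = false; // a probe that throws is treated as disconnected
        }
        if (up) {
            consecutiveFailures = 0;
            alerted = false;
            return true;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= maxFailures && !alerted) {
            alerted = true; // fire once per outage, re-armed after recovery
            onUnhealthy.run();
        }
        return false;
    }

    // Names of live non-daemon threads, i.e. the ones holding the JVM open.
    static List<String> nonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.isAlive() && !thread.isDaemon()) {
                names.add(thread.getName());
            }
        }
        return names;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Wiring it up might look like `new MembershipWatchdog(() -> cache.getDistributedSystem().isConnected(), 3, () -> System.exit(1)).start(10, TimeUnit.SECONDS)`. Whether exiting the JVM is the right corrective action depends on your deployment, and with auto-reconnect enabled you would probably also want to consult `isReconnecting()` before acting.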
