This looks to me somewhat related to the JIRA tickets below, but it needs
investigation whether those fixes would work in the situations below.
https://jira.apache.org/jira/browse/GEODE-3649
https://jira.apache.org/jira/browse/GEODE-3580 (does not look
fixed, even though it is marked as FIXED in 1.3.0)
Thanks,
Dharam
*From:*Thacker, Dharam
*Sent:* Tuesday, February 05, 2019 2:39 PM
*To:* 'user@geode.apache.org' <user@geode.apache.org>
*Subject:* RE: NullPointerException with GMS services (Geode 1.6.0)
Hi Udo,
Did you get a chance to review the code?
In the sample below, I used ReflectionBasedAutoSerializer, but
the same issue occurs with ‘MappingPdxSerializer’ as well.
Please let me know if I can help you with any more samples.
Thanks,
Dharam
*From:*Dharam Thacker [mailto:dharamthacke...@gmail.com]
*Sent:* Monday, February 04, 2019 5:41 PM
*To:* user@geode.apache.org
*Subject:* Re: NullPointerException with GMS services (Geode 1.6.0)
Hi Udo,
Sure! I have attached a demo project using apache-geode 1.8.0 &
spring-data-geode 2.1.4.RELEASE.
Hope that helps!
Download demo.tar from ->
https://drive.google.com/open?id=10j3NQEkb74n9tNBuWxza9J772p7IYg-7
*_Steps:_*
1. Start the locator
$GFSH start locator --name=locator1 --port=10334 --J=-Denable-network-partition-detection=false --log-level=config
2. Start server1
java -jar demo-0.0.1-SNAPSHOT.jar -Ddemo.name=demo-server-1 -Ddemo.port=40440
3. Start server2
java -jar demo-0.0.1-SNAPSHOT.jar -Ddemo.name=demo-server-2 -Ddemo.port=40441
4. Execute the command below to suspend server 'demo-server-2'
PID=`ps auxwww | fgrep 'java' | fgrep 'demo-server-2' | awk '{print $2}'` ; kill -STOP $PID
5. Confirm from the logs/Pulse that server 'demo-server-2' has been thrown out of the distributed system
6. Resume server 'demo-server-2' and let it bootstrap and rejoin the distributed system (this will take a few seconds, so be patient!)
PID=`ps auxwww | fgrep 'java' | fgrep 'demo-server-2' | awk '{print $2}'` ; kill -CONT $PID
7. Let the member make at least 2 reconnect attempts (you will see all the stack traces I provided earlier)
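Steps 4 and 6 above can be wrapped in a small helper so that the same process lookup is used for both signals. This is only a sketch: the function names `pids_from_ps` and `signal_member` are made up for illustration and are not part of the demo project, and it assumes the member name (for example 'demo-server-2') appears on the JVM command line, as it does in steps 2 and 3.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the demo project) wrapping steps 4 and 6.

# Read `ps auxwww`-style lines on stdin and print the PID column (field 2)
# of every line that mentions both "java" and the given member name.
pids_from_ps() {
  local member="$1"
  grep -F 'java' | grep -F -- "$member" | awk '{ print $2 }'
}

# Send a signal such as STOP or CONT to the JVM(s) of the named member.
signal_member() {
  local sig="$1" member="$2" pids
  pids="$(ps auxwww | pids_from_ps "$member")"
  if [ -z "$pids" ]; then
    echo "no java process found for member '$member'" >&2
    return 1
  fi
  # Intentionally unquoted: $pids may hold several PIDs, one per line.
  kill "-$sig" $pids
}
```

With these loaded, step 4 would become `signal_member STOP demo-server-2` and step 6 `signal_member CONT demo-server-2`.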
Thanks,
- Dharam Thacker
On Fri, Feb 1, 2019 at 11:23 PM Udo Kohlmeyer <u...@apache.org> wrote:
Hi there Dharam,
Given that you are testing with Geode 1.6 + 1.8 and are
seemingly using Spring Data Geode, could you possibly
provide a little more information on HOW you are starting
the servers? Also, which version of Spring Data Geode are
you using? And, if you could, please share a simplified
configuration / Java classes that cause this to happen.
--Udo
On 2/1/19 04:37, Thacker, Dharam wrote:
John/Bruce,
It’s the same behavior even with “MappingPdxSerializer”
for “Force Reconnect” :(
I would really appreciate it if we could address this in the
GEODE 1.9.0 release :)
It makes 6 attempts to reconnect to the distributed system.
On the first attempt it still complains about the
“pdxSerializer” issue, and on later attempts it always gives:
[info 2019-02-01 18:01:33.878 IST <ReconnectThread> internal.ClusterDistributionManager tid=142] Serial Queue info : THROTTLE_PERCENT: 0.75 SERIAL_QUEUE_BYTE_LIMIT :41943040 SERIAL_QUEUE_THROTTLE :31457280 TOTAL_SERIAL_QUEUE_BYTE_LIMIT :83886080 TOTAL_SERIAL_QUEUE_THROTTLE :31457280 SERIAL_QUEUE_SIZE_LIMIT :20000 SERIAL_QUEUE_SIZE_THROTTLE :15000
[info 2019-02-01 18:01:33.883 IST <ReconnectThread> gms.Services tid=142] Starting membership services
[fatal 2019-02-01 18:01:33.883 IST <ReconnectThread> gms.Services tid=142] Unexpected exception while booting membership services
java.lang.NullPointerException
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.establishLocalAddress(JGroupsMessenger.java:483)
…
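To check how many of the reconnect attempts ended in this failure, the fatal entries can be counted in the member's log file. A minimal sketch, assuming each log entry header and its message sit on a single line in the actual log file (the wrapping seen in this thread comes from the email client); `count_boot_failures` is a made-up name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count failed membership boots logged by the reconnect thread.
count_boot_failures() {
  local logfile="$1"
  # Keep only ReconnectThread entries, then count those carrying the message
  # from the log excerpt. grep -c prints 0 when nothing matches but exits
  # non-zero, so `|| true` keeps the function usable under `set -e`.
  grep -F '<ReconnectThread>' "$logfile" \
    | grep -c -F 'Unexpected exception while booting membership services' || true
}
```

Running `count_boot_failures <path-to-member-log>` after step 7 of the reproduction should then print one count per failed boot of the membership services.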
@EnablePdx(readSerialized = true, serializerBeanName = "mappingPdxSerializer")
@Bean
public MappingPdxSerializer mappingPdxSerializer() {
    MappingPdxSerializer mappingPdxSerializer = new MappingPdxSerializer();
    Map<Class<?>, EntityInstantiator> customInstantiators = new HashMap<>();
    EntityInstantiator instantiator = ReflectionEntityInstantiator.INSTANCE;
    customInstantiators.put(A1.class, instantiator);
    customInstantiators.put(A1Audit.class, instantiator);
    customInstantiators.put(B1.class, instantiator);
    customInstantiators.put(B1Audit.class, instantiator);
    return mappingPdxSerializer;
}
One more issue with MappingPdxSerializer:
Class A1 -> region A1
a1Repository.findAll() -> This should give me List<A1> as
the standard contract.
- Works with ReflectionBasedAutoSerializer
- Fails with “MappingPdxSerializer” (it gives
List<PdxInstanceImpl>, and you have to call
pdxInstance.getObject() yourself to get the real instance of A1)
Thanks,
Dharam
*From:*Thacker, Dharam
*Sent:* Friday, February 01, 2019 12:58 PM
*To:* 'user@geode.apache.org' <user@geode.apache.org>
*Subject:* RE: NullPointerException with GMS services (Geode 1.6.0)
Hi John/Bruce,
The typo in the email was mine; the code does not have it.
Here are the detailed logs explaining what might have
happened. It looks like a CRITICAL BUG in both 1.6.0 and
1.8.0.
[Attached email thread for GEODE 1.8.0 as well]
*_ISSUES:_*
- One common scenario that fails: a member is unable to
rejoin the distributed system after a FORCE DISCONNECT
- Why does the PdxSerializer play a role while bootstrapping
a cache member? (Note: no disk persistence is being used,
only ReflectionBasedAutoSerializer)
- GMS is throwing a NullPointerException in this situation
- The same error message and stack trace for GMS is
printed once each at the fatal/error/warn log levels, which
does not look consistent.
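To see how the reconnect entries are spread across log levels, they can be tallied per level. A rough sketch, assuming one entry header per line in the real log file; `reconnect_levels` is a hypothetical name and not a Geode tool:

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally ReconnectThread log entries by level.
reconnect_levels() {
  local logfile="$1"
  awk '
    # Entry headers look like "[fatal 2019-02-01 ... <ReconnectThread> ...]".
    /<ReconnectThread>/ && match($0, /^\[[a-z]+/) {
      level = substr($0, 2, RLENGTH - 1)   # level name after the opening bracket
      count[level]++
    }
    END { for (level in count) print level, count[level] }
  ' "$logfile" | sort
}
```

The output is one `level count` pair per line, which makes it easy to spot the same failure being reported at several levels.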
[Pending] : Dharam to find out if that’s the same issue
with MappingPdxSerializer as suggested by John
Current logs both for GEODE – 1.6.0 & GEODE 1.8.0
[warn 2019-02-01 00:05:29.000 EST <StatSampler> statistics.HostStatSampler] Statistics sampling thread detected a wakeup delay of 16617 ms, indicating a possible resource issue. Check the GC, memory, and CPU statistics.
[fatal 2019-02-01 00:05:29.937 EST <unicast receiver,iaasn00005748-49024> gms.Services] Membership service failure: Failed to acknowledge a new membership view and then failed tcp/ip connection attempt
org.apache.geode.ForcedDisconnectException: Failed to acknowledge a new membership view and then failed tcp/ip connection attempt
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.forceDisconnect(GMSMembershipManager.java:2503)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1049)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processRemoveRequest(GMSJoinLeave.java:654)
    at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1810)
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1301)
    at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
    at org.jgroups.JChannel.up(JChannel.java:741)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
    at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
    at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
    at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
    at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
    at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
    at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
    at org.jgroups.protocols.TP.receive(TP.java:1714)
    at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
    at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
    at java.lang.Thread.run(Thread.java:745)
[error 2019-02-01 00:06:35.651 EST <ReconnectThread> cache.GemFireCacheImpl]
org.apache.geode.cache.CacheClosedException: Could not PDX serialize because the cache was closed
[warn 2019-02-01 00:06:35.652 EST <ReconnectThread> internal.InternalDistributedSystem] Exception occurred while trying to create the cache during reconnect
org.apache.geode.cache.CacheClosedException: Could not PDX serialize because the cache was closed
    at org.apache.geode.pdx.internal.TypeRegistry.getPdxSerializer(TypeRegistry.java:317)
    at org.apache.geode.internal.InternalDataSerializer.writeUserObject(InternalDataSerializer.java:1648)
    at org.apache.geode.internal.InternalDataSerializer.writeWellKnownObject(InternalDataSerializer.java:1548)
    at org.apache.geode.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2200)
    at org.apache.geode.DataSerializer.writeObject(DataSerializer.java:2952)
    at org.apache.geode.internal.cache.MemberFunctionStreamingMessage.toData(MemberFunctionStreamingMessage.java:315)
    at org.apache.geode.internal.InternalDataSerializer.invokeToData(InternalDataSerializer.java:2398)
    at org.apache.geode.internal.InternalDataSerializer.writeDSFID(InternalDataSerializer.java:1517)
    at org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:234)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:394)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:251)
    at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:616)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1686)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1864)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2860)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2780)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2819)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1523)
    at org.apache.geode.internal.cache.execute.StreamingFunctionOperation.getFunctionResultFrom(StreamingFunctionOperation.java:107)
    at org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:151)
    at org.apache.geode.internal.cache.execute.MemberFunctionExecutor.executeFunction(MemberFunctionExecutor.java:189)
    at org.apache.geode.internal.cache.execute.AbstractExecution.execute(AbstractExecution.java:392)
    at org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromOneLocator(ClusterConfigurationLoader.java:312)
    at org.apache.geode.internal.cache.ClusterConfigurationLoader.requestConfigurationFromLocators(ClusterConfigurationLoader.java:282)
    at org.apache.geode.internal.cache.GemFireCacheImpl.requestSharedConfiguration(GemFireCacheImpl.java:1066)
    at org.apache.geode.internal.cache.GemFireCacheImpl.<init>(GemFireCacheImpl.java:857)
    at org.apache.geode.internal.cache.GemFireCacheImpl.basicCreate(GemFireCacheImpl.java:794)
    at org.apache.geode.internal.cache.GemFireCacheImpl.create(GemFireCacheImpl.java:773)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2765)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2530)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1044)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:3406)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1534)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.lambda$forceDisconnect$3(GMSMembershipManager.java:2531)
    at java.lang.Thread.run(Thread.java:745)
[fatal 2019-02-01 00:07:35.810 EST <ReconnectThread> gms.Services] Unexpected exception while booting membership services
java.lang.NullPointerException
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.establishLocalAddress(JGroupsMessenger.java:483)
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:361)
    at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:146)
    at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:105)
    at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:771)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:889)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:533)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:769)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:362)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:348)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:342)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:215)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2704)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2530)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1044)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:3406)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1534)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.lambda$forceDisconnect$3(GMSMembershipManager.java:2531)
    at java.lang.Thread.run(Thread.java:745)
[error 2019-02-01 00:07:35.810 EST <ReconnectThread> gms.Services] Unexpected problem starting up membership services
java.lang.NullPointerException
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.establishLocalAddress(JGroupsMessenger.java:483)
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:361)
    at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:146)
    at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:105)
    at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:771)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:889)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:533)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:769)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:362)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:348)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:342)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:215)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2704)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2530)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1044)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:3406)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1534)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.lambda$forceDisconnect$3(GMSMembershipManager.java:2531)
    at java.lang.Thread.run(Thread.java:745)
Thanks,
Dharam
*From:*John Blum [mailto:jb...@pivotal.io]
*Sent:* Tuesday, January 29, 2019 3:49 AM
*To:* user@geode.apache.org
*Subject:* Re: NullPointerException with GMS services (Geode 1.6.0)
Although there is an apparent problem with your Locator
configuration property, which can/will lead to an NPE
thrown by Geode in this case (as I reported in
https://issues.apache.org/jira/browse/GEODE-6153, which
is not exactly the same as your problem), as Bruce
points out, this is occurring during reconnect.
If the default value of the Locator property had been
used (which it apparently wasn't in this case), then the
node would have failed even on initial startup. So,
something else is going wrong here.
On Mon, Jan 28, 2019 at 10:32 AM Bruce Schuchardt <bschucha...@pivotal.io> wrote:
These NPEs are probably not your main problem since
they're occurring in a Reconnect Thread. Something
bad happened when you tried to start these caches.
They joined the cluster and then were kicked out for
some reason. The log file for that process will
likely tell you what the problem was. Post it or PM
it to me if you want help.
On 1/27/19 10:56 PM, Thacker, Dharam wrote:
java.lang.NullPointerException
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.establishLocalAddress(JGroupsMessenger.java:468)
    at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:355)
    at org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:157)
    at org.apache.geode.distributed.internal.membership.gms.GMSMemberFactory.newMembershipManager(GMSMemberFactory.java:106)
    at org.apache.geode.distributed.internal.membership.MemberFactory.newMembershipManager(MemberFactory.java:90)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1027)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:1061)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:554)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:763)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:355)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:341)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.newInstance(InternalDistributedSystem.java:335)
    at org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:211)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2736)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2560)
    at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1041)
    at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:4033)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.uncleanShutdown(GMSMembershipManager.java:1554)
    at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.lambda$forceDisconnect$1(GMSMembershipManager.java:2561)
    at java.lang.Thread.run(Thread.java:745)
--
-John
john.blum10101 (skype)
This message is confidential and subject to terms at:
https://www.jpmorgan.com/emaildisclaimer including on
confidentiality, legal privilege, viruses and monitoring
of electronic messages. If you are not the intended
recipient, please delete this message and notify the
sender immediately. Any unauthorized use is strictly
prohibited.