[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615307#comment-17615307 ]

ASF subversion and git services commented on GEODE-9484:

Commit 7838d5d5210ab3afe243b2045242d7bfd6acc6ed in geode's branch refs/heads/master from Mario Kevo
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7838d5d521 ]

GEODE-9484: Improve sending message to multy destinations (#7664) (#7853)

Co-authored-by: Mario Ivanac <48509724+miva...@users.noreply.github.com>

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.1, 1.16.0
>
>
> We have configured a replicated region with 3 or more servers, and the client is
> configured with a read timeout equal to or smaller than the member timeout.
> If one of the replicated servers is shut down while the client is putting data
> into the region, we observe data inconsistency.
>
> Part of the data is written to the server connected to the client, but it is
> missing on the remaining replicated servers.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
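Editor's note: the triggering condition in the description above can be written as a one-line predicate. This is only a minimal sketch, assuming both timeouts are expressed in milliseconds. The class and method names are hypothetical and are not part of Geode.

```java
/** Hypothetical helper, not part of Geode: restates the condition from the issue description. */
final class ReportedCondition {

    /**
     * Returns true when the client read timeout is the same as or smaller than
     * the member timeout, which is the configuration the issue reports as affected.
     * Both values are assumed to be in milliseconds.
     */
    static boolean matches(long clientReadTimeoutMs, long memberTimeoutMs) {
        return clientReadTimeoutMs <= memberTimeoutMs;
    }

    public static void main(String[] args) {
        System.out.println(matches(5_000L, 5_000L));   // equal timeouts
        System.out.println(matches(10_000L, 5_000L));  // read timeout larger than member timeout
    }
}
```

The check says nothing about whether a given cluster will actually lose data. It only flags the configuration that the reporter describes.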
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17602980#comment-17602980 ]

ASF subversion and git services commented on GEODE-9484:

Commit 7838d5d5210ab3afe243b2045242d7bfd6acc6ed in geode's branch refs/heads/support/1.15 from Mario Kevo
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7838d5d521 ]

GEODE-9484: Improve sending message to multy destinations (#7664) (#7853)

Co-authored-by: Mario Ivanac <48509724+miva...@users.noreply.github.com>

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.1, 1.16.0
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561512#comment-17561512 ]

ASF subversion and git services commented on GEODE-9484:

Commit 1d25728e09f788e2ca043cbaf393f01c4df576f9 in geode's branch refs/heads/develop from Mario Ivanac
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=1d25728e09 ]

GEODE-9484: Improve sending message to multy destinations (#7664)

* GEODE-9484: New solution to first try only one attempt to create all connections
* GEODE-9484: added fix for NPE

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.1
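Editor's note: the commit message above says only "first try only one attempt to create all connections". One possible reading of that sentence is a two-pass strategy: try every destination exactly once, then retry only the ones that failed. The sketch below illustrates that reading. It is not the code from the commit, and every name in it (`TwoPassConnector`, `connectAll`, `connectOnce`, `maxRetries`) is hypothetical. The motivation given in the comments (an unreachable member should not delay the healthy ones) is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of a "one attempt first" strategy; not Geode's DirectChannel code. */
final class TwoPassConnector {

    /**
     * Tries every destination once, then retries only the failures.
     * connectOnce makes a single attempt and returns null when it fails.
     */
    static <D, C> Map<D, C> connectAll(List<D> destinations,
                                       Function<D, C> connectOnce,
                                       int maxRetries) {
        Map<D, C> connections = new LinkedHashMap<>();
        List<D> pending = new ArrayList<>();

        // First pass: exactly one attempt per destination (assumed goal:
        // a single unreachable member does not hold up the healthy ones).
        for (D destination : destinations) {
            C connection = connectOnce.apply(destination);
            if (connection != null) {
                connections.put(destination, connection);
            } else {
                pending.add(destination);
            }
        }

        // Later passes: retry only the destinations that are still unconnected.
        for (int attempt = 0; attempt < maxRetries && !pending.isEmpty(); attempt++) {
            List<D> stillPending = new ArrayList<>();
            for (D destination : pending) {
                C connection = connectOnce.apply(destination);
                if (connection != null) {
                    connections.put(destination, connection);
                } else {
                    stillPending.add(destination);
                }
            }
            pending = stillPending;
        }
        return connections;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("server-0", "server-1", "server-2");
        // server-1 stands in for the member that is down and never connects.
        Map<String, String> result =
            connectAll(members, m -> m.equals("server-1") ? null : "conn:" + m, 2);
        System.out.println(result.keySet()); // prints [server-0, server-2]
    }
}
```

In this sketch the healthy destinations are connected before any retry time is spent on the failed one. Whether the actual fix behaves this way can only be confirmed from the commit itself.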
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532560#comment-17532560 ]

ASF subversion and git services commented on GEODE-9484:

Commit 0ad4c8a9f66b1e9dd861efd7c76b28135a07cb36 in geode's branch refs/heads/develop from Mario Ivanac
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=0ad4c8a9f6 ]

Revert "GEODE-9484: Improve sending message to multy destinations (#7381)" (#7655)

This reverts commit 62cd12c7f0bbb3d092011555e714e57ce041791a.

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532128#comment-17532128 ]

Mario Ivanac commented on GEODE-9484:

It seems that in some scenarios an exception without a message is thrown, and because of that we hit an NPE when checking the content of the message.

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
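Editor's note: the failure mode described in the comment above is a general Java pitfall. `Throwable.getMessage()` returns null when an exception is created without a message, so calling a `String` method on the result throws a NullPointerException. The sketch below shows a null-safe check. It is not the actual Geode fix, and the class name, method name, and the "refused" fragment are hypothetical.

```java
import java.io.IOException;

/** Hypothetical illustration of a null-safe message check; not the code from the Geode fix. */
final class NullSafeMessageCheck {

    /** Returns true only when the throwable has a message and that message contains the fragment. */
    static boolean messageContains(Throwable t, String fragment) {
        String message = t.getMessage(); // null when the exception was created without a message
        return message != null && message.contains(fragment);
    }

    public static void main(String[] args) {
        Throwable withoutMessage = new IOException();
        Throwable withMessage = new IOException("Connection refused");

        // An unguarded withoutMessage.getMessage().contains("refused") would throw an NPE here.
        System.out.println(messageContains(withoutMessage, "refused")); // prints false
        System.out.println(messageContains(withMessage, "refused"));    // prints true
    }
}
```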
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532066#comment-17532066 ]

Ivan Godwin commented on GEODE-9484:

Noting possible issue found during internal testing. Will analyze further and update.

{code:java}
[fatal 2022/05/04 18:27:45.084 GMT gemfire-cluster-server-3 tid=0x6c] While pushing message :41000; callbackArg=null; processorId=0; op=CREATE; applied=false; directAck=true; posdup=false; hasDelta=false; hasOldValue=false; version={v1; rv1; time=1651688854339} FilterRoutingInfo(remote={gemfire-cluster-server-0(gemfire-cluster-server-0:1):41000=}); lastModified=1651688854339; key=0; newValue=(5 bytes); eventId=EventID[id=58 bytes;threadID=853971;sequenceID=0]; deserializationPolicy=LAZY; context=identity(gemfire-clients-564c765b59-vg4wb(SpringBasedClientCacheApplication:1:loner):48828:dd7d5290:SpringBasedClientCacheApplication,connection=1)> to recipients: :41000>
java.lang.NullPointerException
    at org.apache.geode.internal.tcp.TCPConduit.getFirstScanForConnection(TCPConduit.java:958)
    at org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:477)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:277)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:186)
    at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:541)
    at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:348)
    at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:293)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2067)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1994)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2031)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1088)
    at org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:556)
    at org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:267)
    at org.apache.geode.internal.cache.BucketRegion.basicPutPart2(BucketRegion.java:715)
    at org.apache.geode.internal.cache.map.RegionMapPut.doBeforeCompletionActions(RegionMapPut.java:282)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutAndDeliverEvent(AbstractRegionMapPut.java:301)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWithIndexUpdatingInProgress(AbstractRegionMapPut.java:308)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutIfPreconditionsSatisified(AbstractRegionMapPut.java:296)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnSynchronizedRegionEntry(AbstractRegionMapPut.java:282)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnRegionEntryInMap(AbstractRegionMapPut.java:273)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.addRegionEntryToMapAndDoPut(AbstractRegionMapPut.java:251)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutRetryingIfNeeded(AbstractRegionMapPut.java:216)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doWithIndexInUpdateMode(AbstractRegionMapPut.java:198)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPut(AbstractRegionMapPut.java:180)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWhileLockedForCacheModification(AbstractRegionMapPut.java:119)
    at org.apache.geode.internal.cache.map.RegionMapPut.runWhileLockedForCacheModification(RegionMapPut.java:161)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.put(AbstractRegionMapPut.java:169)
    at org.apache.geode.internal.cache.AbstractRegionMap.basicPut(AbstractRegionMap.java:2016)
    at org.apache.geode.internal.cache.BucketRegion.virtualPut(BucketRegion.java:544)
    at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5635)
    at org.apache.geode.internal.cache.PartitionedRegionDataStore.putLocally(PartitionedRegionDataStore.java:1193)
    at org.apache.geode.internal.cache.PartitionedRegion.putInBucket(PartitionedRegion.java:3033)
    at org.apache.geode.internal.cache.PartitionedRegion.virtualPut(PartitionedRegion.java:2248)
    at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:171)
    at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5628)
    at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5588)
    at org.apache.geode.internal.cache.LocalRegion.basicBridgePut(LocalRegion.java:5259)
    at
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531898#comment-17531898 ]

ASF subversion and git services commented on GEODE-9484:

Commit 758ef27045019cbe5654aed42b52898cc41ceaa8 in geode's branch refs/heads/revert-7381-newfeature1/GEODE-9484 from Mario Ivanac
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=758ef27045 ]

Revert "GEODE-9484: Improve sending message to multy destinations (#7381)"

This reverts commit 62cd12c7f0bbb3d092011555e714e57ce041791a.

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531896#comment-17531896 ]

Anilkumar Gingade commented on GEODE-9484:

[~mivanac] Can you please revert these changes? Then re-open the ticket and address the NPE with additional tests.

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531895#comment-17531895 ]

Anilkumar Gingade commented on GEODE-9484:

[~mivanac] The PR/fix for this issue is causing an NPE in internal tests. Here is the stack trace:

java.lang.NullPointerException
    at org.apache.geode.internal.tcp.TCPConduit.getFirstScanForConnection(TCPConduit.java:958)
    at org.apache.geode.distributed.internal.direct.DirectChannel.getConnections(DirectChannel.java:477)
    at org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:277)
    at org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:543)
    at org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:348)
    at org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:293)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2067)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1994)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2031)
    at org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1088)
    at org.apache.geode.internal.cache.CreateRegionProcessor.initializeRegion(CreateRegionProcessor.java:115)
    at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1176)
    at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1107)
    at org.apache.geode.internal.cache.HARegion.initialize(HARegion.java:323)
    at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3103)
    at org.apache.geode.internal.cache.HARegion.getInstance(HARegion.java:246)
    at org.apache.geode.internal.cache.ha.HARegionQueue.<init>(HARegionQueue.java:365)
    at org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.<init>(HARegionQueue.java:2233)
    at org.apache.geode.internal.cache.ha.HARegionQueue$DurableHARegionQueue.<init>(HARegionQueue.java:2478)
    at org.apache.geode.internal.cache.ha.HARegionQueue.getHARegionQueueInstance(HARegionQueue.java:2015)
    at org.apache.geode.internal.cache.tier.sockets.MessageDispatcher.getMessageQueue(MessageDispatcher.java:166)
    at org.apache.geode.internal.cache.tier.sockets.MessageDispatcher.<init>(MessageDispatcher.java:146)
    at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.createMessageDispatcher(CacheClientProxy.java:1685)
    at org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.initializeMessageDispatcher(CacheClientProxy.java:1677)
    at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.initializeProxy(CacheClientNotifier.java:502)
    at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.registerClientInternal(CacheClientNotifier.java:406)
    at org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.registerClient(CacheClientNotifier.java:221)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$ClientQueueInitializerTask.run(AcceptorImpl.java:1896)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeClientQueueInitializerThreadPool$1(AcceptorImpl.java:678)
    at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
    at java.lang.Thread.run(Thread.java:750)

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
[jira] [Commented] (GEODE-9484) Data inconsistency in replicated region with 3 or more servers, and one server is down
[ https://issues.apache.org/jira/browse/GEODE-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530825#comment-17530825 ]

ASF subversion and git services commented on GEODE-9484:

Commit 62cd12c7f0bbb3d092011555e714e57ce041791a in geode's branch refs/heads/develop from Mario Ivanac
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=62cd12c7f0 ]

GEODE-9484: Improve sending message to multy destinations (#7381)

* GEODE-9484: New solution to first try only one attempt to create all connections

> Data inconsistency in replicated region with 3 or more servers, and one server is down
> ---
>
>                 Key: GEODE-9484
>                 URL: https://issues.apache.org/jira/browse/GEODE-9484
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server, regions
>    Affects Versions: 1.13.0
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available