Eric Shu created GEODE-9311:
-------------------------------
Summary: It is possible that JedisCluster client may still retry
if the Geode Redis server it is connected to shuts down
Key: GEODE-9311
URL: https://issues.apache.org/jira/browse/GEODE-9311
Project: Geode
Issue Type: Bug
Components: redis
Reporter: Eric Shu
Even after the issue (GEODE-9310) is addressed, the retry issue may still occur
if the JedisCluster client is connected to the node that is being shut down.
Here is a test run result:
vm2 gets the command:
[vm2] [warn 2021/05/25 16:01:41.479 PDT server-2 <GeodeRedisServer-Command-1>
tid=0x64] Executing Redis command: ZREM key member1-212 member1-211 member1-214
member1-213 member1-210 member1-209 member1-208 member1-205 member1-204
*** command is executed on the primary ***
[vm2] [warn 2021/05/25 16:01:41.503 PDT server-2 <Function Execution
Processor2> tid=0x51] bucket region is
BucketRegion[path='/__PR/_B__REDIS__DATA_123;serial=200;primary=true] key key
[vm2] java.lang.Exception
[vm2] at org.apache.geode.internal.cache.BucketRegion.virtualPut(BucketRegion.java:533)
[vm2] at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5586)
[vm2] at org.apache.geode.internal.cache.PartitionedRegionDataStore.putLocally(PartitionedRegionDataStore.java:1213)
[vm2] at org.apache.geode.internal.cache.PartitionedRegion.putInBucket(PartitionedRegion.java:3024)
[vm2] at org.apache.geode.internal.cache.PartitionedRegion.virtualPut(PartitionedRegion.java:2236)
[vm2] at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5586)
The change is then distributed to the replica (vm1):
[vm1] [warn 2021/05/25 16:01:41.518 PDT server-1 <P2P message reader for
192.168.0.14(server-2:83136)<v2>:41002 shared ordered sender uid=7 local
port=58643 remote port=53459> tid=0x59] membersRemoveAll invoked
[vm1] java.lang.Exception
[vm1] at org.apache.geode.redis.internal.data.RedisSortedSet.membersRemoveAll(RedisSortedSet.java:175)
[vm1] at org.apache.geode.redis.internal.data.RedisSortedSet.applyDelta(RedisSortedSet.java:89)
[vm1] at org.apache.geode.redis.internal.data.AbstractRedisData.fromDelta(AbstractRedisData.java:193)
[vm1] at org.apache.geode.internal.cache.EntryEventImpl.processDeltaBytes(EntryEventImpl.java:1841)
[vm1] at org.apache.geode.internal.cache.EntryEventImpl.setNewValueInRegion(EntryEventImpl.java:1696)
[vm1] at org.apache.geode.internal.cache.EntryEventImpl.putExistingEntry(EntryEventImpl.java:1643)
[vm1] at org.apache.geode.internal.cache.map.RegionMapPut.updateEntry(RegionMapPut.java:485)
[vm1] at org.apache.geode.internal.cache.map.RegionMapPut.createOrUpdateEntry(RegionMapPut.java:256)
vm2 is bounced:
[vm2] [info 2021/05/25 16:01:41.526 PDT server-2 <RMI TCP
Connection(1)-192.168.0.14> tid=0x14] Got result: 83136
[vm2] from org.apache.geode.test.dunit.VM$$Lambda$370/1141741369.call with 0
args on object: org.apache.geode.test.dunit.VM$$Lambda$370/1141741369@68ecd55a
(took 0 ms)
[info 2021/05/25 16:01:41.527 PDT <pool-1-thread-2> tid=0x36] Bouncing 2 old
pid is 83136 and version is 10240.0.0
The JedisCluster client did not get a response back (possibly because it
detected that the connection was gone), so it retried, and the test failed with
the following:
redis.clients.jedis.exceptions.JedisClusterMaxAttemptsException: No more cluster attempts left.
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:84)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:25)
at redis.clients.jedis.JedisCluster.zscore(JedisCluster.java:1075)
at org.apache.geode.redis.internal.executor.sortedset.ZRemDUnitTest.verifyDataNotExist(ZRemDUnitTest.java:194)
at org.apache.geode.redis.internal.executor.sortedset.ZRemDUnitTest.zRemCanRemoveMembersFromSortedSetDuringPrimaryIsCrashed(ZRemDUnitTest.java:180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:138)
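
For illustration only, the sequence in the logs can be modeled with a small
self-contained sketch. This is not Jedis or Geode code: Node, zrem,
shutDownAfterNextCommand and zremWithRetries are hypothetical names, and the
bounded loop is only a rough stand-in for the client's retry behavior. It
assumes the server applies the command, then goes down before replying, so
every client retry fails until the attempts run out.

```java
import java.util.Set;
import java.util.TreeSet;

public class RetrySketch {
    /** Hypothetical stand-in for a server node holding the members of one sorted set. */
    static class Node {
        final Set<String> members = new TreeSet<>();
        boolean up = true;
        boolean shutDownAfterNextCommand = false;

        /** Removes the given members and returns how many were actually removed. */
        long zrem(String... toRemove) {
            if (!up) {
                throw new IllegalStateException("connection refused");
            }
            long removed = 0;
            for (String m : toRemove) {
                if (members.remove(m)) {
                    removed++;
                }
            }
            if (shutDownAfterNextCommand) {
                // Assumption: the command is applied, then the node goes away
                // before the reply reaches the client.
                up = false;
                throw new IllegalStateException("connection lost before reply");
            }
            return removed;
        }
    }

    /** Simplified bounded retry loop (an assumption, not the real Jedis logic). */
    static long zremWithRetries(Node node, int maxAttempts, String... members) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return node.zrem(members);
            } catch (IllegalStateException e) {
                // The client cannot tell whether the command was applied, so it retries.
            }
        }
        throw new RuntimeException("No more attempts left.");
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.members.add("member1");
        node.members.add("member2");
        node.shutDownAfterNextCommand = true;
        try {
            zremWithRetries(node, 5, "member1");
        } catch (RuntimeException e) {
            System.out.println("client saw: " + e.getMessage());
        }
        // The removal took effect on the server even though the client saw a failure.
        System.out.println("member1 still present: " + node.members.contains("member1"));
    }
}
```

In this model the client ends with an error although the removal has already
been applied, which is the kind of ambiguity a retry cannot resolve on its own.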