[jira] [Updated] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server-to-bucket map size

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9617:

Labels: needsTriage  (was: )

> CI Failure: PartitionedRegionSingleHopDUnitTest fails with 
> ConditionTimeoutException waiting for server-to-bucket map size
> --
>
> Key: GEODE-9617
> URL: https://issues.apache.org/jira/browse/GEODE-9617
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testClientMetadataForPersistentPrs FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService, 
> org.apache.geode.cache.client.internal.ClientMetadataService, org.apache.geode.cache.Region
>  
> Expecting actual not to be null within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971)
> Caused by:
> java.lang.AssertionError: 
> Expecting actual not to be null
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976)
> {noformat}
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testMetadataServiceCallAccuracy_FromGetOp FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e> within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394)
> Caused by:
> org.junit.ComparisonFailure: 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e>
> at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395)
> {noformat}
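For context, the assertions above follow the standard Awaitility untilAsserted 
pattern. A minimal sketch of that pattern is below; the ClientMetadataServiceLike 
interface, its getClientMetadata accessor, and the region path are placeholders 
invented for illustration, not the DUnit test's actual code.

{code:java}
import static org.assertj.core.api.Assertions.assertThat;
import static org.awaitility.Awaitility.await;

import java.util.Map;
import java.util.concurrent.TimeUnit;

class SingleHopMetadataAwaitSketch {

  // Hypothetical stand-in for the metadata lookup the DUnit test polls on.
  interface ClientMetadataServiceLike {
    Map<Integer, ?> getClientMetadata(String regionPath);
  }

  void awaitServerToBucketMetadata(ClientMetadataServiceLike cms) {
    // Same shape as the failing assertion: poll for up to 5 minutes, after
    // which Awaitility raises ConditionTimeoutException wrapping the
    // underlying AssertionError ("Expecting actual not to be null").
    await().atMost(5, TimeUnit.MINUTES).untilAsserted(() -> {
      Map<Integer, ?> serverToBucketMap = cms.getClientMetadata("/region");
      assertThat(serverToBucketMap).isNotNull();
    });
  }
}
{code}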



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server-to-bucket map size

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9617:

Affects Version/s: 1.15.0

> CI Failure: PartitionedRegionSingleHopDUnitTest fails with 
> ConditionTimeoutException waiting for server-to-bucket map size
> --
>
> Key: GEODE-9617
> URL: https://issues.apache.org/jira/browse/GEODE-9617
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Priority: Major
>
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testClientMetadataForPersistentPrs FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService, 
> org.apache.geode.cache.client.internal.ClientMetadataService, org.apache.geode.cache.Region
>  
> Expecting actual not to be null within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971)
> Caused by:
> java.lang.AssertionError: 
> Expecting actual not to be null
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976)
> {noformat}
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testMetadataServiceCallAccuracy_FromGetOp FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e> within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394)
> Caused by:
> org.junit.ComparisonFailure: 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e>
> at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols resolved GEODE-9662.
-
Fix Version/s: 1.15.0
   Resolution: Fixed

> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Assignee: Owen Nichols
>Priority: Major
>  Labels: flaky-test, pull-request-available
> Fix For: 1.15.0
>
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424771#comment-17424771
 ] 

ASF subversion and git services commented on GEODE-9662:


Commit d5995b501931de01a95784f4fe2d69c269612ae8 in geode's branch 
refs/heads/develop from Owen Nichols
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=d5995b5 ]

GEODE-9662: increase acceptance test timeout (#6940)



> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Assignee: Owen Nichols
>Priority: Major
>  Labels: flaky-test, pull-request-available
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9675) CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9675:

Affects Version/s: 1.15.0

> CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> -
>
> Key: GEODE-9675
> URL: https://issues.apache.org/jira/browse/GEODE-9675
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1983
> {code:java}
> ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> org.apache.geode.SystemConnectException: Problem starting up membership 
> services
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:186)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:466)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:499)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:328)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:757)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:133)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3013)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:283)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:209)
> at 
> org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:159)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:180)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:256)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManagerDUnitTest.testConnectAfterBeingShunned(ClusterDistributionManagerDUnitTest.java:170)
> Caused by:
> 
> org.apache.geode.distributed.internal.membership.api.MemberStartupException: 
> unable to create jgroups channel
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:401)
> at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:203)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1642)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
> ... 13 more
> Caused by:
> java.lang.Exception: failed to open a port in range 41003-41003
> at 
> org.jgroups.protocols.UDP.createMulticastSocketWithBindPort(UDP.java:503)
> at org.jgroups.protocols.UDP.createSockets(UDP.java:348)
> at org.jgroups.protocols.UDP.start(UDP.java:266)
> at 
> org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:966)
> at org.jgroups.JChannel.startStack(JChannel.java:889)
> at org.jgroups.JChannel._preConnect(JChannel.java:553)
> at org.jgroups.JChannel.connect(JChannel.java:288)
> at org.jgroups.JChannel.connect(JChannel.java:279)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:397)
> ... 16 more
> {code}
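The root cause above is JGroups being unable to bind its UDP socket because the 
single membership port it was handed (41003) was already in use on the test 
host. The sketch below is not Geode or JGroups code; it only illustrates, under 
that assumption, what a free-port check for a UDP port looks like in plain Java.

{code:java}
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketException;

public final class UdpPortProbe {

  // Returns true if nothing is currently bound to the given UDP port.
  public static boolean isUdpPortFree(int port) {
    try (DatagramSocket socket = new DatagramSocket(null)) {
      socket.setReuseAddress(false);
      socket.bind(new InetSocketAddress(port)); // throws if already bound
      return true;
    } catch (SocketException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The failed run above had only port 41003 available to JGroups.
    System.out.println("41003 free? " + isUdpPortFree(41003));
  }
}
{code}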



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9677) CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9677:

Affects Version/s: 1.15.0

> CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> 
>
> Key: GEODE-9677
> URL: https://issues.apache.org/jira/browse/GEODE-9677
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11846626
> {code:java}
> GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> java.lang.AssertionError: expected:<0> but was:<1>
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.geode.internal.cache.GIIDeltaDUnitTest.testExpiredTombstoneSkippedGC(GIIDeltaDUnitTest.java:1534)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9672) Disable native redis multibulk test

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9672:

Labels: blocks-1.14.1 pull-request-available  (was: pull-request-available)

> Disable native redis multibulk test
> ---
>
> Key: GEODE-9672
> URL: https://issues.apache.org/jira/browse/GEODE-9672
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Affects Versions: 1.14.1, 1.15.0
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: blocks-1.14.1, pull-request-available
> Fix For: 1.15.0
>
>
> In Redis 5.0.14 a new AUTH-related test was introduced that is failing when 
> run as part of the pipeline. The change here is simply to disable the test. 
> In the future we will evaluate whether we can re-enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9672) Disable native redis multibulk test

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9672:

Fix Version/s: 1.15.0

> Disable native redis multibulk test
> ---
>
> Key: GEODE-9672
> URL: https://issues.apache.org/jira/browse/GEODE-9672
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Affects Versions: 1.14.1, 1.15.0
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> In Redis 5.0.14 a new AUTH-related test was introduced that is failing when 
> run as part of the pipeline. The change here is simply to disable the test. 
> In the future we will evaluate whether we can re-enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9636) CI failure: NoClassDefFoundError in lucene examples

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9636:

Labels: GeodeOperationAPI pull-request-available  (was: GeodeOperationAPI 
blocks-1.15.0 pull-request-available)

> CI failure: NoClassDefFoundError in lucene examples
> ---
>
> Key: GEODE-9636
> URL: https://issues.apache.org/jira/browse/GEODE-9636
> Project: Geode
>  Issue Type: Bug
>  Components: lucene
>Reporter: Darrel Schneider
>Assignee: Xiaojian Zhou
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
>
> The lucene examples have started failing (3 runs in a row) with the following 
> exceptions:
> org.apache.geode_examples.luceneSpatial.TrainStopSerializerTest > 
> serializerReturnsSingleDocument FAILED
> java.lang.NoClassDefFoundError at TrainStopSerializerTest.java:30
> Caused by: java.lang.ClassNotFoundException at 
> TrainStopSerializerTest.java:30
> org.apache.geode_examples.luceneSpatial.SpatialHelperTest > 
> queryFindsADocumentThatWasAdded FAILED
> java.lang.NoClassDefFoundError at SpatialHelperTest.java:45
> The first failed run was: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-examples/jobs/test-examples/builds/243



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9630) Gateway sender has public setter methods that should not be exposed

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9630:

Labels:   (was: needsTriage)

> Gateway sender has public setter methods that should not be exposed
> ---
>
> Key: GEODE-9630
> URL: https://issues.apache.org/jira/browse/GEODE-9630
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Udo Kohlmeyer
>Priority: Blocker
>
> Looking at the GatewaySender interface I noticed there are numerous public 
> setter methods. Geode should not allow GatewaySender behavior to be changed 
> directly without going through a proper process.
> This is largely to avoid introducing side effects into the system. A prime 
> example is the ability to call `setGroupTransactionEvents`, which from what I 
> understand should NEVER be changed on just one server instead of 
> cluster-wide. By writing a function that changes the setting on only one 
> server, a user risks the whole system behaving incorrectly, causing failures 
> that would be close to impossible to track down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9662:

Labels: flaky-test pull-request-available  (was: flaky-test needsTriage 
pull-request-available)

> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Assignee: Owen Nichols
>Priority: Major
>  Labels: flaky-test, pull-request-available
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9672) Disable native redis multibulk test

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9672:

Affects Version/s: 1.15.0
   1.14.1

> Disable native redis multibulk test
> ---
>
> Key: GEODE-9672
> URL: https://issues.apache.org/jira/browse/GEODE-9672
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Affects Versions: 1.14.1, 1.15.0
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
>
> In Redis 5.0.14 a new AUTH-related test was introduced that is failing when 
> run as part of the pipeline. The change here is simply to disable the test. 
> In the future we will evaluate whether we can re-enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9662:
--
Labels: flaky-test needsTriage pull-request-available  (was: flaky-test 
needsTriage)

> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Assignee: Owen Nichols
>Priority: Major
>  Labels: flaky-test, needsTriage, pull-request-available
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9630) Gateway sender has public setter methods that should not be exposed

2021-10-05 Thread Udo Kohlmeyer (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424756#comment-17424756
 ] 

Udo Kohlmeyer commented on GEODE-9630:
--

[~mivanac], Would this be something that you could look at and resolve?

> Gateway sender has public setter methods that should not be exposed
> ---
>
> Key: GEODE-9630
> URL: https://issues.apache.org/jira/browse/GEODE-9630
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Udo Kohlmeyer
>Priority: Blocker
>  Labels: needsTriage
>
> Looking at the GatewaySender interface I noticed there are numerous public 
> setter methods. Geode should not allow GatewaySender behavior to be changed 
> directly without going through a proper process.
> This is largely to avoid introducing side effects into the system. A prime 
> example is the ability to call `setGroupTransactionEvents`, which from what I 
> understand should NEVER be changed on just one server instead of 
> cluster-wide. By writing a function that changes the setting on only one 
> server, a user risks the whole system behaving incorrectly, causing failures 
> that would be close to impossible to track down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9663) throw and handle AuthenticationExpiredException at login time

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424755#comment-17424755
 ] 

ASF subversion and git services commented on GEODE-9663:


Commit c5e8031f314c3d58df0ba7bbdf0d68704bdb in geode's branch 
refs/heads/develop from Jinmei Liao
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=c5e ]

GEODE-9663: throw and handle AuthenticationExpiredException at login time 
(#6927)

* GEODE-9663: throw and handle AuthenticationExpiredException at login time

> throw and handle AuthenticationExpiredException at login time
> -
>
> Key: GEODE-9663
> URL: https://issues.apache.org/jira/browse/GEODE-9663
> Project: Geode
>  Issue Type: Sub-task
>Reporter: Jinmei Liao
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
>
> There is a time gap between when credentials are gathered at the client and 
> when they are examined on the server. If the credentials expire within that 
> gap (a rare case, but it happens a lot during stress tests; see the 
> `AuthExpirationMultiServerDunitTest.consecutivePut` test), client operations 
> might fail (even after the user is authenticated, a put/get might need to 
> open new connections and require authentication again). So the "authenticate" 
> method should also be allowed to throw AuthenticationExpiredException, to 
> give the client a chance to retry (only once).
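As a rough illustration of the change described in this ticket, the sketch 
below shows a SecurityManager whose authenticate method throws the new 
AuthenticationExpiredException at login time. The "security-token" property 
name and the expiry check are placeholders invented for this example; only the 
SecurityManager.authenticate signature is taken from the Geode security API.

{code:java}
import java.util.Properties;

import org.apache.geode.security.AuthenticationExpiredException;
import org.apache.geode.security.AuthenticationFailedException;
import org.apache.geode.security.SecurityManager;

public class ExpiringTokenSecurityManager implements SecurityManager {

  @Override
  public Object authenticate(Properties credentials) throws AuthenticationFailedException {
    // "security-token" is a placeholder credential property for this sketch.
    String token = credentials.getProperty("security-token");
    if (token == null) {
      throw new AuthenticationFailedException("missing token");
    }
    if (isExpired(token)) {
      // The change in this ticket lets expiry surface at login time so the
      // client can refresh its credentials and retry once.
      throw new AuthenticationExpiredException("token expired");
    }
    return token; // the authenticated principal
  }

  private boolean isExpired(String token) {
    return false; // placeholder: a real implementation would check the token's expiry
  }
}
{code}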



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9630) Gateway sender has public setter methods that should not be exposed

2021-10-05 Thread Owen Nichols (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424751#comment-17424751
 ] 

Owen Nichols commented on GEODE-9630:
-

1.14.0 had 0 setters in the GatewaySender interface

1.15.0 currently has 5 (was 6 but GEODE-9629 removed one of them):
  void setAlertThreshold(int alertThreshold);
  void setBatchSize(int batchSize);
  void setBatchTimeInterval(int batchTimeInterval);
  void setGroupTransactionEvents(boolean groupTransactionEvents);
  void setGatewayEventFilters(List<GatewayEventFilter> filters);
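A hedged sketch of the risk described in the issue follows: because 
setGroupTransactionEvents is currently on the public interface, a user-deployed 
function can change it on a single member, leaving the cluster inconsistently 
configured. The sender id and function id are placeholders; the sketch assumes 
the 1.15.0 snapshot of the interface listed above, not a supported API.

{code:java}
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.wan.GatewaySender;

public class FlipGroupTransactionEventsFunction implements Function<Void> {

  @Override
  public void execute(FunctionContext<Void> context) {
    Cache cache = CacheFactory.getAnyInstance();
    GatewaySender sender = cache.getGatewaySender("sender-id"); // placeholder id
    if (sender != null) {
      // Only the member this function happens to execute on is changed,
      // leaving the rest of the cluster with the old setting.
      sender.setGroupTransactionEvents(true);
    }
  }

  @Override
  public boolean hasResult() {
    return false;
  }

  @Override
  public String getId() {
    return "flip-group-transaction-events";
  }
}
{code}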

> Gateway sender has public setter methods that should not be exposed
> ---
>
> Key: GEODE-9630
> URL: https://issues.apache.org/jira/browse/GEODE-9630
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Udo Kohlmeyer
>Priority: Blocker
>  Labels: needsTriage
>
> Looking at the GatewaySender interface I noticed there are numerous public 
> setter methods. Geode should not allow GatewaySender behavior to be changed 
> directly without going through a proper process.
> This is largely to avoid introducing side effects into the system. A prime 
> example is the ability to call `setGroupTransactionEvents`, which from what I 
> understand should NEVER be changed on just one server instead of 
> cluster-wide. By writing a function that changes the setting on only one 
> server, a user risks the whole system behaving incorrectly, causing failures 
> that would be close to impossible to track down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (GEODE-9629) GatewaySender.setRetriesToGetTransactionEventsFromQueue on public API

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols resolved GEODE-9629.
-
Fix Version/s: 1.15.0
   Resolution: Fixed

> GatewaySender.setRetriesToGetTransactionEventsFromQueue on public API
> -
>
> Key: GEODE-9629
> URL: https://issues.apache.org/jira/browse/GEODE-9629
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Udo Kohlmeyer
>Priority: Blocker
>  Labels: needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> GatewaySender.setRetriesToGetTransactionEventsFromQueue is defined on the 
> public API. 
> The problem is that Geode should not allow GatewaySender settings to be 
> modified so simply. Without proper process/management around changing these 
> properties, it would be too easy to introduce side effects by changing 
> settings on the GatewaySender.
> We (Geode) should NOT allow for the direct manipulation of configuration of 
> ANY component without it having gone through a controlled process, to ensure 
> that there aren't any side effects resulting from the change. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9672) Disable native redis multibulk test

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424746#comment-17424746
 ] 

Geode Integration commented on GEODE-9672:
--

Seen on support/1.14 in [redis-test-openjdk11 
#51|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-14-main/jobs/redis-test-openjdk11/builds/51]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-support-1-14-main/1.14.1-build.0866/test-results/redisAPITest/1633458255/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-14-main/1.14.1-build.0866/test-artifacts/1633458255/redistests-openjdk11-1.14.1-build.0866.tgz].

> Disable native redis multibulk test
> ---
>
> Key: GEODE-9672
> URL: https://issues.apache.org/jira/browse/GEODE-9672
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
>
> In Redis 5.0.14 a new AUTH-related test was introduced that is failing when 
> run as part of the pipeline. The change here is simply to disable the test. 
> In the future we will evaluate whether we can re-enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9672) Disable native redis multibulk test

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424745#comment-17424745
 ] 

Geode Integration commented on GEODE-9672:
--

Seen on support/1.14 in [redis-test-openjdk11 
#52|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-14-main/jobs/redis-test-openjdk11/builds/52]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-support-1-14-main/1.14.1-build.0866/test-results/redisAPITest/1633473188/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-14-main/1.14.1-build.0866/test-artifacts/1633473188/redistests-openjdk11-1.14.1-build.0866.tgz].

> Disable native redis multibulk test
> ---
>
> Key: GEODE-9672
> URL: https://issues.apache.org/jira/browse/GEODE-9672
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
>
> In Redis 5.0.14 a new AUTH-related test was introduced that is failing when 
> run as part of the pipeline. The change here is simply to disable the test. 
> In the future we will evaluate whether we can re-enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424744#comment-17424744
 ] 

Geode Integration commented on GEODE-9662:
--

Seen in [acceptance-test-openjdk8 
#239|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/239]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0543/test-results/acceptanceTest/1633472320/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0543/test-artifacts/1633472320/acceptancetestfiles-openjdk8-1.15.0-build.0543.tgz].

> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Priority: Major
>  Labels: flaky-test, needsTriage
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9662) CI Failure: acceptance-test-openjdk8 timed out

2021-10-05 Thread Owen Nichols (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols reassigned GEODE-9662:
---

Assignee: Owen Nichols

> CI Failure: acceptance-test-openjdk8 timed out
> --
>
> Key: GEODE-9662
> URL: https://issues.apache.org/jira/browse/GEODE-9662
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Darrel Schneider
>Assignee: Owen Nichols
>Priority: Major
>  Labels: flaky-test, needsTriage
>
> this run of acceptance-test-openjdk8 timed out with a total duration of 3h5m. 
> The next one passed with a total duration of 3h1m. Is it possible that we 
> need to extend how much time we give acceptance tests to run? 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/229



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views are independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator, Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
becoming a coordinator. With that mode, it will be possible for an 
administrator/operator to avoid many of the problematic scenarios mentioned 
above, while retaining the ability to start a first locator which is allowed to 
become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, an administrator/operator starts the first 
locator in "seed" mode. After that the operator starts all subsequent locators 
in "join only" mode. If network partitions occur during startup, those newly 
started nodes will exit with a failure status, but will not become coordinators.

To add a locator to a running cluster, an operator starts it in "join only" 
mode. The new member will similarly either join with an existing coordinator or 
exit with a failure status, thereby avoiding split brains.

When an operator restarts a locator, e.g. during a rolling upgrade, they will 
restart it in "join only" mode. If a network partition is encountered, or the 
{{.dat}} file is missing or stale, the new locator will exit with a failure 
status and split brain will be avoided.
h2. FAQ

Q: What should happen if a locator is 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views are independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator, Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
becoming a coordinator. With that mode, it will be possible for an 
administrator/operator to avoid many of the problematic scenarios mentioned 
above, while retaining the ability to start a first locator which is allowed to 
become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, an administrator/operator starts the first 
locator in "seed" mode. After that the operator starts all subsequent locators 
in "join only" mode. If network partitions occur during startup, those newly 
started nodes will exit with a failure status, but will not become coordinators.

To add a locator to a running cluster, an operator starts it in "join only" 
mode. The new member will similarly either join with an existing coordinator or 
exit with a failure status, thereby avoiding split brains.

When an operator restarts a locator, e.g. during a rolling upgrade, they will 
restart it in "join only" mode. If a network partition is encountered, or the 
{{.dat}} file is missing or stale, the new locator will exit with a failure 
status and split brain will be avoided.
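Since the LocatorLauncher parameter is still TBD, the sketch below only 
illustrates how an operator might start a non-seed locator once this ticket is 
complete. The "geode.locator.join-only" property is an invented placeholder, 
not an existing flag; the Builder calls are the existing LocatorLauncher API.

{code:java}
import org.apache.geode.distributed.LocatorLauncher;

public class StartJoinOnlyLocator {

  public static void main(String[] args) {
    // Invented placeholder for the still-to-be-determined "join only" switch;
    // not an existing Geode property.
    System.setProperty("geode.locator.join-only", "true");

    LocatorLauncher locator = new LocatorLauncher.Builder()
        .setMemberName("locator2")
        .setPort(10334)
        .build();

    // Under the proposed behavior, start() either joins an existing
    // coordinator or fails fast instead of becoming a coordinator itself.
    locator.start();
  }
}
{code}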

  was:
The issues described here are present in 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views are independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator, Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD {{LocatorLauncher}} startup parameter) that prevents that locator from 
becoming a coordinator. With that mode, it will be possible for an 
administrator to avoid many of the problematic scenarios mentioned above, while 
retaining the ability to start a first locator which is allowed to become a 
coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, the first locator is started in "seed" mode. 
After that all subsequent locators are started in "join only" mode. If network 
partitions occur, the newly started nodes will exit with a failure status, but 
will not become coordinators.

To add a locator to a running cluster, it will be started in "join only" mode. 
It will similarly either join with an existing coordinator or exit with a 
failure status, thereby avoiding split brains.

When restarting a locator, e.g. during a rolling upgrade, it will be restarted 
in "join only" mode. If a network partition is encountered, or the {{.dat}} 
file is missing or stale, the locator will exit with a failure status and split 
brain will be avoided.

  was:
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption 

[jira] [Updated] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9666:
--
Labels: needsTriage pull-request-available  (was: needsTriage)

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
>   at 
> 

[jira] [Commented] (GEODE-9664) Two different clients with the same durable id will both connect to the servers and receive messages

2021-10-05 Thread Barrett Oglesby (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424732#comment-17424732
 ] 

Barrett Oglesby commented on GEODE-9664:


I checked the behavior of the client when there are no servers. I was thinking 
that these durable client scenarios should behave similarly to the no-servers 
scenario.

When the Pool is created, the QueueManagerImpl.initializeConnections attempts 
to create the connections. If there are no servers, the ConnectionList's 
primaryDiscoveryException is initialized like:
{noformat}
List<ServerLocation> servers =
findQueueServers(excludedServers, queuesNeeded, true, false, null);
if (servers == null || servers.isEmpty()) {
  scheduleRedundancySatisfierIfNeeded(redundancyRetryInterval);
  synchronized (lock) {
queueConnections = queueConnections.setPrimaryDiscoveryFailed(null);
lock.notifyAll();
  }
  return;
}
{noformat}
And the empty ConnectionList is created here:
{noformat}
java.lang.Exception: Stack trace
at java.lang.Thread.dumpStack(Thread.java:1333)
at 
org.apache.geode.cache.client.internal.QueueManagerImpl$ConnectionList.<init>(QueueManagerImpl.java:1318)
at 
org.apache.geode.cache.client.internal.QueueManagerImpl$ConnectionList.setPrimaryDiscoveryFailed(QueueManagerImpl.java:1337)
at 
org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:439)
at 
org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:293)
at 
org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:359)
at 
org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:183)
at 
org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:169)
at 
org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:378)
{noformat}
Then when Region.registerInterestForAllKeys is called, it invokes 
ServerRegionProxy.registerInterest which:

- adds the key to the RegisterInterestTracker
- executes the RegisterInterestOp
- removes the key from the RegisterInterestTracker if the RegisterInterestOp 
fails

Here is the code in Region.registerInterestForAllKeys that does the above steps:
{noformat}
try {
  rit.addSingleInterest(region, key, interestType, policy, isDurable,
  receiveUpdatesAsInvalidates);
  result = RegisterInterestOp.execute(pool, regionName, key, interestType, 
policy,
  isDurable, receiveUpdatesAsInvalidates, regionDataPolicy);
  finished = true;
  return result;
} finally {
  if (!finished) {
rit.removeSingleInterest(region, key, interestType, isDurable,
receiveUpdatesAsInvalidates);
  }
}
{noformat}
The Connections are retrieved in QueueManagerImpl.getAllConnections. If there 
are none, a NoSubscriptionServersAvailableException wrapping the 
primaryDiscoveryException is thrown:
{noformat}
Exception in thread "main" 
org.apache.geode.cache.NoSubscriptionServersAvailableException: 
org.apache.geode.cache.NoSubscriptionServersAvailableException: Primary 
discovery failed.
at 
org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:191)
at 
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:428)
at 
org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:875)
at 
org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
at 
org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:364)
at 
org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3815)
at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3911)
at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3890)
at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3885)
at 
org.apache.geode.cache.Region.registerInterestForAllKeys(Region.java:1709)
{noformat}
Here is logging that shows all this behavior:
{noformat}
[warn 2021/10/04 10:58:22.184 PDT client-a-1  tid=0x1] XXX 
ConnectionList. 
primaryDiscoveryException=org.apache.geode.cache.NoSubscriptionServersAvailableException:
 Primary discovery failed.

[warn 2021/10/04 10:58:22.238 PDT client-a-1  tid=0x1] XXX 
RegisterInterestTracker.addSingleInterest key=.*; rieInterests={.*=KEYS_VALUES}

[warn 2021/10/04 10:58:22.238 PDT client-a-1  tid=0x1] XXX 
ServerRegionProxy.registerInterest about to execute RegisterInterestOp

[warn 2021/10/04 10:58:22.244 PDT client-a-1  tid=0x1] XXX 
QueueManagerImpl.getAllConnections about to throw 
exception=org.apache.geode.cache.NoSubscriptionServersAvailableException: 
org.apache.geode.cache.NoSubscriptionServersAvailableException: Primary 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Description: 
The issues described here are present in all versions of Geode (this is not new 
to 1.15.0)…

Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views is independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:
 # it first tries to find the coordinator through one of the (other) configured 
locators
 # if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:
 * if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
 ** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
 ** maybe there is a network partition between the starting locators so they 
can't communicate
 ** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
 * if a cluster is already running and the administrator/operator wants to 
scale it up by starting/adding a new locator, Geode is susceptible to that same 
network partition issue
 * if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
 ** if the "view persistence" {{.dat}} file is missing or deleted
 ** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via a 
TBD configuration parameter or {{LocatorLauncher}} startup parameter) that 
prevents that locator from becoming a coordinator. With that mode, it will be 
possible for an administrator to avoid many of the problematic scenarios 
mentioned above, while retaining the ability to start a first locator which is 
allowed to become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, the first locator is started in "seed" mode. 
After that all subsequent locators are started in "join only" mode. If network 
partitions occur, the newly started nodes will exit with a failure status, but 
will not become coordinators.

To add a locator to a running cluster, it will be started in "join only" mode. 
It will similarly either join with an existing coordinator or exit with a 
failure status, thereby avoiding split brains.

When restarting a locator, e.g. during a rolling upgrade, it will be restarted 
in "join only" mode. If a network partition is encountered, or the {{.dat}} 
file is missing or stale, the locator will exit with a failure status and split 
brain will be avoided.

  was:
Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever 

[jira] [Updated] (GEODE-9680) Newly Started/Restarted Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-9680:

Summary: Newly Started/Restarted Locators are Susceptible to Split-Brains  
(was: Newly Started Locators are Susceptible to Split-Brains)

> Newly Started/Restarted Locators are Susceptible to Split-Brains
> 
>
> Key: GEODE-9680
> URL: https://issues.apache.org/jira/browse/GEODE-9680
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Priority: Major
>  Labels: needsTriage
>
> Geode is built on the assumption that views progress linearly in a sequence. 
> If that sequence ever forks into two or more parallel lines then we have a 
> "split brain".
> In a split brain condition, each of the parallel views are independent. It's 
> as if you have more than one system running concurrently. It's possible e.g. 
> for some clients to connect to members of one view and other clients to 
> connect to members of another view. Updates to members in one view are not 
> seen by members of a parallel view.
> Geode views are produced by a coordinator. As long as only a single 
> coordinator is running, there is no possibility of a split brain. Split brain 
> arises when more than one coordinator is producing views at the same time.
> Each Geode member (peer) is started with the {{locators}} configuration 
> parameter. That parameter specifies locator(s) to use to find the (already 
> running!) coordinator (member) to join with.
> When a locator (member) starts, it goes through this sequence to find the 
> coordinator:
> # it first tries to find the coordinator through one of the (other) 
> configured locators
> # if it can't contact any of those, it tries contacting non-locator (cache 
> server) members it has retrieved from the "view persistence" ({{.dat}}) file
> If it hasn't found a coordinator to join with, then the locator may _become_ 
> a coordinator.
> Sometimes this is ok. If no other coordinator is currently running then this 
> behavior is fine. An example is when an [administrator is starting up a brand 
> new 
> cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
>  In that case we want the very first locator we start to become the 
> coordinator.
> But there are a number of situations where there may already be another 
> coordinator running but it cannot be reached:
> * if the administrator/operator is starting up a brand new cluster with 
> multiple locators and…
> ** maybe Geode is running in a managed environment like Kubernetes and the 
> locators hostnames are not (yet) resolvable in DNS
> ** maybe there is a network partition between the starting locators so they 
> can't communicate
> ** maybe the existing locators or coordinator are running very slowly or the 
> network is degraded. This is effectively the same as the network partition 
> just mentioned
> * if a cluster is already running and the administrator/operator wants to 
> scale it up by starting/adding a new locator Geode is susceptible to that 
> same network partition issue
> * if a cluster is already running and the administrator/operator needs to 
> restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
> {{locators}} configuration parameter are reachable (maybe they are not 
> running, or maybe there is a network partition) and…
> ** if the "view persistence" {{.dat}} file is missing or deleted
> ** or if the current set of running Geode members has evolved so far that the 
> coordinates (host+port) in the {{.dat}} file are completely out of date
> In each of those cases, the newly starting locator will become a coordinator 
> and will start producing views. Now we'll have the old coordinator producing 
> views at the same time as the new one.
> *When this ticket is complete*, Geode will offer a locator startup mode (via 
> TBD configuration parameter, {{LocatorLauncher}} startup parameter) that 
> prevents that locator from becoming a coordinator. With that mode, it will be 
> possible for an administrator to avoid many of the problematic scenarios 
> mentioned above, while retaining the ability to start a first locator which 
> is allowed to become a coordinator.
> For purposes of discussion we'll call the startup mode that allows the 
> locator to become a coordinator "seed" mode, and we'll call the new startup 
> mode that prevents the locator from becoming a coordinator before first 
> joining, "join-only" mode.
> To start a brand new cluster, the first locator is started in "seed" mode. 
> After that all subsequent locators are started in "join only" mode. If 
> network partitions occur, the newly started nodes will exit with a failure 
> status, 

[jira] [Created] (GEODE-9681) Explain "view persistence" (.dat) files

2021-10-05 Thread Bill Burcham (Jira)
Bill Burcham created GEODE-9681:
---

 Summary: Explain "view persistence" (.dat) files
 Key: GEODE-9681
 URL: https://issues.apache.org/jira/browse/GEODE-9681
 Project: Geode
  Issue Type: Improvement
  Components: docs, membership
Affects Versions: 1.15.0
Reporter: Bill Burcham


The [javadoc 
describes|https://geode.apache.org/releases/latest/javadoc/org/apache/geode/distributed/Locator.html]
 the view persistence ({{.dat}}) file:

{noformat}
Locators persist membership information in a locatorXXXview.dat file. This file 
is used to recover information about the cluster when a locator starts if there 
are no other currently running locators. It allows the restarted locator to 
rejoin the cluster.
{noformat}

When this ticket is complete, the [Running Geode Locator 
Processes|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html]
 page will mention the view persistence {{.dat}} file and explain its purpose. 
Also, the page will highlight the split-brain vulnerabilities (see #GEODE-9680) 
associated with missing or stale {{.dat}} files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GEODE-9680) Newly Started Locators are Susceptible to Split-Brains

2021-10-05 Thread Bill Burcham (Jira)
Bill Burcham created GEODE-9680:
---

 Summary: Newly Started Locators are Susceptible to Split-Brains
 Key: GEODE-9680
 URL: https://issues.apache.org/jira/browse/GEODE-9680
 Project: Geode
  Issue Type: Bug
  Components: membership
Affects Versions: 1.15.0
Reporter: Bill Burcham


Geode is built on the assumption that views progress linearly in a sequence. If 
that sequence ever forks into two or more parallel lines then we have a "split 
brain".

In a split brain condition, each of the parallel views is independent. It's as 
if you have more than one system running concurrently. It's possible e.g. for 
some clients to connect to members of one view and other clients to connect to 
members of another view. Updates to members in one view are not seen by members 
of a parallel view.

Geode views are produced by a coordinator. As long as only a single coordinator 
is running, there is no possibility of a split brain. Split brain arises when 
more than one coordinator is producing views at the same time.

Each Geode member (peer) is started with the {{locators}} configuration 
parameter. That parameter specifies locator(s) to use to find the (already 
running!) coordinator (member) to join with.

When a locator (member) starts, it goes through this sequence to find the 
coordinator:

# it first tries to find the coordinator through one of the (other) configured 
locators
# if it can't contact any of those, it tries contacting non-locator (cache 
server) members it has retrieved from the "view persistence" ({{.dat}}) file

If it hasn't found a coordinator to join with, then the locator may _become_ a 
coordinator.

Sometimes this is ok. If no other coordinator is currently running then this 
behavior is fine. An example is when an [administrator is starting up a brand 
new 
cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
 In that case we want the very first locator we start to become the coordinator.

But there are a number of situations where there may already be another 
coordinator running but it cannot be reached:

* if the administrator/operator is starting up a brand new cluster with 
multiple locators and…
** maybe Geode is running in a managed environment like Kubernetes and the 
locators' hostnames are not (yet) resolvable in DNS
** maybe there is a network partition between the starting locators so they 
can't communicate
** maybe the existing locators or coordinator are running very slowly or the 
network is degraded. This is effectively the same as the network partition just 
mentioned
* if a cluster is already running and the administrator/operator wants to scale 
it up by starting/adding a new locator, Geode is susceptible to that same 
network partition issue
* if a cluster is already running and the administrator/operator needs to 
restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
{{locators}} configuration parameter are reachable (maybe they are not running, 
or maybe there is a network partition) and…
** if the "view persistence" {{.dat}} file is missing or deleted
** or if the current set of running Geode members has evolved so far that the 
coordinates (host+port) in the {{.dat}} file are completely out of date

In each of those cases, the newly starting locator will become a coordinator 
and will start producing views. Now we'll have the old coordinator producing 
views at the same time as the new one.

*When this ticket is complete*, Geode will offer a locator startup mode (via 
TBD configuration parameter, {{LocatorLauncher}} startup parameter) that 
prevents that locator from becoming a coordinator. With that mode, it will be 
possible for an administrator to avoid many of the problematic scenarios 
mentioned above, while retaining the ability to start a first locator which is 
allowed to become a coordinator.

For purposes of discussion we'll call the startup mode that allows the locator 
to become a coordinator "seed" mode, and we'll call the new startup mode that 
prevents the locator from becoming a coordinator before first joining, 
"join-only" mode.

To start a brand new cluster, the first locator is started in "seed" mode. 
After that all subsequent locators are started in "join only" mode. If network 
partitions occur, the newly started nodes will exit with a failure status, but 
will not become coordinators.

To add a locator to a running cluster, it will be started in "join only" mode. 
It will similarly either join with an existing coordinator or exit with a 
failure status, thereby avoiding split brains.

When restarting a locator, e.g. during a rolling upgrade, it will be restarted 
in "join only" mode. If a network partition is encountered, or the {{.dat}} 
file is missing or stale, the locator will exit with a failure status and split 
brain will be avoided.
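
A sketch of how this might look from {{LocatorLauncher}}; the member names, 
ports, and locators list below are assumptions, and the join-only option itself 
is still TBD per this ticket:
{code:java}
import org.apache.geode.distributed.LocatorLauncher;

public class SeedLocatorSketch {
  public static void main(String[] args) {
    // First locator of a brand new cluster: today's behavior, which this ticket
    // calls "seed" mode. It may become the coordinator if it finds no one to join.
    LocatorLauncher seed = new LocatorLauncher.Builder()
        .setMemberName("locator1")
        .setPort(10334)
        .set("locators", "locator1[10334],locator2[10335]")
        .build();
    seed.start();

    // When this ticket is implemented, every subsequent locator would be started
    // the same way but with the (TBD) join-only option, so that a locator which
    // cannot reach an existing coordinator exits with a failure status instead
    // of becoming a coordinator itself.
  }
}
{code}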



--
This message 

[jira] [Updated] (GEODE-9680) Newly Started Locators are Susceptible to Split-Brains

2021-10-05 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9680:
-
Labels: needsTriage  (was: )

> Newly Started Locators are Susceptible to Split-Brains
> --
>
> Key: GEODE-9680
> URL: https://issues.apache.org/jira/browse/GEODE-9680
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Bill Burcham
>Priority: Major
>  Labels: needsTriage
>
> Geode is built on the assumption that views progress linearly in a sequence. 
> If that sequence ever forks into two or more parallel lines then we have a 
> "split brain".
> In a split brain condition, each of the parallel views are independent. It's 
> as if you have more than one system running concurrently. It's possible e.g. 
> for some clients to connect to members of one view and other clients to 
> connect to members of another view. Updates to members in one view are not 
> seen by members of a parallel view.
> Geode views are produced by a coordinator. As long as only a single 
> coordinator is running, there is no possibility of a split brain. Split brain 
> arises when more than one coordinator is producing views at the same time.
> Each Geode member (peer) is started with the {{locators}} configuration 
> parameter. That parameter specifies locator(s) to use to find the (already 
> running!) coordinator (member) to join with.
> When a locator (member) starts, it goes through this sequence to find the 
> coordinator:
> # it first tries to find the coordinator through one of the (other) 
> configured locators
> # if it can't contact any of those, it tries contacting non-locator (cache 
> server) members it has retrieved from the "view persistence" ({{.dat}}) file
> If it hasn't found a coordinator to join with, then the locator may _become_ 
> a coordinator.
> Sometimes this is ok. If no other coordinator is currently running then this 
> behavior is fine. An example is when an [administrator is starting up a brand 
> new 
> cluster|https://geode.apache.org/docs/guide/114/configuring/running/running_the_locator.html].
>  In that case we want the very first locator we start to become the 
> coordinator.
> But there are a number of situations where there may already be another 
> coordinator running but it cannot be reached:
> * if the administrator/operator is starting up a brand new cluster with 
> multiple locators and…
> ** maybe Geode is running in a managed environment like Kubernetes and the 
> locators hostnames are not (yet) resolvable in DNS
> ** maybe there is a network partition between the starting locators so they 
> can't communicate
> ** maybe the existing locators or coordinator are running very slowly or the 
> network is degraded. This is effectively the same as the network partition 
> just mentioned
> * if a cluster is already running and the administrator/operator wants to 
> scale it up by starting/adding a new locator Geode is susceptible to that 
> same network partition issue
> * if a cluster is already running and the administrator/operator needs to 
> restart a locator, e.g. for a rolling upgrade, if none of the locators in the 
> {{locators}} configuration parameter are reachable (maybe they are not 
> running, or maybe there is a network partition) and…
> ** if the "view persistence" {{.dat}} file is missing or deleted
> ** or if the current set of running Geode members has evolved so far that the 
> coordinates (host+port) in the {{.dat}} file are completely out of date
> In each of those cases, the newly starting locator will become a coordinator 
> and will start producing views. Now we'll have the old coordinator producing 
> views at the same time as the new one.
> *When this ticket is complete*, Geode will offer a locator startup mode (via 
> TBD configuration parameter, {{LocatorLauncher}} startup parameter) that 
> prevents that locator from becoming a coordinator. With that mode, it will be 
> possible for an administrator to avoid many of the problematic scenarios 
> mentioned above, while retaining the ability to start a first locator which 
> is allowed to become a coordinator.
> For purposes of discussion we'll call the startup mode that allows the 
> locator to become a coordinator "seed" mode, and we'll call the new startup 
> mode that prevents the locator from becoming a coordinator before first 
> joining, "join-only" mode.
> To start a brand new cluster, the first locator is started in "seed" mode. 
> After that all subsequent locators are started in "join only" mode. If 
> network partitions occur, the newly started nodes will exit with a failure 
> status, but will not become coordinators.
> To add a locator to a running cluster, it will be started in "join only" 
> mode. It 

[jira] [Commented] (GEODE-9340) Benchmark instability in PartitionedPutLongBenchmark

2021-10-05 Thread Geode Integration (Jira)

[ 
https://issues.apache.org/jira/browse/GEODE-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424717#comment-17424717
 ] 

Geode Integration commented on GEODE-9340:
--

Seen in [benchmark-base 
#241|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/241].

> Benchmark instability in PartitionedPutLongBenchmark
> 
>
> Key: GEODE-9340
> URL: https://issues.apache.org/jira/browse/GEODE-9340
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 1.15.0
>Reporter: Sarah Abbey
>Assignee: Hale Bales
>Priority: Major
>  Labels: pull-request-available
>
> PartitionedPutLongBenchmark failed in CI 
> (https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/6):
> {code:java}
> This is ITERATION 1 of benchmarking against baseline.
>   P2pPartitionedGetBenchmark avg ops/sec  
> Baseline:825011.38  Test:835847.67  Difference:   +1.3%
>  avg latency  
> Baseline:871392.31  Test:859444.66  Difference:   -1.4%
>   P2pPartitionedPutBenchmark avg ops/sec  
> Baseline:123838.43  Test:122686.30  Difference:   -0.9%
>  avg latency  
> Baseline:   6015719.73  Test:   6119472.19  Difference:   +1.7%
>  P2pPartitionedPutBytesBenchmark avg ops/sec  
> Baseline:174887.77  Test:171040.93  Difference:   -2.2%
>  avg latency  
> Baseline:   4145337.60  Test:   4236159.60  Difference:   +2.2%
>    PartitionedFunctionExecutionBenchmark avg ops/sec  
> Baseline:248635.36  Test:261498.94  Difference:   +5.2%
>  avg latency  
> Baseline:867122.63  Test:824550.34  Difference:   -4.9%
>   PartitionedFunctionExecutionWithArgumentsBenchmark avg ops/sec  
> Baseline:280071.19  Test:275305.31  Difference:   -1.7%
>  avg latency  
> Baseline:   1026643.12  Test:   1044307.43  Difference:   +1.7%
> PartitionedFunctionExecutionWithFiltersBenchmark avg ops/sec  
> Baseline:301416.23  Test:304317.30  Difference:   +1.0%
>  avg latency  
> Baseline:   1908390.88  Test:   1890040.46  Difference:   -1.0%
>  PartitionedGetBenchmark avg ops/sec  
> Baseline:790800.52  Test:784514.74  Difference:   -0.8%
>  avg latency  
> Baseline:908357.58  Test:915790.96  Difference:   +0.8%
>  PartitionedGetLongBenchmark avg ops/sec  
> Baseline:   1020821.32  Test:996529.93  Difference:   -2.4%
>  avg latency  
> Baseline:703761.09  Test:720744.36  Difference:   +2.4%
>    PartitionedGetStringBenchmark avg ops/sec  
> Baseline:   1028992.93  Test:   1010447.47  Difference:   -1.8%
>  avg latency  
> Baseline:698009.55  Test:710765.29  Difference:   +1.8%
> PartitionedIndexedQueryBenchmark avg ops/sec  
> Baseline: 30868.78  Test: 31478.90  Difference:   +2.0%
>  avg latency  
> Baseline:  18670093.21  Test:  18278083.16  Difference:   -2.1%
>  PartitionedNonIndexedQueryBenchmark avg ops/sec  
> Baseline:99.45  Test:   101.97  Difference:   +2.5%
>  avg latency  
> Baseline: 723415530.75  Test: 705653061.86  Difference:   -2.5%
>   PartitionedPutAllBenchmark avg ops/sec  
> Baseline:  7921.61  Test:  7816.66  Difference:   -1.3%
>  avg latency  
> Baseline:  18172638.37  Test:  18416169.28  Difference:   +1.3%
>   PartitionedPutAllLongBenchmark avg ops/sec  
> Baseline:  1379.53  Test:  1169.16  Difference:  -15.2%
>  avg latency  
> Baseline: 105140260.44  Test: 123722914.94  Difference:  +17.7%
>  PartitionedPutBenchmark avg ops/sec  
> Baseline:474986.11  Test:467924.19  Difference:   -1.5%
>  

[jira] [Created] (GEODE-9679) Improve Test Coverage for SingleResultCollector

2021-10-05 Thread Wayne (Jira)
Wayne created GEODE-9679:


 Summary: Improve Test Coverage for SingleResultCollector
 Key: GEODE-9679
 URL: https://issues.apache.org/jira/browse/GEODE-9679
 Project: Geode
  Issue Type: Test
  Components: redis
Reporter: Wayne


The org.apache.geode.redis.internal.executor.SingleResultCollector class has 
insufficient test coverage, which must be improved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-10-05 Thread Anilkumar Gingade (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Gingade updated GEODE-9645:
-
Labels: GeodeOperationAPI pull-request-available  (was: 
pull-request-available)

> MultiUserAuth: DataSerializerRecoveryListener is called without auth 
> information. Promptly fails
> 
>
> Key: GEODE-9645
> URL: https://issues.apache.org/jira/browse/GEODE-9645
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
>
> When multiuserSecureModeEnabled is enabled, a user may register a 
> DataSerializer. When the endpoint manager detects a new endpoint, it will 
> attempt to register the data serializers with other machines. This is a 
> problem because there is no authentication information in the background 
> process to authenticate with. Hence the error seen below.
>  
> {noformat}
> [warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
> DataSerializerRecoveryTask - Error recovering dataSerializers: 
> java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
> when multiuser-secure-mode-enabled is set to true. 
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
>  
> at 
> org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){noformat}
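
(A minimal sketch, not taken from the ticket, of a multi-user client that 
exercises this path; the locator address, credentials, and the serializer class 
are assumptions:)
{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Properties;
import java.util.UUID;
import org.apache.geode.DataSerializer;
import org.apache.geode.cache.RegionService;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;

public class MultiUserClientSketch {
  public static void main(String[] args) {
    // Pool with multiuser-secure-mode enabled; locator host/port are assumptions.
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("localhost", 10334)
        .setPoolMultiuserAuthentication(true)
        .create();

    // Registering a DataSerializer is what the background
    // DataSerializerRecoveryTask later tries to replay on new endpoints,
    // where it has no user credentials and fails as shown above.
    DataSerializer.register(UuidSerializer.class);

    // Per-user operations go through an authenticated view.
    Properties credentials = new Properties();
    credentials.setProperty("security-username", "user1");
    credentials.setProperty("security-password", "user1-password");
    RegionService userView = cache.createAuthenticatedView(credentials);
  }

  // Trivial serializer used only so the sketch compiles and runs.
  public static class UuidSerializer extends DataSerializer {
    @Override
    public int getId() {
      return 42;
    }

    @Override
    public Class<?>[] getSupportedClasses() {
      return new Class<?>[] {UUID.class};
    }

    @Override
    public boolean toData(Object o, DataOutput out) throws IOException {
      out.writeUTF(o.toString());
      return true;
    }

    @Override
    public Object fromData(DataInput in) throws IOException, ClassNotFoundException {
      return UUID.fromString(in.readUTF());
    }
  }
}
{code}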



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9666:


Assignee: Aaron Lindsey

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
>   at 
> 

[jira] [Updated] (GEODE-9678) Improve Test Coverage for redis.internal.proxy Package

2021-10-05 Thread Wayne (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wayne updated GEODE-9678:
-
Component/s: redis

> Improve Test Coverage for redis.internal.proxy Package
> --
>
> Key: GEODE-9678
> URL: https://issues.apache.org/jira/browse/GEODE-9678
> Project: Geode
>  Issue Type: Test
>  Components: redis
>Reporter: Wayne
>Priority: Major
> Attachments: redis.internal.proxy.pdf
>
>
> The org.apache.geode.redis.internal.proxy package requires additional test 
> coverage because the coverage statistics are inadequate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9678) Improve Test Coverage for redis.internal.proxy Package

2021-10-05 Thread Wayne (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wayne updated GEODE-9678:
-
Attachment: redis.internal.proxy.pdf

> Improve Test Coverage for redis.internal.proxy Package
> --
>
> Key: GEODE-9678
> URL: https://issues.apache.org/jira/browse/GEODE-9678
> Project: Geode
>  Issue Type: Test
>Reporter: Wayne
>Priority: Major
> Attachments: redis.internal.proxy.pdf
>
>
> The org.apache.geode.redis.internal.proxy package requires additional test 
> coverage because the coverage statistics are inadequate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GEODE-9678) Improve Test Coverage for redis.internal.proxy Package

2021-10-05 Thread Wayne (Jira)
Wayne created GEODE-9678:


 Summary: Improve Test Coverage for redis.internal.proxy Package
 Key: GEODE-9678
 URL: https://issues.apache.org/jira/browse/GEODE-9678
 Project: Geode
  Issue Type: Test
Reporter: Wayne


The org.apache.geode.redis.internal.proxy package requires additional test 
coverage because the coverage statistics are inadequate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9666:


Assignee: (was: Aaron Lindsey)

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
>   at 
> 

[jira] [Commented] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424702#comment-17424702
 ] 

Aaron Lindsey commented on GEODE-9666:
--

I no longer see the {{NoAvailableLocatorsException}} when running our test with 
this change: 
[https://github.com/apache/geode/compare/develop...aaronlindsey:GEODE-9666-NoAvailableLocatorsException?expand=1]

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at 
> 

[jira] [Updated] (GEODE-9676) Limit Radish RESP bulk input sizes for unauthenticated connections

2021-10-05 Thread Wayne (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wayne updated GEODE-9676:
-
Labels: redis  (was: )

> Limit Radish RESP bulk input sizes for unauthenticated connections
> --
>
> Key: GEODE-9676
> URL: https://issues.apache.org/jira/browse/GEODE-9676
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>  Labels: redis
>
> Redis recently implemented a fix in response to a CVE that allows 
> unauthenticated users to craft RESP requests which consume a lot of memory. 
> Our implementation suffers from the same problem.
> For example, a command input starting with `*` would result in the 
> JVM trying to allocate an array of size `MAX_INT`. 
> We need to be able to provide the same safeguards as Redis does.
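
The general shape of such a safeguard, sketched outside of any actual decoder 
code and with an assumed limit:
{code:java}
public class RespHeaderGuardSketch {
  // Cap applied while a connection is unauthenticated; the value is an
  // assumption for illustration, not the limit Redis or Geode uses.
  private static final int UNAUTHENTICATED_MAX_ARRAY_LENGTH = 1024;

  // Parses a RESP array header such as "*1000000000\r\n" and rejects oversized
  // declared lengths instead of pre-allocating an array of that size.
  public static int parseArrayHeader(String headerLine, boolean authenticated) {
    String header = headerLine.trim();
    if (!header.startsWith("*")) {
      throw new IllegalArgumentException("not an array header: " + header);
    }
    long declared = Long.parseLong(header.substring(1));
    if (!authenticated && declared > UNAUTHENTICATED_MAX_ARRAY_LENGTH) {
      throw new IllegalArgumentException(
          "declared array length " + declared + " exceeds unauthenticated limit");
    }
    return (int) declared;
  }

  public static void main(String[] args) {
    // A hostile header on an unauthenticated connection is rejected up front.
    try {
      parseArrayHeader("*2147483647\r\n", false);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
{code}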



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9600) Add Programmatic Cluster Support to NetCore.Test

2021-10-05 Thread Michael Martell (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Martell reassigned GEODE-9600:
--

Assignee: Michael Martell

> Add Programmatic Cluster Support to NetCore.Test
> 
>
> Key: GEODE-9600
> URL: https://issues.apache.org/jira/browse/GEODE-9600
> Project: Geode
>  Issue Type: Test
>  Components: native client
>Reporter: Michael Martell
>Assignee: Michael Martell
>Priority: Major
>
> We just need to bring a few files over from the new clicache test framework 
> (clicache/integration-test2):
> * Cluster.cs
> * Gfsh.cs
> * GfshExecute.cs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9676) Limit Radish RESP bulk input sizes for unauthenticated connections

2021-10-05 Thread Jens Deppe (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jens Deppe updated GEODE-9676:
--
Issue Type: Improvement  (was: Test)

> Limit Radish RESP bulk input sizes for unauthenticated connections
> --
>
> Key: GEODE-9676
> URL: https://issues.apache.org/jira/browse/GEODE-9676
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>
> Redis recently implemented a fix in response to a CVE that allows 
> unauthenticated users to craft RESP requests which consume a lot of memory. 
> Our implementation suffers from the same problem.
> For example, a command input starting with `*` would result in the 
> JVM trying to allocate an array of size `MAX_INT`. 
> We need to be able to provide the same safeguards as Redis does.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9640) When cluster shut down completely and restarted, new operations may be lost on client

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424653#comment-17424653
 ] 

ASF subversion and git services commented on GEODE-9640:


Commit 4b3c49e788157df94f7d3e4b455adb7a6eaef96b in geode's branch 
refs/heads/develop from Eric Shu
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=4b3c49e ]

GEODE-9640: Initiate threadId in EventID. (#6905)

  * This is to make sure a new EventID can be generated after server restarted
after a whole cluster is shut down.

 * Wrap around original threadID before it interferes with bulkOp or wan 
generated threadID.


> When cluster shut down completely and restarted, new operations may be lost 
> on client
> -
>
> Key: GEODE-9640
> URL: https://issues.apache.org/jira/browse/GEODE-9640
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Reporter: Eric Shu
>Assignee: Eric Shu
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> In Geode, the client keeps track of events received based on EventID. If 
> duplicate events are received from the server, they are thrown away.
> The current EventID takes parts of the membership ID information, and it seems 
> inadequate if the whole cluster is down. (The coordinator is down and the 
> member viewId will start from 0 again.) This can lead to the same event IDs 
> being generated, causing clients to miss CQ events, etc.
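
A simplified sketch (not Geode's actual classes) of this client-side 
deduplication, showing why reused ids after a full cluster restart cause new 
events to be dropped:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class EventIdDedupSketch {
  // Highest sequenceId seen per (memberId, threadId) source; a real
  // implementation would need an atomic check-and-update, this is illustrative.
  private final Map<String, Long> highestSeen = new ConcurrentHashMap<>();

  // Returns true if the event is new, false if it looks like a duplicate.
  public boolean accept(String memberId, long threadId, long sequenceId) {
    String key = memberId + ":" + threadId;
    Long previous = highestSeen.get(key);
    if (previous != null && sequenceId <= previous) {
      return false; // treated as a duplicate and thrown away
    }
    highestSeen.put(key, sequenceId);
    return true;
  }

  public static void main(String[] args) {
    EventIdDedupSketch tracker = new EventIdDedupSketch();
    // Events received before the full cluster shutdown.
    System.out.println(tracker.accept("server1:vid0", 1, 100)); // true
    System.out.println(tracker.accept("server1:vid0", 1, 101)); // true
    // After a full restart the member/view/thread ids can repeat, so a genuinely
    // new event presents an EventID the client has already seen and is dropped.
    System.out.println(tracker.accept("server1:vid0", 1, 100)); // false (lost event)
  }
}
{code}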



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9658) CI Failure: AuthExpirationDUnitTest > newClient_registeredInterest_slowReAuth_policyNone_durableClient failed due to

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424649#comment-17424649
 ] 

Geode Integration commented on GEODE-9658:
--

Seen in [upgrade-test-openjdk8 
#247|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/upgrade-test-openjdk8/builds/247]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0542/test-results/upgradeTest/1633458837/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0542/test-artifacts/1633458837/upgradetestfiles-openjdk8-1.15.0-build.0542.tgz].

> CI Failure: AuthExpirationDUnitTest > 
> newClient_registeredInterest_slowReAuth_policyNone_durableClient failed due 
> to 
> -
>
> Key: GEODE-9658
> URL: https://issues.apache.org/jira/browse/GEODE-9658
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Donal Evans
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: pull-request-available
>
> {noformat}
> AuthExpirationDUnitTest > 
> newClient_registeredInterest_slowReAuth_policyNone_durableClient[10240.0.0] 
> FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.security.AuthExpirationDUnitTest$$Lambda$756/0x0001007c6440.run
>  in VM 0 running on Host 
> heavy-lifter-4baf3206-8afb-569f-b74b-b22341f348ed.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:94)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.newClient_registeredInterest_slowReAuth_policyNone_durableClient(AuthExpirationDUnitTest.java:629)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.security.AuthExpirationDUnitTest that uses 
> org.apache.geode.cache.Region 
> Expected size: 100 but was: 95 in:
> ["key93",
>...
> "key55"] within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$newClient_registeredInterest_slowReAuth_policyNone_durableClient$bb17a952$2(AuthExpirationDUnitTest.java:631)
> Caused by:
> java.lang.AssertionError: 
> Expected size: 100 but was: 95 in:
> ["key93",
>...
> "key55"]
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$null$17(AuthExpirationDUnitTest.java:632)
> 1502 tests completed, 1 failed, 110 skipped
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0526/test-results/upgradeTest/1632961901/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0526/test-artifacts/1632961901/upgradetestfiles-openjdk11-1.15.0-build.0526.tgz



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9677) CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424646#comment-17424646
 ] 

Geode Integration commented on GEODE-9677:
--

Seen in [distributed-test-openjdk8 
#1973|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1973]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-results/distributedTest/1633203681/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-artifacts/1633203681/distributedtestfiles-openjdk8-1.15.0-build.0536.tgz].

> CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> 
>
> Key: GEODE-9677
> URL: https://issues.apache.org/jira/browse/GEODE-9677
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11846626
> {code:java}
> GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> java.lang.AssertionError: expected:<0> but was:<1>
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.geode.internal.cache.GIIDeltaDUnitTest.testExpiredTombstoneSkippedGC(GIIDeltaDUnitTest.java:1534)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9677) CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED

2021-10-05 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9677:
-
Labels: needsTriage  (was: )

> CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> 
>
> Key: GEODE-9677
> URL: https://issues.apache.org/jira/browse/GEODE-9677
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11846626
> {code:java}
> GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
> java.lang.AssertionError: expected:<0> but was:<1>
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:633)
> at 
> org.apache.geode.internal.cache.GIIDeltaDUnitTest.testExpiredTombstoneSkippedGC(GIIDeltaDUnitTest.java:1534)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GEODE-9677) CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED

2021-10-05 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-9677:


 Summary: CI: GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC 
FAILED
 Key: GEODE-9677
 URL: https://issues.apache.org/jira/browse/GEODE-9677
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


https://hydradb.hdb.gemfire-ci.info/hdb/testresult/11846626

{code:java}
GIIDeltaDUnitTest > testExpiredTombstoneSkippedGC FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at 
org.apache.geode.internal.cache.GIIDeltaDUnitTest.testExpiredTombstoneSkippedGC(GIIDeltaDUnitTest.java:1534)

{code}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8045) CI failure: PersistentRecoveryOrderOldConfigDUnitTest.testCrashDuringPreparePersistentId

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424644#comment-17424644
 ] 

Geode Integration commented on GEODE-8045:
--

Seen in [distributed-test-openjdk8 
#1923|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1923]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-results/distributedTest/1633162937/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-artifacts/1633162937/distributedtestfiles-openjdk8-1.15.0-build.0536.tgz].

> CI failure: 
> PersistentRecoveryOrderOldConfigDUnitTest.testCrashDuringPreparePersistentId
> 
>
> Key: GEODE-8045
> URL: https://issues.apache.org/jira/browse/GEODE-8045
> Project: Geode
>  Issue Type: Bug
>Reporter: Jianxia Chen
>Priority: Major
>  Labels: ci-failure
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/101]
> org.apache.geode.internal.cache.persistence.PersistentRecoveryOrderOldConfigDUnitTest
>  > testCrashDuringPreparePersistentId FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1428
> [fatal 2020/04/29 00:13:35.107 GMT  
> tid=1101] Uncaught exception in thread Thread[Geode Failure Detection thread 
> 5,5,RMI Runtime]
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor$$Lambda$417/1890661885@6752fe6c
>  rejected from java.util.concurrent.ThreadPoolExecutor@636d665e[Shutting 
> down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 
> 8]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor.checkIfAvailable(GMSHealthMonitor.java:1297)
>   at 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor.processMessage(GMSHealthMonitor.java:1233)
>   at 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor.sendSuspectRequest(GMSHealthMonitor.java:1481)
>   at 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor.initiateSuspicion(GMSHealthMonitor.java:480)
>   at 
> org.apache.geode.distributed.internal.membership.gms.fd.GMSHealthMonitor.lambda$checkMember$1(GMSHealthMonitor.java:464)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
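
If the RejectedExecutionException thrown while membership shuts down turns out to be
expected for this scenario, one possible direction (a sketch only, not a committed fix,
and the exact placement inside testCrashDuringPreparePersistentId is an assumption) is
to register the string with DUnit's IgnoredException around the forced crash:
{code:java}
import org.apache.geode.test.dunit.IgnoredException;

// Sketch: suppress the suspect-string check for the expected
// RejectedExecutionException raised while the failure-detection
// thread pool is shutting down. Placement is hypothetical.
IgnoredException ignored =
    IgnoredException.addIgnoredException("RejectedExecutionException");
try {
  // ... induce the crash and run the recovery assertions ...
} finally {
  ignored.remove();
}
{code}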



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GEODE-9676) Limit Radish RESP bulk input sizes for unauthenticated connections

2021-10-05 Thread Jens Deppe (Jira)
Jens Deppe created GEODE-9676:
-

 Summary: Limit Radish RESP bulk input sizes for unauthenticated 
connections
 Key: GEODE-9676
 URL: https://issues.apache.org/jira/browse/GEODE-9676
 Project: Geode
  Issue Type: Test
  Components: redis
Reporter: Jens Deppe


Redis recently implemented a fix in response to a CVE that allows unauthenticated 
users to craft RESP requests which consume a lot of memory. Our implementation 
suffers from the same problem.

For example, a command input starting with `*` and a very large declared element 
count would result in the JVM trying to allocate an array of up to `MAX_INT` elements. 

We need to be able to provide the same safeguards as Redis does.
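
A minimal sketch of the kind of safeguard meant here (the class, constant names, and
concrete limits are assumptions for illustration, not the actual geode-for-redis code):
validate the element count declared in a RESP array header before allocating anything,
and apply a much smaller cap until the connection has authenticated.
{code:java}
// Sketch only: caps a declared RESP array ("*<count>\r\n") size.
public final class RespInputLimiter {
  private static final int MAX_UNAUTHENTICATED_ARRAY_SIZE = 64;        // assumed limit
  private static final int MAX_AUTHENTICATED_ARRAY_SIZE = 1024 * 1024; // assumed limit

  /** Validates a declared element count before any allocation happens. */
  public static int checkArraySize(long declaredSize, boolean authenticated) {
    long limit = authenticated ? MAX_AUTHENTICATED_ARRAY_SIZE : MAX_UNAUTHENTICATED_ARRAY_SIZE;
    if (declaredSize < 0 || declaredSize > limit) {
      throw new IllegalArgumentException("invalid multibulk length: " + declaredSize);
    }
    return (int) declaredSize;
  }
}
{code}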



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9568) Geode User Guide build script: allow doc to be retained

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424622#comment-17424622
 ] 

ASF subversion and git services commented on GEODE-9568:


Commit 1c4d8f3a896657a63668f95a67a3632df1b46be6 in geode's branch 
refs/heads/support/1.14 from Dave Barnes
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=1c4d8f3 ]

GEODE-9628: Documentation fix for setting credentials in client (#6928)

* GEODE-9568 Documentation fix for setting credentials in client (format & 
style)


> Geode User Guide build script: allow doc to be retained
> ---
>
> Key: GEODE-9568
> URL: https://issues.apache.org/jira/browse/GEODE-9568
> Project: Geode
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 1.13.4
>Reporter: Dave Barnes
>Assignee: Alberto Bustamante Reyes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.1, 1.15.0
>
>
> The user guide build script allows the user to build and preview the user 
> guide.
> When the script completes, it always deletes the user guide.
> Allow the user to keep the compiled guide.
> Relevant files:
> dev-tools/docker/docs/preview-user-guide.sh
> geode-book/README.md



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9628) Documentation fix for setting credentials in a client

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424621#comment-17424621
 ] 

ASF subversion and git services commented on GEODE-9628:


Commit 1c4d8f3a896657a63668f95a67a3632df1b46be6 in geode's branch 
refs/heads/support/1.14 from Dave Barnes
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=1c4d8f3 ]

GEODE-9628: Documentation fix for setting credentials in client (#6928)

* GEODE-9568 Documentation fix for setting credentials in client (format & 
style)


> Documentation fix for setting credentials in a client
> -
>
> Key: GEODE-9628
> URL: https://issues.apache.org/jira/browse/GEODE-9628
> Project: Geode
>  Issue Type: Bug
>  Components: docs
>Reporter: Nabarun Nag
>Assignee: Dave Barnes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.1, 1.15.0
>
>
> Current:
> {noformat}
> How a Client Cache Sets Its CredentialIn order to connect with a locator or a 
> server that does authentication, a client will need to set its credential, 
> composed of the two properties security-username and security-password. 
> Choose one of these two ways to accomplish this:{noformat}
> The last line needs to be changed to state that both steps are required, instead 
> of choosing one of the two steps.
>  
>  
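
For context, a minimal client-side sketch showing the two credential properties being
supplied together (the locator host/port are placeholders, and this does not reproduce
the two documented configuration steps the ticket refers to):
{code:java}
import java.util.Properties;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;

// Sketch: the client supplies BOTH security-username and security-password.
Properties props = new Properties();
props.setProperty("security-username", "clientUser");
props.setProperty("security-password", "clientPassword");

ClientCache cache = new ClientCacheFactory(props)
    .addPoolLocator("locator-host", 10334)   // placeholder endpoint
    .create();
{code}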



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9675) CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424615#comment-17424615
 ] 

Geode Integration commented on GEODE-9675:
--

Seen in [distributed-test-openjdk8 
#1983|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1983]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-results/distributedTest/1633211922/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-artifacts/1633211922/distributedtestfiles-openjdk8-1.15.0-build.0536.tgz].

> CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> -
>
> Key: GEODE-9675
> URL: https://issues.apache.org/jira/browse/GEODE-9675
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1983
> {code:java}
> ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> org.apache.geode.SystemConnectException: Problem starting up membership 
> services
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:186)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:466)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:499)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:328)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:757)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:133)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3013)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:283)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:209)
> at 
> org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:159)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:180)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:256)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManagerDUnitTest.testConnectAfterBeingShunned(ClusterDistributionManagerDUnitTest.java:170)
> Caused by:
> 
> org.apache.geode.distributed.internal.membership.api.MemberStartupException: 
> unable to create jgroups channel
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:401)
> at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:203)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1642)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
> ... 13 more
> Caused by:
> java.lang.Exception: failed to open a port in range 41003-41003
> at 
> org.jgroups.protocols.UDP.createMulticastSocketWithBindPort(UDP.java:503)
> at org.jgroups.protocols.UDP.createSockets(UDP.java:348)
> at org.jgroups.protocols.UDP.start(UDP.java:266)
> at 
> org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:966)
> at org.jgroups.JChannel.startStack(JChannel.java:889)
> at org.jgroups.JChannel._preConnect(JChannel.java:553)
> at org.jgroups.JChannel.connect(JChannel.java:288)
> at org.jgroups.JChannel.connect(JChannel.java:279)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:397)
> ... 16 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9675) CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED

2021-10-05 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9675:
-
Labels: needsTriage  (was: )

> CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> -
>
> Key: GEODE-9675
> URL: https://issues.apache.org/jira/browse/GEODE-9675
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: needsTriage
>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1983
> {code:java}
> ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
> org.apache.geode.SystemConnectException: Problem starting up membership 
> services
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:186)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:466)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:499)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:328)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:757)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:133)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3013)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:283)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:209)
> at 
> org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:159)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:180)
> at 
> org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:256)
> at 
> org.apache.geode.distributed.internal.ClusterDistributionManagerDUnitTest.testConnectAfterBeingShunned(ClusterDistributionManagerDUnitTest.java:170)
> Caused by:
> 
> org.apache.geode.distributed.internal.membership.api.MemberStartupException: 
> unable to create jgroups channel
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:401)
> at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:203)
> at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1642)
> at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
> ... 13 more
> Caused by:
> java.lang.Exception: failed to open a port in range 41003-41003
> at 
> org.jgroups.protocols.UDP.createMulticastSocketWithBindPort(UDP.java:503)
> at org.jgroups.protocols.UDP.createSockets(UDP.java:348)
> at org.jgroups.protocols.UDP.start(UDP.java:266)
> at 
> org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:966)
> at org.jgroups.JChannel.startStack(JChannel.java:889)
> at org.jgroups.JChannel._preConnect(JChannel.java:553)
> at org.jgroups.JChannel.connect(JChannel.java:288)
> at org.jgroups.JChannel.connect(JChannel.java:279)
> at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:397)
> ... 16 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (GEODE-9675) CI: ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED

2021-10-05 Thread Xiaojian Zhou (Jira)
Xiaojian Zhou created GEODE-9675:


 Summary: CI: ClusterDistributionManagerDUnitTest > 
testConnectAfterBeingShunned FAILED
 Key: GEODE-9675
 URL: https://issues.apache.org/jira/browse/GEODE-9675
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou


https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1983


{code:java}
ClusterDistributionManagerDUnitTest > testConnectAfterBeingShunned FAILED
org.apache.geode.SystemConnectException: Problem starting up membership 
services
at 
org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:186)
at 
org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:466)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.(ClusterDistributionManager.java:499)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:328)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:757)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:133)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3013)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:283)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:209)
at 
org.apache.geode.distributed.DistributedSystem.connect(DistributedSystem.java:159)
at 
org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:180)
at 
org.apache.geode.test.dunit.internal.JUnit4DistributedTestCase.getSystem(JUnit4DistributedTestCase.java:256)
at 
org.apache.geode.distributed.internal.ClusterDistributionManagerDUnitTest.testConnectAfterBeingShunned(ClusterDistributionManagerDUnitTest.java:170)

Caused by:

org.apache.geode.distributed.internal.membership.api.MemberStartupException: 
unable to create jgroups channel
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:401)
at 
org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:203)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1642)
at 
org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
... 13 more

Caused by:
java.lang.Exception: failed to open a port in range 41003-41003
at 
org.jgroups.protocols.UDP.createMulticastSocketWithBindPort(UDP.java:503)
at org.jgroups.protocols.UDP.createSockets(UDP.java:348)
at org.jgroups.protocols.UDP.start(UDP.java:266)
at 
org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:966)
at org.jgroups.JChannel.startStack(JChannel.java:889)
at org.jgroups.JChannel._preConnect(JChannel.java:553)
at org.jgroups.JChannel.connect(JChannel.java:288)
at org.jgroups.JChannel.connect(JChannel.java:279)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.start(JGroupsMessenger.java:397)
... 16 more

{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-6076) CI Failure: WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.CreateGatewaySenderMixedSiteOneCurrentSiteTwo

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424612#comment-17424612
 ] 

Geode Integration commented on GEODE-6076:
--

Seen on support/1.12 in [upgrade-test-openjdk8 
#40|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-support-1-12-main/jobs/upgrade-test-openjdk8/builds/40]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.5-build.0288/test-results/upgradeTest/1633426559/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.5-build.0288/test-artifacts/1633426559/upgradetestfiles-openjdk8-1.12.5-build.0288.tgz].

> CI Failure: 
> WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.CreateGatewaySenderMixedSiteOneCurrentSiteTwo
> 
>
> Key: GEODE-6076
> URL: https://issues.apache.org/jira/browse/GEODE-6076
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Jens Deppe
>Priority: Major
>
> {noformat}
> org.apache.geode.cache.wan.WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo
>  > CreateGatewaySenderMixedSiteOneCurrentSiteTwo[from_v110] FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache.wan.WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo$$Lambda$53/1234278717.call
>  in VM 0 running on Host 1d2a3ce05625 with 7 VMs
> at org.apache.geode.test.dunit.VM.invoke(VM.java:433)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:402)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:384)
> at 
> org.apache.geode.cache.wan.WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.CreateGatewaySenderMixedSiteOneCurrentSiteTwo(WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.java:92)
> Caused by:
> org.apache.geode.management.ManagementException: 
> java.rmi.server.ExportException: Port already in use: 26239; nested exception 
> is: 
>   java.net.BindException: Failed to create server socket on 
> 1d2a3ce05625/172.17.0.33[26239]
> at 
> org.apache.geode.management.internal.ManagementAgent.startAgent(ManagementAgent.java:162)
> at 
> org.apache.geode.management.internal.SystemManagementService.startManager(SystemManagementService.java:429)
> at 
> org.apache.geode.management.internal.beans.ManagementAdapter.handleCacheCreation(ManagementAdapter.java:173)
> at 
> org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:115)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
> at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:606)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1211)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.basicCreate(GemFireCacheImpl.java:797)
> at 
> org.apache.geode.internal.cache.GemFireCacheImpl.create(GemFireCacheImpl.java:783)
> at 
> org.apache.geode.cache.CacheFactory.create(CacheFactory.java:176)
> at 
> org.apache.geode.cache.CacheFactory.create(CacheFactory.java:223)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startCache(InternalLocator.java:652)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startDistributedSystem(InternalLocator.java:639)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startLocator(InternalLocator.java:326)
> at 
> org.apache.geode.distributed.Locator.startLocator(Locator.java:252)
> at 
> org.apache.geode.distributed.Locator.startLocatorAndDS(Locator.java:139)
> at 
> org.apache.geode.cache.wan.WANRollingUpgradeDUnitTest.startLocatorWithJmxManager(WANRollingUpgradeDUnitTest.java:112)
> at 
> org.apache.geode.cache.wan.WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.lambda$CreateGatewaySenderMixedSiteOneCurrentSiteTwo$e0147a59$1(WANRollingUpgradeCreateGatewaySenderMixedSiteOneCurrentSiteTwo.java:92)
> Caused by:
> java.rmi.server.ExportException: Port already in use: 26239; 
> nested exception is: 
>   java.net.BindException: Failed to create server socket on 
> 1d2a3ce05625/172.17.0.33[26239]
> at 
> sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:346)
> at 
> sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:254)
> at 
> 

[jira] [Commented] (GEODE-9340) Benchmark instability in PartitionedPutLongBenchmark

2021-10-05 Thread Geode Integration (Jira)

[ 
https://issues.apache.org/jira/browse/GEODE-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424609#comment-17424609
 ] 

Geode Integration commented on GEODE-9340:
--

Seen in [benchmark-base 
#240|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/240].

> Benchmark instability in PartitionedPutLongBenchmark
> 
>
> Key: GEODE-9340
> URL: https://issues.apache.org/jira/browse/GEODE-9340
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 1.15.0
>Reporter: Sarah Abbey
>Assignee: Hale Bales
>Priority: Major
>  Labels: pull-request-available
>
> PartitionedPutLongBenchmark failed in CI 
> (https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/6):
> {code:java}
> This is ITERATION 1 of benchmarking against baseline.
>   P2pPartitionedGetBenchmark avg ops/sec  
> Baseline:825011.38  Test:835847.67  Difference:   +1.3%
>  avg latency  
> Baseline:871392.31  Test:859444.66  Difference:   -1.4%
>   P2pPartitionedPutBenchmark avg ops/sec  
> Baseline:123838.43  Test:122686.30  Difference:   -0.9%
>  avg latency  
> Baseline:   6015719.73  Test:   6119472.19  Difference:   +1.7%
>  P2pPartitionedPutBytesBenchmark avg ops/sec  
> Baseline:174887.77  Test:171040.93  Difference:   -2.2%
>  avg latency  
> Baseline:   4145337.60  Test:   4236159.60  Difference:   +2.2%
>    PartitionedFunctionExecutionBenchmark avg ops/sec  
> Baseline:248635.36  Test:261498.94  Difference:   +5.2%
>  avg latency  
> Baseline:867122.63  Test:824550.34  Difference:   -4.9%
>   PartitionedFunctionExecutionWithArgumentsBenchmark avg ops/sec  
> Baseline:280071.19  Test:275305.31  Difference:   -1.7%
>  avg latency  
> Baseline:   1026643.12  Test:   1044307.43  Difference:   +1.7%
> PartitionedFunctionExecutionWithFiltersBenchmark avg ops/sec  
> Baseline:301416.23  Test:304317.30  Difference:   +1.0%
>  avg latency  
> Baseline:   1908390.88  Test:   1890040.46  Difference:   -1.0%
>  PartitionedGetBenchmark avg ops/sec  
> Baseline:790800.52  Test:784514.74  Difference:   -0.8%
>  avg latency  
> Baseline:908357.58  Test:915790.96  Difference:   +0.8%
>  PartitionedGetLongBenchmark avg ops/sec  
> Baseline:   1020821.32  Test:996529.93  Difference:   -2.4%
>  avg latency  
> Baseline:703761.09  Test:720744.36  Difference:   +2.4%
>    PartitionedGetStringBenchmark avg ops/sec  
> Baseline:   1028992.93  Test:   1010447.47  Difference:   -1.8%
>  avg latency  
> Baseline:698009.55  Test:710765.29  Difference:   +1.8%
> PartitionedIndexedQueryBenchmark avg ops/sec  
> Baseline: 30868.78  Test: 31478.90  Difference:   +2.0%
>  avg latency  
> Baseline:  18670093.21  Test:  18278083.16  Difference:   -2.1%
>  PartitionedNonIndexedQueryBenchmark avg ops/sec  
> Baseline:99.45  Test:   101.97  Difference:   +2.5%
>  avg latency  
> Baseline: 723415530.75  Test: 705653061.86  Difference:   -2.5%
>   PartitionedPutAllBenchmark avg ops/sec  
> Baseline:  7921.61  Test:  7816.66  Difference:   -1.3%
>  avg latency  
> Baseline:  18172638.37  Test:  18416169.28  Difference:   +1.3%
>   PartitionedPutAllLongBenchmark avg ops/sec  
> Baseline:  1379.53  Test:  1169.16  Difference:  -15.2%
>  avg latency  
> Baseline: 105140260.44  Test: 123722914.94  Difference:  +17.7%
>  PartitionedPutBenchmark avg ops/sec  
> Baseline:474986.11  Test:467924.19  Difference:   -1.5%
>  

[jira] [Updated] (GEODE-9674) Re-auth seems to cause durable clients to lose some events when the server forces the client to disconnect

2021-10-05 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao updated GEODE-9674:
---
Labels: GeodeOperationAPI  (was: )

> Re-auth seems to cause durable clients to lose some events when the server forces 
> the client to disconnect
> --
>
> Key: GEODE-9674
> URL: https://issues.apache.org/jira/browse/GEODE-9674
> Project: Geode
>  Issue Type: Sub-task
>  Components: core
>Reporter: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI
>
> We have several intermittent test failures related to durable clients losing 
> data after the server forces the client to disconnect.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9486) Serialized classes fail to deserialize when validate-serializable-objects is enabled

2021-10-05 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424567#comment-17424567
 ] 

Geode Integration commented on GEODE-9486:
--

Seen in [distributed-test-openjdk8 
#1912|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/1912]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-results/distributedTest/1633155328/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-mass-test-run/1.15.0-build.0536/test-artifacts/1633155328/distributedtestfiles-openjdk8-1.15.0-build.0536.tgz].

> Serialized classes fail to deserialize when validate-serializable-objects is 
> enabled
> 
>
> Key: GEODE-9486
> URL: https://issues.apache.org/jira/browse/GEODE-9486
> Project: Geode
>  Issue Type: Bug
>  Components: serialization
>Affects Versions: 1.12.0, 1.13.0, 1.14.0
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.12.5, 1.13.5, 1.14.1, 1.15.0
>
>
> Serialized classes in geode-serializable (and potentially other geode modules 
> without sanctioned serializable support) fail to deserialize when 
> {{validate-serializable-objects}} is enabled. This bug was caught by 
> {{SessionsAndCrashesDUnitTest}} in geode-apis-compatible-with-redis 
> (GEODE-9485):
> {noformat}
> [fatal 2021/08/04 13:50:57.548 UTC  tid=114] 
> Serialization filter is rejecting class 
> org.apache.geode.internal.serialization.DSFIDNotFoundException
> java.lang.Exception: 
>   at 
> org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
>   at com.sun.proxy.$Proxy26.checkInput(Unknown Source)
>   at 
> java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
>   at 
> java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
>   at 
> java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
>   at 
> java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
>   at 
> java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
> {noformat}
> Any module with a class that may be serialized must implement 
> {{DistributedSystemService}} to provide the list of sanctioned serializables 
> as defined in {{sanctionedDataSerializables.txt}} and a concrete test 
> subclassing {{AnalyzeSerializablesJUnitTestBase}}.
> {{org.apache.geode.internal.serialization.DSFIDNotFoundException}} is in 
> geode-serialization which cannot depend on geode-core which owns 
> {{DistributedSystemService}}. Even if we remove the unused {{void 
> init(InternalDistributedSystem internalDistributedSystem)}} and move it to 
> geode-serialization, {{SerializationDistributedSystemService}} would need to 
> implement {{getSerializationAcceptlist()}} as:
> {noformat}
>   @Override
>   public Collection getSerializationAcceptlist() throws IOException {
> URL sanctionedSerializables = 
> ClassPathLoader.getLatest().getResource(getClass(),
> "sanctioned-geode-gfsh-serializables.txt");
> return InternalDataSerializer.loadClassNames(sanctionedSerializables);
>   }
> {noformat}
> ... which uses {{ClassPathLoader}} and {{InternalDataSerializer}} which live 
> in geode-core.
> This requires moving the classes {{ClassPathLoader}} and 
> {{InternalDataSerializer}} that need to be used within 
> {{getSerializationAcceptlist()}}. 
> {{ClassPathLoader}}  depends on geode deployment:
> {noformat}
> import org.apache.geode.internal.deployment.DeploymentServiceFactory;
> import org.apache.geode.internal.deployment.JarDeploymentService;
> {noformat}
> {{InternalDataSerializer}} gets even more complicated with many dependencies.
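
For reference, the validation that exposes this bug is enabled through documented
configuration properties; a minimal sketch (the extra filter pattern shown here is
purely illustrative):
{code:java}
import java.util.Properties;
import org.apache.geode.distributed.DistributedSystem;

// Sketch: opt in to serialization validation when connecting a member.
Properties config = new Properties();
config.setProperty("validate-serializable-objects", "true");
// Optional allow-list entries beyond the sanctioned serializables (example pattern):
config.setProperty("serializable-object-filter", "com.example.app.**");
DistributedSystem system = DistributedSystem.connect(config);
{code}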



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (GEODE-9628) Documentation fix for setting credentials in a client

2021-10-05 Thread Dave Barnes (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Barnes resolved GEODE-9628.

Fix Version/s: 1.15.0
   1.14.1
   Resolution: Fixed

> Documentation fix for setting credentials in a client
> -
>
> Key: GEODE-9628
> URL: https://issues.apache.org/jira/browse/GEODE-9628
> Project: Geode
>  Issue Type: Bug
>  Components: docs
>Reporter: Nabarun Nag
>Assignee: Dave Barnes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.1, 1.15.0
>
>
> Current:
> {noformat}
> How a Client Cache Sets Its CredentialIn order to connect with a locator or a 
> server that does authentication, a client will need to set its credential, 
> composed of the two properties security-username and security-password. 
> Choose one of these two ways to accomplish this:{noformat}
> The last line needs to be changed to state that both steps are required, instead 
> of choosing one of the two steps.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9628) Documentation fix for setting credentials in a client

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424560#comment-17424560
 ] 

ASF subversion and git services commented on GEODE-9628:


Commit 091ee88b4f1bd032ea5b1198a9bc3c73915d5bf9 in geode's branch 
refs/heads/develop from Dave Barnes
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=091ee88 ]

GEODE-9628: Documentation fix for setting credentials in client (#6928)

* GEODE-9568 Documentation fix for setting credentials in client (format & 
style)

> Documentation fix for setting credentials in a client
> -
>
> Key: GEODE-9628
> URL: https://issues.apache.org/jira/browse/GEODE-9628
> Project: Geode
>  Issue Type: Bug
>  Components: docs
>Reporter: Nabarun Nag
>Assignee: Dave Barnes
>Priority: Major
>  Labels: pull-request-available
>
> Current:
> {noformat}
> How a Client Cache Sets Its CredentialIn order to connect with a locator or a 
> server that does authentication, a client will need to set its credential, 
> composed of the two properties security-username and security-password. 
> Choose one of these two ways to accomplish this:{noformat}
> The last line needs to be changed to state that both steps are required, instead 
> of choosing one of the two steps.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9628) Documentation fix for setting credentials in a client

2021-10-05 Thread Dave Barnes (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Barnes reassigned GEODE-9628:
--

Assignee: Dave Barnes  (was: Nabarun Nag)

> Documentation fix for setting credentials in a client
> -
>
> Key: GEODE-9628
> URL: https://issues.apache.org/jira/browse/GEODE-9628
> Project: Geode
>  Issue Type: Bug
>  Components: docs
>Reporter: Nabarun Nag
>Assignee: Dave Barnes
>Priority: Major
>  Labels: pull-request-available
>
> Current:
> {noformat}
> How a Client Cache Sets Its CredentialIn order to connect with a locator or a 
> server that does authentication, a client will need to set its credential, 
> composed of the two properties security-username and security-password. 
> Choose one of these two ways to accomplish this:{noformat}
> The last line needs to be changed to state that both steps are required, instead 
> of choosing one of the two steps.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9568) Geode User Guide build script: allow doc to be retained

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424561#comment-17424561
 ] 

ASF subversion and git services commented on GEODE-9568:


Commit 091ee88b4f1bd032ea5b1198a9bc3c73915d5bf9 in geode's branch 
refs/heads/develop from Dave Barnes
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=091ee88 ]

GEODE-9628: Documentation fix for setting credentials in client (#6928)

* GEODE-9568 Documentation fix for setting credentials in client (format & 
style)

> Geode User Guide build script: allow doc to be retained
> ---
>
> Key: GEODE-9568
> URL: https://issues.apache.org/jira/browse/GEODE-9568
> Project: Geode
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 1.13.4
>Reporter: Dave Barnes
>Assignee: Alberto Bustamante Reyes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.1, 1.15.0
>
>
> The user guide build script allows the user to build and preview the user 
> guide.
> When the script completes, it always deletes the user guide.
> Allow the user to keep the compiled guide.
> Relevant files:
> dev-tools/docker/docs/preview-user-guide.sh
> geode-book/README.md



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9587) **MSETNX** command supported

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424526#comment-17424526
 ] 

ASF subversion and git services commented on GEODE-9587:


Commit 4e85e2966d1a6c7270319a24667a8948423a in geode's branch 
refs/heads/develop from Eric Zoerner
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=4e85e29 ]

GEODE-9587: Support MSETNX command (#6925)



> **MSETNX** command supported
> 
>
> Key: GEODE-9587
> URL: https://issues.apache.org/jira/browse/GEODE-9587
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Eric Zoerner
>Assignee: Eric Zoerner
>Priority: Major
>  Labels: pull-request-available
>
> Write unit/integration tests that run against both Geode Redis and native 
> Redis, and dunit tests which test multiple concurrent clients accessing 
> different servers for the following command:
> - MSETNX
> **A.C.**
>  - Unit/integration tests for both Geode and native Redis passing
>  - DUNIT tests passing
>  - README/redis_api_for_geode.html.md.erb updated to make command "supported"
> *or*
> - Stories in the backlog to fix the identified issues (with JIRA tickets) 
> and problem tests ignored
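
The core MSETNX semantics such tests need to assert, as a sketch (the host/port and the
plain Jedis setup are assumptions, not the actual dunit scaffolding):
{code:java}
import static org.assertj.core.api.Assertions.assertThat;
import redis.clients.jedis.Jedis;

// Sketch: MSETNX sets all keys only if none of them already exist.
try (Jedis jedis = new Jedis("localhost", 6379)) {
  // No key exists yet -> every key is set and the command returns 1.
  assertThat(jedis.msetnx("k1", "v1", "k2", "v2")).isEqualTo(1L);

  // Any existing key -> nothing is set and the command returns 0.
  assertThat(jedis.msetnx("k2", "other", "k3", "v3")).isEqualTo(0L);
  assertThat(jedis.exists("k3")).isFalse();
}
{code}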



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9673) CI Failure: AuthExpirationDUnitTest > newClient_registeredInterest_slowReAuth_policyNone_durableClient

2021-10-05 Thread Alexander Murmann (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-9673:
-
Labels:   (was: needsTriage)

> CI Failure: AuthExpirationDUnitTest > 
> newClient_registeredInterest_slowReAuth_policyNone_durableClient
> --
>
> Key: GEODE-9673
> URL: https://issues.apache.org/jira/browse/GEODE-9673
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Hale Bales
>Priority: Major
>
> AuthExpirationDUnitTest > 
> newClient_registeredInterest_slowReAuth_policyNone_durableClient failed with 
> the following stacktrace (abridged for readability):
> {code:java}
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.security.AuthExpirationDUnitTest$$Lambda$612/337713063.run 
> in VM 0 running on Host 
> heavy-lifter-2a774957-60f5-572d-bfa8-d0f3496af7cb.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:448)
> at 
> org.apache.geode.test.junit.rules.VMProvider.invoke(VMProvider.java:94)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.newClient_registeredInterest_slowReAuth_policyNone_durableClient(AuthExpirationDUnitTest.java:629)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.security.AuthExpirationDUnitTest that uses 
> org.apache.geode.cache.Region 
> Expected size: 100 but was: 88 in:
> ["key93",
> ...,
> "key55"] within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$newClient_registeredInterest_slowReAuth_policyNone_durableClient$bb17a952$2(AuthExpirationDUnitTest.java:631)
> Caused by:
> java.lang.AssertionError: 
> Expected size: 100 but was: 88 in:
> ["key93",
> ...,
> "key55"]
> at 
> org.apache.geode.security.AuthExpirationDUnitTest.lambda$null$17(AuthExpirationDUnitTest.java:632)
> {code}
> CI Failure: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/upgrade-test-openjdk8/builds/241#B



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-9623) Implement the Redis command COMMAND

2021-10-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424443#comment-17424443
 ] 

ASF subversion and git services commented on GEODE-9623:


Commit 3448e82703281bf5e7ad9d623aaf96d4ee44e879 in geode's branch 
refs/heads/develop from Jens Deppe
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=3448e82 ]

GEODE-9623: Unsupported commands should not be returned by COMMAND when not 
enabled (#6933)

- If the `enable-unsupported-commands` system property is not enabled
  then the COMMAND command should not return information about
  unsupported commands in its response.
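
A sketch of that filtering behavior (all names here are hypothetical, not the actual
geode-for-redis classes):
{code:java}
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Sketch: COMMAND output includes unsupported commands only when the
// enable-unsupported-commands property is set.
class CommandInfoFilter {
  static List<String> commandsToReport(List<String> allCommands,
      Predicate<String> isSupported, boolean unsupportedEnabled) {
    return allCommands.stream()
        .filter(name -> unsupportedEnabled || isSupported.test(name))
        .collect(Collectors.toList());
  }
}
{code}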


> Implement the Redis command COMMAND
> ---
>
> Key: GEODE-9623
> URL: https://issues.apache.org/jira/browse/GEODE-9623
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: Wayne
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Implement the Redis command named [COMMAND|https://redis.io/commands/command].
> +Acceptance Criteria+
>  
> The command has been implemented along with appropriate unit tests.
>  
> The command has been added to the AbstractHitsMissesIntegrationTest. The 
> command has been tested using the redis-cli tool and verified against native 
> redis.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)