date:20210304

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295701#comment-17295701
 ] 

ASF GitHub Bot commented on GEODE-8943:
---

lgtm-com[bot] commented on pull request #6075:
URL: https://github.com/apache/geode/pull/6075#issuecomment-791098154


   This pull request **introduces 2 alerts** when merging 
af725f5aada29bcd43097bff9289ff9d2d58ad48 into 
46f90a58c234fe0bc2719c9d9c9f8091ad460917 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-a6de1f5a98e30342dda34413f0da6f7ba832551e)
   
   **new alerts:**
   
   * 2 for Dereferenced variable may be null



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Filters (Interest and CQ) are processed multiple times with transactional 
> operation on PR
> -
>
> Key: GEODE-8943
> URL: https://issues.apache.org/jira/browse/GEODE-8943
> Project: Geode
>  Issue Type: Bug
>  Components: cq, regions
>Affects Versions: 1.14.0
>Reporter: Anilkumar Gingade
>Assignee: Eric Shu
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>
> The filters (interest and CQ) may be processed multiple times when a 
> transactional operation is performed.
> By design, on a partitioned region (PR) the filters are processed on the 
> primary bucket, where the data is applied or changed, and an adjunct message 
> listing the interested clients is sent to the remote servers (where the 
> subscription queues are hosted) and to the replicas. Filter processing 
> therefore happens once, and the message is sent to remote servers only if 
> the filters are satisfied. Currently, however, filter processing appears to 
> happen on both the primary and secondary buckets for TX operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8992) When a GatewaySenderEventImpl is serialized, its operationDetail field is not included



[ 
https://issues.apache.org/jira/browse/GEODE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295686#comment-17295686
 ] 

ASF subversion and git services commented on GEODE-8992:


Commit 9f94cb62f72cc4577509d0fe6fe1162a6c64eb67 in geode's branch 
refs/heads/feature/GEODE-8992 from Barry Oglesby
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=9f94cb6 ]

GEODE-8992: Added version check for events from remote sites


> When a GatewaySenderEventImpl is serialized, its operationDetail field is not 
> included
> --
>
> Key: GEODE-8992
> URL: https://issues.apache.org/jira/browse/GEODE-8992
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Barrett Oglesby
>Assignee: Barrett Oglesby
>Priority: Major
>  Labels: blocks-1.15.0, pull-request-available
>
> This causes the operation to become less specific when the 
> {{GatewaySenderEventImpl}} is deserialized.
> Here is an example.
> If the original {{GatewaySenderEventImpl}} is a *PUTALL_CREATE* like:
> {noformat}
> GatewaySenderEventImpl[id=EventID[id=31 
> bytes;threadID=0x10063|1;sequenceID=0;bucketId=99];action=0;operation=PUTALL_CREATE;region=/data;key=0;value=0;...]
> {noformat}
> Then, when the {{GatewaySenderEventImpl}} is serialized and deserialized, its 
> operation becomes a *CREATE*:
> {noformat}
> GatewaySenderEventImpl[id=EventID[id=31 
> bytes;threadID=0x10063|1;sequenceID=0;bucketId=99];action=0;operation=CREATE;region=/data;key=0;value=0;...]
> {noformat}
> That's because {{GatewaySenderEventImpl.getOperation}} uses both *action* and 
> *operationDetail* to determine its operation:
> {noformat}
> public Operation getOperation() {
>   Operation op = null;
>   switch (this.action) {
>     case CREATE_ACTION:
>       switch (this.operationDetail) {
>         case ...
>         case OP_DETAIL_PUTALL:
>           op = Operation.PUTALL_CREATE;
>           break;
>         default:
>           op = Operation.CREATE;
>           break;
>       }
>     ...
> {noformat}
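
The effect described above can be reproduced with a small stand-alone sketch. This is not Geode code: the `Event` class, the constant values, and the byte layout are all hypothetical stand-ins chosen for illustration. It only shows why an operation derived from both *action* and *operationDetail* falls back to the generic `CREATE` when the detail field is left out of the serialized form.

```java
import java.nio.ByteBuffer;

public class OperationDetailDemo {
  enum Operation { CREATE, PUTALL_CREATE }

  // Hypothetical constant values; the real ones are not shown in the ticket.
  static final int CREATE_ACTION = 0;
  static final int OP_DETAIL_NONE = 10;
  static final int OP_DETAIL_PUTALL = 13;

  /** Hypothetical stand-in for GatewaySenderEventImpl. */
  static final class Event {
    final int action;
    final int operationDetail;

    Event(int action, int operationDetail) {
      this.action = action;
      this.operationDetail = operationDetail;
    }

    /** Mirrors the quoted logic: the operation depends on action AND detail. */
    Operation getOperation() {
      Operation op = null;
      switch (action) {
        case CREATE_ACTION:
          switch (operationDetail) {
            case OP_DETAIL_PUTALL:
              op = Operation.PUTALL_CREATE;
              break;
            default:
              op = Operation.CREATE;
              break;
          }
          break;
        default:
          break;
      }
      return op;
    }

    /** Writes the action and, only if requested, the operation detail. */
    byte[] toBytes(boolean includeDetail) {
      ByteBuffer buf = ByteBuffer.allocate(includeDetail ? 8 : 4);
      buf.putInt(action);
      if (includeDetail) {
        buf.putInt(operationDetail);
      }
      return buf.array();
    }

    /** Reads the action; a missing detail falls back to OP_DETAIL_NONE. */
    static Event fromBytes(byte[] data) {
      ByteBuffer buf = ByteBuffer.wrap(data);
      int action = buf.getInt();
      int detail = buf.remaining() >= 4 ? buf.getInt() : OP_DETAIL_NONE;
      return new Event(action, detail);
    }
  }

  static Operation roundTrip(Event event, boolean includeDetail) {
    return Event.fromBytes(event.toBytes(includeDetail)).getOperation();
  }

  public static void main(String[] args) {
    Event original = new Event(CREATE_ACTION, OP_DETAIL_PUTALL);
    System.out.println(original.getOperation());     // PUTALL_CREATE
    System.out.println(roundTrip(original, false));  // CREATE (detail lost)
    System.out.println(roundTrip(original, true));   // PUTALL_CREATE
  }
}
```

In this sketch, including the detail field in the serialized form is enough to preserve the specific operation across a round trip.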




[jira] [Updated] (GEODE-8953) User Guide - re-introduce transaction details regarding non-transactional changes



 [ 
https://issues.apache.org/jira/browse/GEODE-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8953:
--
Labels: pull-request-available  (was: )

> User Guide - re-introduce transaction details regarding non-transactional 
> changes
> -
>
> Key: GEODE-8953
> URL: https://issues.apache.org/jira/browse/GEODE-8953
> Project: Geode
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 1.13.1
>Reporter: Dave Barnes
>Assignee: Dave Barnes
>Priority: Major
>  Labels: pull-request-available
>
> Community member @alberto.gomez requests that we re-introduce some verbiage 
> that was deleted in the fix for "GEODE-5509: Rewrite the docs on transaction."
> The passage to re-instate is:
> "If other, non-transactional sources update the keys the transaction is 
> modifying, the changes may intermingle with this transaction’s changes. The 
> other sources can include distributions from remote members, loading 
> activities, and other direct cache modification calls from the same member. 
> When this happens, after your commit finishes, the cache state may not be 
> what you expected."
> [~eshu] concurs, providing the background explanation:
> To achieve the best performance, non-transactional operations do not acquire 
> the DLock used to check for conflicts in a transaction, so a transaction 
> cannot detect a conflict caused by a non-transactional operation. A user 
> application is expected to either always use transactions or never use them, 
> unless the user knows that certain regions or sets of entries will not be 
> modified by operations outside of a transaction.




[jira] [Commented] (GEODE-8953) User Guide - re-introduce transaction details regarding non-transactional changes




[ 
https://issues.apache.org/jira/browse/GEODE-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295672#comment-17295672
 ] 

ASF GitHub Bot commented on GEODE-8953:
---

davebarnes97 opened a new pull request #6095:
URL: https://github.com/apache/geode/pull/6095


   Restore some cautionary text that was removed during an earlier re-write of 
the transaction section.




[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295663#comment-17295663
 ] 

ASF subversion and git services commented on GEODE-8886:


Commit 46f90a58c234fe0bc2719c9d9c9f8091ad460917 in geode's branch 
refs/heads/develop from mhansonp
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=46f90a5 ]

GEODE-8886: upgradeTest sometimes fails with new test (#6091)



> When performing a rolling upgrade to 1.14 from an older version some messages 
> are not delivered
> ---
>
> Key: GEODE-8886
> URL: https://issues.apache.org/jira/browse/GEODE-8886
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> We are running a test that performs a rolling upgrade from an older 
> version to 1.14.0, and we are finding that, for whatever reason, 
> not all "updates" are being passed to all servers and clients while 
> using gateway senders. It is not clear that gateway senders are the cause; 
> at this point, it is just an interesting observation. 




[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295659#comment-17295659
 ] 

ASF GitHub Bot commented on GEODE-8886:
---

mhansonp commented on a change in pull request #6091:
URL: https://github.com/apache/geode/pull/6091#discussion_r587935287



##
File path: 
geode-wan/src/upgradeTest/java/org/apache/geode/cache/wan/WANRollingUpgradeVerifyGatewaySenderProfile.java
##
@@ -23,23 +23,25 @@
 import org.apache.geode.distributed.internal.InternalLocator;
 import org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue;
 import org.apache.geode.test.dunit.DistributedTestUtils;
+import org.apache.geode.test.dunit.Host;
 import org.apache.geode.test.dunit.IgnoredException;
+import org.apache.geode.test.dunit.NetworkUtils;
 import org.apache.geode.test.dunit.VM;
 import org.apache.geode.test.version.VersionManager;
 
 public class WANRollingUpgradeVerifyGatewaySenderProfile extends 
WANRollingUpgradeDUnitTest {
   @Test
-
-  // This test verifies that a GatewaySenderProfile serializes properly 
between versions.
+  // Thigit s test verifies that a GatewaySenderProfile serializes properly 
between versions.

Review comment:
   Yup, but I am not inclined to fix it now. I will fix it when I check the 
test back in...






[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295662#comment-17295662
 ] 

ASF GitHub Bot commented on GEODE-8886:
---

mhansonp merged pull request #6091:
URL: https://github.com/apache/geode/pull/6091


   




[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295660#comment-17295660
 ] 

ASF GitHub Bot commented on GEODE-8886:
---

mhansonp commented on a change in pull request #6091:
URL: https://github.com/apache/geode/pull/6091#discussion_r587935362



##
File path: 
geode-wan/src/upgradeTest/java/org/apache/geode/cache/wan/WANRollingUpgradeVerifyGatewaySenderProfile.java
##
@@ -23,23 +23,25 @@
 import org.apache.geode.distributed.internal.InternalLocator;
 import org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue;
 import org.apache.geode.test.dunit.DistributedTestUtils;
+import org.apache.geode.test.dunit.Host;
 import org.apache.geode.test.dunit.IgnoredException;
+import org.apache.geode.test.dunit.NetworkUtils;
 import org.apache.geode.test.dunit.VM;
 import org.apache.geode.test.version.VersionManager;
 
 public class WANRollingUpgradeVerifyGatewaySenderProfile extends 
WANRollingUpgradeDUnitTest {
   @Test
-
-  // This test verifies that a GatewaySenderProfile serializes properly 
between versions.
+  // Thigit s test verifies that a GatewaySenderProfile serializes properly 
between versions.

Review comment:
   Hopefully that is ok.






[jira] [Updated] (GEODE-8998) setting thread-monitoring-enabled to false causes NullPointerException



 [ 
https://issues.apache.org/jira/browse/GEODE-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8998:

Labels: GeodeOperationAPI blocks-1.14.0 pull-request-available  (was: 
GeodeOperationAPI pull-request-available)

> setting thread-monitoring-enabled to false causes NullPointerException
> --
>
> Key: GEODE-8998
> URL: https://issues.apache.org/jira/browse/GEODE-8998
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Darrel Schneider
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.14.0, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> If you set the Geode property thread-monitoring-enabled to false, all 
> Geode cluster messaging breaks: as cluster messages are read, the p2p 
> reader thread throws a NullPointerException.
> This bug was introduced in GEODE-8521, so it has not yet been released.
> I have a test that reproduces the NPE, and the fix will be simple.




[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295637#comment-17295637
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

gesterzhou commented on a change in pull request #6093:
URL: https://github.com/apache/geode/pull/6093#discussion_r587909709



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);

Review comment:
   It's not an error. It's a race condition caused by a member that is 
temporarily out of sync. Geode can fix it automatically. We just don't want this 
out-of-sync member to be a GII provider candidate. 







> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295636#comment-17295636
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

gesterzhou commented on a change in pull request #6093:
URL: https://github.com/apache/geode/pull/6093#discussion_r587909709



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);

Review comment:
   It's not an error. It's a race condition caused by a member that is 
temporarily out of sync. Geode can fix it automatically. We just don't want this 
out-of-sync member to be a GII provider candidate. 






[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295635#comment-17295635
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

gesterzhou commented on a change in pull request #6093:
URL: https://github.com/apache/geode/pull/6093#discussion_r587909087



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);
+iterator.remove();

Review comment:
   It won't. This data structure is in the reply message; only the message 
processor uses it.
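
The change under review swaps a thrown exception for a warning plus `iterator.remove()`. A minimal sketch of that pattern follows, assuming a set of member names and a hypothetical map from member to the distributed-system id recorded in its persistent data. None of these names or types come from `PersistenceAdvisorImpl`. Removing through the iterator is what makes it safe to drop entries while the set is being traversed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

public class GiiCandidateFilter {

  /**
   * Drops every member whose recorded system id differs from the local one
   * and returns how many were dropped. All parameters are hypothetical
   * stand-ins for the state the real method inspects.
   */
  static int removeForeignMembers(Set<String> replicates,
      Map<String, String> systemIdByMember, String localSystemId) {
    int removed = 0;
    for (Iterator<String> iterator = replicates.iterator(); iterator.hasNext();) {
      String member = iterator.next();
      if (!localSystemId.equals(systemIdByMember.get(member))) {
        // Warn instead of throwing: the member is only skipped as a candidate.
        System.err.printf("Member %s was not part of distributed system %s%n",
            member, localSystemId);
        iterator.remove(); // safe removal during iteration
        removed++;
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    Set<String> replicates = new HashSet<>(Set.of("server1", "server2", "server3"));
    Map<String, String> ids = new HashMap<>();
    ids.put("server1", "ds-A");
    ids.put("server2", "ds-B"); // temporarily out of sync
    ids.put("server3", "ds-A");

    int removed = removeForeignMembers(replicates, ids, "ds-A");
    System.out.println(removed + " removed, candidates left: " + replicates.size());
  }
}
```

Calling `replicates.remove(member)` inside a for-each loop over the same set would instead risk a `ConcurrentModificationException`.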






[jira] [Commented] (GEODE-8971) Batches with incomplete transactions when stopping the gateway sender



[ 
https://issues.apache.org/jira/browse/GEODE-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295631#comment-17295631
 ] 

ASF GitHub Bot commented on GEODE-8971:
---

davebarnes97 commented on a change in pull request #6052:
URL: https://github.com/apache/geode/pull/6052#discussion_r587900508



##
File path: 
geode-core/src/main/java/org/apache/geode/cache/wan/GatewaySender.java
##
@@ -174,7 +174,32 @@
*/
   int GET_TRANSACTION_EVENTS_FROM_QUEUE_RETRIES =
   Integer.getInteger(GeodeGlossary.GEMFIRE_PREFIX + 
"get-transaction-events-from-queue-retries",
-  10);
+  2);
+  /**
+   * Milliseconds to wait before retrying to get events for a transaction from 
the
+   * gateway sender queue when group-transaction-events is true.
+   */
+  int GET_TRANSACTION_EVENTS_FROM_QUEUE_WAIT_TIME_MS =
+  Integer.getInteger(
+  GeodeGlossary.GEMFIRE_PREFIX + 
"get-transaction-events-from-queue-wait-time-ms",
+  1);
+
+  /**
+   * When group-transaction-events is set to true and the gateway sender is 
stopped,
+   * there is a possibility that the stopping occurs such that for a 
transaction,
+   * not all events belonging to it reach the queue. The reason would be that
+   * some reach the queue right before the sender is stopped and the rest do 
not make
+   * it to the queue because the sender is just stopped.
+   * In order to prevent that the queue contains incomplete transactions
+   * due to the above circumstance, this parameter allows for a grace period
+   * of the number of milliseconds set in it before the gateway sender is
+   * actually stopped, in which only events to complete transactions are put 
in the queue.
+   * Other events received in this period would be dropped.

Review comment:
   Suggested rewrite for economy of language:
   When group-transaction-events is true and the gateway sender is stopped,
   addition to the queue of a group of transaction events might be interrupted.
   To ensure that the queue does not contain incomplete transactions, this 
parameter
   allows for a grace period, specified in milliseconds, before the gateway 
sender is actually
   stopped, allowing complete transaction event groups to be queued. Any event 
received
   during the grace period that is not part of a transaction event group is 
dropped.
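
For readers unfamiliar with the `Integer.getInteger` idiom used in the diff above, here is a small sketch of how such a system-property-backed setting resolves. It assumes that `GeodeGlossary.GEMFIRE_PREFIX` expands to `"gemfire."`; the `SenderTuning` class and its method names are hypothetical. The defaults (2 retries, 1 ms wait) are the ones shown in the diff.

```java
public class SenderTuning {
  // Assumption: GeodeGlossary.GEMFIRE_PREFIX is "gemfire."
  static final String GEMFIRE_PREFIX = "gemfire.";

  static final String RETRIES_PROPERTY =
      GEMFIRE_PREFIX + "get-transaction-events-from-queue-retries";
  static final String WAIT_TIME_PROPERTY =
      GEMFIRE_PREFIX + "get-transaction-events-from-queue-wait-time-ms";

  /** Reads the retry count from the system property, defaulting to 2. */
  static int retries() {
    return Integer.getInteger(RETRIES_PROPERTY, 2);
  }

  /** Reads the wait time in milliseconds, defaulting to 1. */
  static int waitTimeMs() {
    return Integer.getInteger(WAIT_TIME_PROPERTY, 1);
  }

  public static void main(String[] args) {
    System.out.println("retries=" + retries() + ", waitTimeMs=" + waitTimeMs());
  }
}
```

Running with `-Dgemfire.get-transaction-events-from-queue-retries=5` would make `retries()` return 5 under this assumption. An unset or non-numeric value falls back to the default. Note that the diff assigns the values to interface constants, which are read once at class initialization, whereas the sketch reads the property on every call.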
   







> Batches with incomplete transactions when stopping the gateway sender
> -
>
> Key: GEODE-8971
> URL: https://issues.apache.org/jira/browse/GEODE-8971
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.14.0
>Reporter: Alberto Gomez
>Assignee: Alberto Gomez
>Priority: Major
>  Labels: pull-request-available
>
> When the gateway sender is stopped there is a high probability that batches 
> with incomplete transactions are sent even if group-transaction-events is 
> enabled.
> The reason is that once the stop command reaches the gateway sender, it 
> immediately stops queueing events, and this could happen in the middle of 
> receiving events for the same transaction. If this is the case, some events 
> for the transaction may have reached the queue right before the stop command 
> was received and the rest of events for that transaction would not make it to 
> the queue (they would be dropped) because they arrived right after the stop 
> command was received at the gateway sender.




[jira] [Commented] (GEODE-8971) Batches with incomplete transactions when stopping the gateway sender



[ 
https://issues.apache.org/jira/browse/GEODE-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295630#comment-17295630
 ] 

ASF GitHub Bot commented on GEODE-8971:
---

davebarnes97 commented on a change in pull request #6052:
URL: https://github.com/apache/geode/pull/6052#discussion_r587900508



##
File path: 
geode-core/src/main/java/org/apache/geode/cache/wan/GatewaySender.java
##
@@ -174,7 +174,32 @@
*/
   int GET_TRANSACTION_EVENTS_FROM_QUEUE_RETRIES =
   Integer.getInteger(GeodeGlossary.GEMFIRE_PREFIX + 
"get-transaction-events-from-queue-retries",
-  10);
+  2);
+  /**
+   * Milliseconds to wait before retrying to get events for a transaction from 
the
+   * gateway sender queue when group-transaction-events is true.
+   */
+  int GET_TRANSACTION_EVENTS_FROM_QUEUE_WAIT_TIME_MS =
+  Integer.getInteger(
+  GeodeGlossary.GEMFIRE_PREFIX + 
"get-transaction-events-from-queue-wait-time-ms",
+  1);
+
+  /**
+   * When group-transaction-events is set to true and the gateway sender is 
stopped,
+   * there is a possibility that the stopping occurs such that for a 
transaction,
+   * not all events belonging to it reach the queue. The reason would be that
+   * some reach the queue right before the sender is stopped and the rest do 
not make
+   * it to the queue because the sender is just stopped.
+   * In order to prevent that the queue contains incomplete transactions
+   * due to the above circumstance, this parameter allows for a grace period
+   * of the number of milliseconds set in it before the gateway sender is
+   * actually stopped, in which only events to complete transactions are put 
in the queue.
+   * Other events received in this period would be dropped.

Review comment:
   Suggested rewrite for economy of language:
   When group-transaction-events is true and the gateway sender is stopped,
   addition to the queue of a group of transaction events might be interrupted.
   To ensure that the queue does not contain incomplete transactions, this 
parameter
   allows for a grace period, specified in milliseconds, before the gateway 
sender is actually
   stopped, allowing complete transaction event groups to be queued. Any event 
received
   during the grace period that is not part of a transaction event group is 
dropped.
   






[jira] [Updated] (GEODE-8761) Add ServerConnection threads to ThreadMonitoring Service



 [ 
https://issues.apache.org/jira/browse/GEODE-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8761:
--
Labels: GeodeOperationAPI pull-request-available  (was: GeodeOperationAPI)

> Add ServerConnection threads to ThreadMonitoring Service
> 
>
> Key: GEODE-8761
> URL: https://issues.apache.org/jira/browse/GEODE-8761
> Project: Geode
>  Issue Type: Improvement
>  Components: regions
>Affects Versions: 1.14.0
>Reporter: Anilkumar Gingade
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
>
> In Geode, the Thread Monitoring Service (TMS) monitors a thread to 
> see whether it is stuck in a particular operation; it provides a thread dump 
> of the stuck thread after a configured time. This ticket adds ServerConnection 
> threads to the set monitored by TMS.




[jira] [Commented] (GEODE-8761) Add ServerConnection threads to ThreadMonitoring Service



[ 
https://issues.apache.org/jira/browse/GEODE-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295629#comment-17295629
 ] 

ASF GitHub Bot commented on GEODE-8761:
---

dschneider-pivotal opened a new pull request #6094:
URL: https://github.com/apache/geode/pull/6094


   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in 
the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build 
issues and
   submit an update to your PR as soon as possible. If you need help, please 
send an
   email to d...@geode.apache.org.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add ServerConnection threads to ThreadMonitoring Service
> 
>
> Key: GEODE-8761
> URL: https://issues.apache.org/jira/browse/GEODE-8761
> Project: Geode
>  Issue Type: Improvement
>  Components: regions
>Affects Versions: 1.14.0
>Reporter: Anilkumar Gingade
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI
>
> In Geode, the Thread Monitoring Service (TMS) monitors a thread to 
> see whether it is stuck doing a particular operation; it provides a thread dump 
> of the stuck thread after a configured time. This ticket is to add ServerConnection 
> threads to the set monitored by TMS.




[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295621#comment-17295621
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

jchen21 commented on a change in pull request #6093:
URL: https://github.com/apache/geode/pull/6093#discussion_r587885015



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);
+iterator.remove();

Review comment:
   Is synchronization needed here? The `stateOnPeers` may be shared with 
other threads.
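
To make the reviewer's concern concrete: if a plain collection is shared between threads, removing from it through an iterator is only safe when every access holds the same lock. The sketch below shows that pattern under stated assumptions. The class `PeerCandidates`, its lock, and the `systemId:member` naming scheme are hypothetical and are not taken from `PersistenceAdvisorImpl`.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical sketch; names are not taken from PersistenceAdvisorImpl.
class PeerCandidates {
  private final Object lock = new Object();
  private final Set<String> candidates = new HashSet<>();

  void add(String member) {
    synchronized (lock) {
      candidates.add(member);
    }
  }

  // Removes every candidate that belongs to a different distributed system.
  // Iteration and removal happen under the same lock used by add() and snapshot().
  int removeForeignMembers(String localSystemId) {
    int removed = 0;
    synchronized (lock) {
      Iterator<String> iterator = candidates.iterator();
      while (iterator.hasNext()) {
        String member = iterator.next();
        if (!member.startsWith(localSystemId + ":")) {
          iterator.remove();
          removed++;
        }
      }
    }
    return removed;
  }

  // Returns a copy so callers never iterate the shared set without the lock.
  Set<String> snapshot() {
    synchronized (lock) {
      return new HashSet<>(candidates);
    }
  }

  public static void main(String[] args) {
    PeerCandidates peers = new PeerCandidates();
    peers.add("sysA:member1");
    peers.add("sysB:member2");
    System.out.println("removed " + peers.removeForeignMembers("sysA"));
    System.out.println("remaining " + peers.snapshot());
  }
}
```

Whether the real `stateOnPeers` needs this depends on how it is created and shared, which the diff excerpt does not show.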







> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295618#comment-17295618
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

jchen21 commented on a change in pull request #6093:
URL: https://github.com/apache/geode/pull/6093#discussion_r587885015



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);
+iterator.remove();

Review comment:
   Is synchronization needed here? The `stateOnPeers` may be shared with 
other threads.

##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/persistence/PersistenceAdvisorImpl.java
##
@@ -533,7 +535,8 @@ public boolean 
checkMyStateOnMembers(Set replicates)
 String message = String.format(
 "Region %s remote member %s with persistent data %s was not part 
of the same distributed system as the local data from %s",
 regionPath, member, remoteId, myId);
-throw new ConflictingPersistentDataException(message);
+logger.warn(message);

Review comment:
   How about `logger.error(message)`?







> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295615#comment-17295615
 ] 

ASF subversion and git services commented on GEODE-9003:


Commit fd67865667f24f154c758549056e3815203a66ae in geode's branch 
refs/heads/feature/GEODE-9003 from zhouxh
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=fd67865 ]

GEODE-9003: Remove the member from replicates as GII candidate if it's not part 
of the same distributed system


> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Updated] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system



 [ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9003:
--
Labels: pull-request-available  (was: )

> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Commented] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-9003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295614#comment-17295614
 ] 

ASF GitHub Bot commented on GEODE-9003:
---

gesterzhou opened a new pull request #6093:
URL: https://github.com/apache/geode/pull/6093


   …s not part of the same distributed system
   
   





> Remove the member from replicates as GII candidate if it's not part of the 
> same distributed system
> --
>
> Key: GEODE-9003
> URL: https://issues.apache.org/jira/browse/GEODE-9003
> Project: Geode
>  Issue Type: Bug
>Reporter: Xiaojian Zhou
>Priority: Major
>





[jira] [Created] (GEODE-9003) Remove the member from replicates as GII candidate if it's not part of the same distributed system

2021-03-04 Thread Xiaojian Zhou (Jira)

Xiaojian Zhou created GEODE-9003:


 Summary: Remove the member from replicates as GII candidate if 
it's not part of the same distributed system
 Key: GEODE-9003
 URL: https://issues.apache.org/jira/browse/GEODE-9003
 Project: Geode
  Issue Type: Bug
Reporter: Xiaojian Zhou







[jira] [Commented] (GEODE-8994) Log line for EventId ctor potentially contains garbage string



[ 
https://issues.apache.org/jira/browse/GEODE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295610#comment-17295610
 ] 

ASF subversion and git services commented on GEODE-8994:


Commit 6f82926ea4f9fb1ede66b8c67e911a1c1c66aeb3 in geode-native's branch 
refs/heads/develop from Blake Bender
[ https://gitbox.apache.org/repos/asf?p=geode-native.git;h=6f82926 ]

GEODE-8994: Fix log line in EventId ctor (#756)

- memId isn't a string, need to convert it first or we'll log garbage.

> Log line for EventId ctor potentially contains garbage string
> -
>
> Key: GEODE-8994
> URL: https://issues.apache.org/jira/browse/GEODE-8994
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Reporter: Blake Bender
>Assignee: Blake Bender
>Priority: Major
>  Labels: pull-request-available
>
> The following logging was recently added to geode-native for debugging:
> ```
> LOGDEBUG("EventId::EventId(%p) - memId=%s, memIdLen=%d, thr=%" PRId64
>  ", seq=%" PRId64, this, memId, memIdLen, thr, seq);
> ```
> The variable `memId` in this case is of type `char*` but is NOT a valid 
> string, rather just a byte buffer.  Logging it this way can potentially print 
> out garbage, causing errors in parsing tools like gnmsg or other apps when 
> attempting to read it.  `memId` needs to be decoded into a string 
> representation of the bytes, and that logged, rather than attempting to print 
> it out raw.
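
The decoding step the ticket asks for can be sketched as follows. geode-native is C++, so this Java version only illustrates the idea of turning a raw byte buffer into a printable hex string before logging it; the class `ByteLogging` and method `toHex` are hypothetical and are not part of either code base.

```java
// Hypothetical helper, shown in Java for illustration only; geode-native itself is C++.
class ByteLogging {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  // Encodes the first `length` bytes of the buffer as two hex digits per byte.
  static String toHex(byte[] buffer, int length) {
    StringBuilder sb = new StringBuilder(length * 2);
    for (int i = 0; i < length; i++) {
      int value = buffer[i] & 0xff; // treat the byte as unsigned
      sb.append(HEX[value >>> 4]).append(HEX[value & 0x0f]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] memId = {0x00, 0x1f, (byte) 0xff, 0x41};
    // Log the decoded form instead of passing the raw bytes to a %s format.
    System.out.println(
        String.format("memId=%s, memIdLen=%d", toHex(memId, memId.length), memId.length));
    // prints "memId=001fff41, memIdLen=4"
  }
}
```

Hex output contains only printable ASCII characters, so embedded zero bytes or non-text bytes can no longer confuse tools that parse the log.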




[jira] [Commented] (GEODE-8994) Log line for EventId ctor potentially contains garbage string



[ 
https://issues.apache.org/jira/browse/GEODE-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295611#comment-17295611
 ] 

ASF GitHub Bot commented on GEODE-8994:
---

pdxcodemonkey merged pull request #756:
URL: https://github.com/apache/geode-native/pull/756


   





> Log line for EventId ctor potentially contains garbage string
> -
>
> Key: GEODE-8994
> URL: https://issues.apache.org/jira/browse/GEODE-8994
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Reporter: Blake Bender
>Assignee: Blake Bender
>Priority: Major
>  Labels: pull-request-available
>
> The following logging was recently added to geode-native for debugging:
> ```
> LOGDEBUG("EventId::EventId(%p) - memId=%s, memIdLen=%d, thr=%" PRId64
>  ", seq=%" PRId64, this, memId, memIdLen, thr, seq);
> ```
> The variable `memId` in this case is of type `char*` but is NOT a valid 
> string, rather just a byte buffer.  Logging it this way can potentially print 
> out garbage, causing errors in parsing tools like gnmsg or other apps when 
> attempting to read it.  `memId` needs to be decoded into a string 
> representation of the bytes, and that logged, rather than attempting to print 
> it out raw.




[jira] [Assigned] (GEODE-9000) NPE During Reconnect After Network Split

2021-03-04 Thread Ernest Burghardt (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ernest Burghardt reassigned GEODE-9000:
---

Assignee: Ernest Burghardt

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Assignee: Ernest Burghardt
>Priority: Major
>  Labels: blocks-1.14.0
>
> During a full network split in which all members are shut down by a partition, one 
> of the servers continually fails to reconnect due to a 
> {{NullPointerException}}. When using persistent regions, this also prevents 
> the remaining members from starting up correctly, as they might be waiting for 
> the stuck member to recover the latest data.
> The issue was introduced by the fix for GEODE-8901: the new 
> implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} doesn't 
> have a {{currentView}} installed during the reconnect phase ({{getView() == 
> null}}), and the following is shown in the logs:
> {noformat}
> [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected exception while booting membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2315)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1239)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$forceDisconnect$0(GMSMembership.java:1951)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> [error 2021/03/04 03:32:02.747 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected problem starting up membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
>

[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295605#comment-17295605
 ] 

ASF GitHub Bot commented on GEODE-8886:
---

gesterzhou commented on a change in pull request #6091:
URL: https://github.com/apache/geode/pull/6091#discussion_r587862139



##
File path: 
geode-wan/src/upgradeTest/java/org/apache/geode/cache/wan/WANRollingUpgradeVerifyGatewaySenderProfile.java
##
@@ -23,23 +23,25 @@
 import org.apache.geode.distributed.internal.InternalLocator;
 import org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue;
 import org.apache.geode.test.dunit.DistributedTestUtils;
+import org.apache.geode.test.dunit.Host;
 import org.apache.geode.test.dunit.IgnoredException;
+import org.apache.geode.test.dunit.NetworkUtils;
 import org.apache.geode.test.dunit.VM;
 import org.apache.geode.test.version.VersionManager;
 
 public class WANRollingUpgradeVerifyGatewaySenderProfile extends 
WANRollingUpgradeDUnitTest {
   @Test
-
-  // This test verifies that a GatewaySenderProfile serializes properly 
between versions.
+  // Thigit s test verifies that a GatewaySenderProfile serializes properly 
between versions.

Review comment:
   typo?







> When performing a rolling upgrade to 1.14 from an older version some messages 
> are not delivered
> ---
>
> Key: GEODE-8886
> URL: https://issues.apache.org/jira/browse/GEODE-8886
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> We are running a test that performs a rolling upgrade from an older 
> version to 1.14.0, and we are finding that, for reasons not yet understood, 
> not all "updates" are passed to all servers and clients while 
> using gateway senders. It is not clear that gateway senders are the cause; 
> at this point, it is just an observation worth noting. 




[jira] [Commented] (GEODE-8998) setting thread-monitoring-enabled to false causes NullPointerException



[ 
https://issues.apache.org/jira/browse/GEODE-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295603#comment-17295603
 ] 

ASF subversion and git services commented on GEODE-8998:


Commit 547f8971cabaf965615312e90f1ba5420061eb28 in geode's branch 
refs/heads/support/1.14 from Darrel Schneider
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=547f897 ]

GEODE-8998: fix NPE caused by thread-monitor-enabled=false (#6083) (#6092)

If thread monitoring is disabled, the Connection class will now get a 
DummyExecutor instead of null, thus preventing the NPE.

(cherry picked from commit c650095d74ca0b88a33a1089a0e0caa331b42ea0)
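
The commit message describes a null-object approach: callers receive a do-nothing implementation instead of null. A minimal sketch of that pattern is shown below. All names (`MonitorFactory`, `OperationMonitor`, `CountingMonitor`, `NoOpMonitor`) are hypothetical and do not mirror Geode's executor classes.

```java
// Hypothetical names throughout; this does not mirror Geode's monitoring classes.
class MonitorFactory {
  // Never returns null, so callers need no null check before begin()/end().
  static OperationMonitor create(boolean monitoringEnabled) {
    return monitoringEnabled ? new CountingMonitor() : new NoOpMonitor();
  }

  public static void main(String[] args) {
    OperationMonitor monitor = create(false);
    monitor.begin(); // safe even though monitoring is disabled
    monitor.end();
    System.out.println("monitor type: " + monitor.getClass().getSimpleName());
  }
}

interface OperationMonitor {
  void begin();

  void end();
}

// Used when monitoring is enabled; here it only counts active operations.
class CountingMonitor implements OperationMonitor {
  private int active;

  public synchronized void begin() {
    active++;
  }

  public synchronized void end() {
    active--;
  }

  synchronized int activeCount() {
    return active;
  }
}

// Null object used when monitoring is disabled: every call is a safe no-op.
class NoOpMonitor implements OperationMonitor {
  public void begin() {}

  public void end() {}
}
```

The benefit of this design is that the disabled case is handled once, at creation time, rather than by null checks at every call site.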

> setting thread-monitoring-enabled to false causes NullPointerException
> --
>
> Key: GEODE-8998
> URL: https://issues.apache.org/jira/browse/GEODE-8998
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Darrel Schneider
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> If you set the geode property thread-monitoring-enabled to false then any 
> geode cluster messaging is broken. As cluster messages are read the p2p 
> reader thread throws a NullPointerException.
> This bug was introduced in GEODE-8521 so it has not yet been released.
> I have a test that reproduces the NPE and this fix will be simple.




[jira] [Updated] (GEODE-8998) setting thread-monitoring-enabled to false causes NullPointerException

2021-03-04 Thread Darrel Schneider (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darrel Schneider updated GEODE-8998:

Fix Version/s: 1.14.0

> setting thread-monitoring-enabled to false causes NullPointerException
> --
>
> Key: GEODE-8998
> URL: https://issues.apache.org/jira/browse/GEODE-8998
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Darrel Schneider
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> If you set the geode property thread-monitoring-enabled to false then any 
> geode cluster messaging is broken. As cluster messages are read the p2p 
> reader thread throws a NullPointerException.
> This bug was introduced in GEODE-8521 so it has not yet been released.
> I have a test that reproduces the NPE and this fix will be simple.




[jira] [Commented] (GEODE-8998) setting thread-monitoring-enabled to false causes NullPointerException



[ 
https://issues.apache.org/jira/browse/GEODE-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295602#comment-17295602
 ] 

ASF GitHub Bot commented on GEODE-8998:
---

dschneider-pivotal merged pull request #6092:
URL: https://github.com/apache/geode/pull/6092


   





> setting thread-monitoring-enabled to false causes NullPointerException
> --
>
> Key: GEODE-8998
> URL: https://issues.apache.org/jira/browse/GEODE-8998
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Darrel Schneider
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> If you set the geode property thread-monitoring-enabled to false then any 
> geode cluster messaging is broken. As cluster messages are read the p2p 
> reader thread throws a NullPointerException.
> This bug was introduced in GEODE-8521 so it has not yet been released.
> I have a test that reproduces the NPE and this fix will be simple.




[jira] [Commented] (GEODE-7309) Upgrade Lucene from 6.6.2 to 8.2.0



[ 
https://issues.apache.org/jira/browse/GEODE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295590#comment-17295590
 ] 

ASF GitHub Bot commented on GEODE-7309:
---

mkevo commented on a change in pull request #6076:
URL: https://github.com/apache/geode/pull/6076#discussion_r587841824



##
File path: 
geode-core/src/main/java/org/apache/geode/internal/cache/InternalCacheForClientAccess.java
##
@@ -1225,6 +1226,14 @@ public void unlockDiskStore(String diskStoreName) {
 
   }
 
+  @Override
+  public boolean hasMemberOlderThan(KnownVersion version) {
+return getMembers().stream()

Review comment:
   Thanks for the suggestion! 
   As both `GemFireCacheImpl` and `InternalCacheForClientAccess` implement 
`InternalCache`, I think that we need to add a method body in both classes. 
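
The diff excerpt above is cut off after `getMembers().stream()`, so the PR's actual method body is not visible here. Purely as an assumption about what a check of this kind might look like, the sketch below tests whether any member reports a version older than a given one. `MemberVersionCheck`, `Member`, and the integer version ordinal are hypothetical stand-ins, not Geode's `KnownVersion` or member types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; these are not Geode's KnownVersion or member classes.
class MemberVersionCheck {
  static final class Member {
    final String name;
    final int versionOrdinal;

    Member(String name, int versionOrdinal) {
      this.name = name;
      this.versionOrdinal = versionOrdinal;
    }
  }

  // Returns true if at least one member runs a version older than the given ordinal.
  static boolean hasMemberOlderThan(List<Member> members, int versionOrdinal) {
    return members.stream().anyMatch(member -> member.versionOrdinal < versionOrdinal);
  }

  public static void main(String[] args) {
    List<Member> members = Arrays.asList(new Member("server-1", 100), new Member("server-2", 120));
    System.out.println(hasMemberOlderThan(members, 110)); // prints "true"
    System.out.println(hasMemberOlderThan(members, 100)); // prints "false"
  }
}
```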
   

##
File path: 
geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/RollingUpgradeQueryReturnsCorrectResultsAfterClientAndServersAreRolledOverAllBucketsCreated.java
##
@@ -31,6 +32,7 @@
 public class 
RollingUpgradeQueryReturnsCorrectResultsAfterClientAndServersAreRolledOverAllBucketsCreated
 extends LuceneSearchWithRollingUpgradeDUnit {
 
+  @Ignore

Review comment:
   After upgrading to Lucene 7.1.0, this test is no longer supported because the 
format changed between versions. In that case we add a note to the documentation 
stating that all cluster members must be on the same version in order to execute 
Lucene queries.







> Upgrade Lucene from 6.6.2 to 8.2.0
> --
>
> Key: GEODE-7309
> URL: https://issues.apache.org/jira/browse/GEODE-7309
> Project: Geode
>  Issue Type: Sub-task
>Reporter: Mario Kevo
>Assignee: Mario Kevo
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>





[jira] [Resolved] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest

2021-03-04 Thread Bruce J Schuchardt (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8979.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> CI Failure: SSLSocketHostNameVerificationIntegrationTest
> 
>
> Key: GEODE-8979
> URL: https://issues.apache.org/jira/browse/GEODE-8979
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> This test failed in a CI IntegrationTest run with this exception:
> {noformat}
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
> nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
> FAILED
> org.apache.geode.GemFireIOException: exception closing SSL session
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
> at 
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)
> Caused by:
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
> ... 1 more
> {noformat}
> It looks like the test needs to have a try/catch for IOException when closing 
> the NioSslEngine.
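
The suggested handling can be sketched generically as follows. This is not the test's actual fix: `QuietClose` and `closeIgnoringIOException` are hypothetical names, and the sketch uses the standard `java.io.Closeable` interface in place of `NioSslEngine`. Note that in the stack trace above the `IOException` surfaces wrapped in a `GemFireIOException`, so the real test may need to catch that type instead.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper; uses java.io.Closeable in place of NioSslEngine.
class QuietClose {
  // Closes the resource and reports whether the close completed without an IOException.
  static boolean closeIgnoringIOException(Closeable resource) {
    try {
      resource.close();
      return true;
    } catch (IOException e) {
      // The peer may already have reset the connection; for the test this is not a failure.
      System.out.println("Ignoring exception while closing: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    Closeable failing = () -> {
      throw new IOException("Connection reset by peer");
    };
    System.out.println(closeIgnoringIOException(failing)); // prints "false" after the message
  }
}
```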




[jira] [Commented] (GEODE-8991) Cached regions are not cleaned up whenever the connection to the cluster is lost



[ 
https://issues.apache.org/jira/browse/GEODE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295582#comment-17295582
 ] 

ASF GitHub Bot commented on GEODE-8991:
---

gaussianrecurrence opened a new pull request #757:
URL: https://github.com/apache/geode-native/pull/757


- Whenever subscription redundancy is lost cached regions are supposed
  to be cleaned up. This commit implements interest recovery to achieve
  precisely that.
- ITs implemented to verify the functionality.
   
   Labels: draft do-not-review





> Cached regions are not cleaned up whenever the connection to the cluster is 
> lost
> 
>
> Key: GEODE-8991
> URL: https://issues.apache.org/jira/browse/GEODE-8991
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Affects Versions: 1.15.0
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>
> Under a cluster restart scenario it can't be guaranteed that all the entries 
> that existed before still exist after the restart. Hence, if the geode-native client 
> has any cached region registered, it should be cleared after the connection to 
> the cluster is lost, in order to ensure cache consistency.
> This already happens in the Java client, as described in this part of 
> the documentation: 
> [https://geode.apache.org/docs/guide/12/developing/events/how_client_server_distribution_works.html#how_client_server_distribution_works__section_928BB60066414BEB9FAA7FB3120334A3]
> However, this is not the case for the native client
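
The behaviour requested in the ticket can be pictured with a small sketch: a client-side cache drops its local entries when it is told the connection to the cluster was lost, so stale data is never served afterwards. `LocalRegionCache` and `onConnectionLost` are hypothetical names for this illustration and are not part of the Java or native client APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side cache; names are illustrative, not a Geode client API.
class LocalRegionCache {
  private final Map<String, String> entries = new ConcurrentHashMap<>();

  void put(String key, String value) {
    entries.put(key, value);
  }

  String get(String key) {
    return entries.get(key);
  }

  int size() {
    return entries.size();
  }

  // Invoked when the client detects that its connection to the cluster is lost.
  // Entries cached before the loss may no longer exist on the servers, so drop them.
  void onConnectionLost() {
    entries.clear();
  }

  public static void main(String[] args) {
    LocalRegionCache cache = new LocalRegionCache();
    cache.put("order-1", "pending");
    cache.onConnectionLost();
    System.out.println("entries after connection loss: " + cache.size());
  }
}
```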




[jira] [Updated] (GEODE-8991) Cached regions are not cleaned up whenever the connection to the cluster is lost



 [ 
https://issues.apache.org/jira/browse/GEODE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8991:
--
Labels: pull-request-available  (was: )

> Cached regions are not cleaned up whenever the connection to the cluster is 
> lost
> 
>
> Key: GEODE-8991
> URL: https://issues.apache.org/jira/browse/GEODE-8991
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Affects Versions: 1.15.0
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>  Labels: pull-request-available
>
> Under a cluster restart scenario it can't be guaranteed that all the entries 
> that existed before still exist after the restart. Hence, if the geode-native client 
> has any cached region registered, it should be cleared after the connection to 
> the cluster is lost, in order to ensure cache consistency.
> This already happens in the Java client, as described in this part of 
> the documentation: 
> [https://geode.apache.org/docs/guide/12/developing/events/how_client_server_distribution_works.html#how_client_server_distribution_works__section_928BB60066414BEB9FAA7FB3120334A3]
> However, this is not the case for the native client




[jira] [Assigned] (GEODE-8991) Cached regions are not cleaned up whenever the connection to the cluster is lost

2021-03-04 Thread Mario Salazar de Torres (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mario Salazar de Torres reassigned GEODE-8991:
--

Assignee: Mario Salazar de Torres

> Cached regions are not cleaned up whenever the connection to the cluster is 
> lost
> 
>
> Key: GEODE-8991
> URL: https://issues.apache.org/jira/browse/GEODE-8991
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Affects Versions: 1.15.0
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>
> Under a cluster restart scenario it can't be guaranteed that all the entries 
> that existed before the restart still exist after it. Hence, if the 
> geode-native client has any cached regions registered, they should be cleared 
> after the connection to the cluster is lost, in order to ensure cache 
> consistency.
> This is what happens in the case of the Java client, as described in this 
> part of the documentation: 
> [https://geode.apache.org/docs/guide/12/developing/events/how_client_server_distribution_works.html#how_client_server_distribution_works__section_928BB60066414BEB9FAA7FB3120334A3]
> However, this is not the case for the native client.




[jira] [Updated] (GEODE-8671) Two threads calling get and retrieve the same PdxInstance, resulting in corruption



 [ 
https://issues.apache.org/jira/browse/GEODE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8671:

Fix Version/s: 1.12.2

> Two threads calling get and retrieve the same PdxInstance, resulting in 
> corruption
> --
>
> Key: GEODE-8671
> URL: https://issues.apache.org/jira/browse/GEODE-8671
> Project: Geode
>  Issue Type: Improvement
>  Components: regions
>Reporter: Dan Smith
>Assignee: Jianxia Chen
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.12.2, 1.13.2, 1.14.0, 1.15.0
>
>
> Even if copy-on-read is set to true, two threads calling get on a partitioned 
> region can end up with the same PdxInstance object.
> This is problematic because some PdxInstance methods are not thread safe. 
> Although the underlying bytes are immutable, the PdxInstance has a 
> ByteSource with a position field that changes. That means two threads doing 
> serialization or calling toString on the PdxInstance could result in one or 
> more threads getting a corrupt read.
> It looks like they are ending up with the same instance because of the 
> behavior in LocalRegion.optimizedGetObject. We use futures to make sure there 
> is only 1 get that goes through, and both threads receive the same value.
> 
> Ending up in optimizedGetObject requires a race with the put, because if the 
> value was in the cache at the beginning of the get it would be returned 
> earlier in the get process.
> I put a test that reproduces this issue here -   
> https://github.com/upthewaterspout/geode/pull/new/feature/pdx-instances-shared




[jira] [Commented] (GEODE-8872) Add client option, to request locators internal host addresses



[ 
https://issues.apache.org/jira/browse/GEODE-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295567#comment-17295567
 ] 

ASF GitHub Bot commented on GEODE-8872:
---

DonalEvans commented on a change in pull request #5948:
URL: https://github.com/apache/geode/pull/5948#discussion_r587691115



##
File path: 
geode-core/src/distributedTest/java/org/apache/geode/cache/client/internal/AutoConnectionSourceDUnitTest.java
##
@@ -453,6 +454,80 @@ public void testClientMembershipListener() {
 Assert.assertEquals(0, serverListener.getJoins());
   }
 
+
+  @Test
+  public void testClientGetsLocatorListwithExternalAddress() throws Exception {

Review comment:
   Typo here, this should be "ListWithExternal"

##
File path: 
geode-core/src/distributedTest/java/org/apache/geode/cache/client/internal/AutoConnectionSourceDUnitTest.java
##
@@ -64,6 +64,7 @@
   @Override
   public final void postSetUp() {
 addIgnoredException("NoAvailableLocatorsException");
+addIgnoredException("SocketException");

Review comment:
   Is this IgnoredException needed? When I comment out this line, all the 
tests in the class still pass. Also, if it's required for only one test case, 
it should be added in only that test case, to avoid masking failures in other 
tests.

##
File path: 
geode-core/src/integrationTest/java/org/apache/geode/cache/client/ClientCacheFactoryJUnitTest.java
##
@@ -460,4 +460,22 @@ public void 
configuringPdxDiskStoreThroughXMLShouldLogWarningMessage() throws IO
   .contains("PDX persistence is not supported on client 
side.")).isTrue();
 }
   }
+
+  @Test
+  public void 
testDefaultPoolRequestLocatorInternalAddressEnabled_defaultvalue() throws 
Exception {
+clientCache = new 
ClientCacheFactory().setRequestLocatorInternalAddressEnabled(false)
+.addPoolServer(InetAddress.getLocalHost().getHostName(), 
).create();

Review comment:
   Is there a reason this port number is being used? Might it be better to 
use `AvailablePortHelper.getRandomAvailableTCPPort()` here to prevent possible 
port collisions?
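
A minimal sketch of the idea behind this review comment, using only the JDK rather than Geode's `AvailablePortHelper` (whose internals are not shown in this thread). The class and method names below are hypothetical. Binding to port 0 asks the operating system for a free port, which avoids hard-coding a number that another test may already be using.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration only; this is not Geode's AvailablePortHelper.
public class EphemeralPortSketch {

  /** Asks the OS for a currently free TCP port by binding a socket to port 0. */
  static int findFreeTcpPort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      // The socket is closed when this block exits, which releases the port.
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("free port: " + findFreeTcpPort());
  }
}
```

Because the port is released before the caller uses it, another process could still claim it in between. This approach makes collisions much less likely but does not rule them out.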

##
File path: 
geode-core/src/integrationTest/java/org/apache/geode/cache/client/ClientCacheFactoryJUnitTest.java
##
@@ -460,4 +460,22 @@ public void 
configuringPdxDiskStoreThroughXMLShouldLogWarningMessage() throws IO
   .contains("PDX persistence is not supported on client 
side.")).isTrue();
 }
   }
+
+  @Test
+  public void 
testDefaultPoolRequestLocatorInternalAddressEnabled_defaultvalue() throws 
Exception {
+clientCache = new 
ClientCacheFactory().setRequestLocatorInternalAddressEnabled(false)
+.addPoolServer(InetAddress.getLocalHost().getHostName(), 
).create();
+Pool defaultPool = clientCache.getDefaultPool();
+assertThat(defaultPool.isRequestLocatorInternalAddressEnabled()).isFalse();
+  }
+
+  @Test
+  public void testDefaultPoolRequestLocatorInternalAddressEnabled() throws 
Exception {

Review comment:
   This test name could be more descriptive. Maybe something like 
"defaultPoolUsesValueOfRequestLocatorInternalAddressEnabledSetInClientCacheFactory"

##
File path: 
geode-core/src/distributedTest/java/org/apache/geode/cache/client/internal/AutoConnectionSourceDUnitTest.java
##
@@ -453,6 +454,80 @@ public void testClientMembershipListener() {
 Assert.assertEquals(0, serverListener.getJoins());
   }
 
+
+  @Test
+  public void testClientGetsLocatorListwithExternalAddress() throws Exception {
+final String hostName = getServerHostName();
+VM locator0VM = VM.getVM(0);
+VM locator1VM = VM.getVM(1);
+
+final int locator0Port =
+locator0VM.invoke("Start Locator1 ", () -> startLocator(hostName, "", 
"127.0.0.1"));
+final int locator1Port = locator1VM.invoke("Start Locator2 ",
+() -> startLocator(hostName, getLocatorString(hostName, locator0Port), 
"127.0.0.1"));
+assertThat(locator0Port).isGreaterThan(0);
+assertThat(locator1Port).isGreaterThan(0);
+
+startBridgeClient(null, hostName, locator0Port, false);
+InetSocketAddress locatorToWaitFor = new InetSocketAddress("127.0.0.1", 
locator1Port);
+MyLocatorCallback callback = (MyLocatorCallback) 
remoteObjects.get(CALLBACK_KEY);
+
+boolean discovered = callback.waitForDiscovery(locatorToWaitFor, MAX_WAIT);
+Assert.assertTrue(
+"Waited " + MAX_WAIT + " for " + locatorToWaitFor
++ " to be discovered on client. List is now: " + 
callback.getDiscovered(),
+discovered);
+
+InetSocketAddress[] initialLocators =
+new InetSocketAddress[] {new InetSocketAddress(hostName, 
locator0Port)};
+
+InetSocketAddress[] expectedLocators =
+new InetSocketAddress[] {new InetSocketAddress("127.0.0.1", 
locator0Port),
+new InetSocketAddress("127.0.0.1", locator1Port)};
+
+final Pool pool = PoolManager.find(POOL_NAME);
+

[jira] [Commented] (GEODE-8998) setting thread-monitoring-enabled to false causes NullPointerException

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295560#comment-17295560
 ] 

ASF GitHub Bot commented on GEODE-8998:
---

dschneider-pivotal opened a new pull request #6092:
URL: https://github.com/apache/geode/pull/6092


   If thread monitoring is disabled, the Connection class will now get a
   DummyExecutor instead of null, thus preventing the NPE.
   
   (cherry picked from commit c650095d74ca0b88a33a1089a0e0caa331b42ea0)
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in 
the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, you check Concourse for build
   issues and submit an update to your PR as soon as possible. If you need help,
   please send an email to d...@geode.apache.org.
   





> setting thread-monitoring-enabled to false causes NullPointerException
> --
>
> Key: GEODE-8998
> URL: https://issues.apache.org/jira/browse/GEODE-8998
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Darrel Schneider
>Assignee: Darrel Schneider
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> If you set the geode property thread-monitoring-enabled to false then any 
> geode cluster messaging is broken. As cluster messages are read the p2p 
> reader thread throws a NullPointerException.
> This bug was introduced in GEODE-8521 so it has not yet been released.
> I have a test that reproduces the NPE and this fix will be simple.
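
The fix described in the pull request (handing the Connection a DummyExecutor instead of null) is an instance of the null-object pattern. The sketch below illustrates that pattern only. The interface shape, the method names, and the `CountingExecutor` class are assumptions made for the example and are not taken from Geode's source; only the name `DummyExecutor` appears in the thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: these types are hypothetical and do not mirror Geode's classes.
public class NullObjectExecutorSketch {

  /** Hypothetical hook that a message reader calls around blocking work. */
  interface MonitorExecutor {
    void suspendMonitoring();
    void resumeMonitoring();
  }

  /** No-op implementation used when monitoring is disabled, instead of null. */
  static final class DummyExecutor implements MonitorExecutor {
    @Override public void suspendMonitoring() {}
    @Override public void resumeMonitoring() {}
  }

  /** Stand-in for a real monitoring implementation; it only counts calls. */
  static final class CountingExecutor implements MonitorExecutor {
    final AtomicInteger calls = new AtomicInteger();
    @Override public void suspendMonitoring() { calls.incrementAndGet(); }
    @Override public void resumeMonitoring() { calls.incrementAndGet(); }
  }

  /** Always returns a usable object, so callers never need a null check. */
  static MonitorExecutor createExecutor(boolean monitoringEnabled) {
    return monitoringEnabled ? new CountingExecutor() : new DummyExecutor();
  }

  /** Caller code is identical whether or not monitoring is enabled. */
  static String readMessage(MonitorExecutor executor) {
    executor.suspendMonitoring();
    try {
      return "message";
    } finally {
      executor.resumeMonitoring();
    }
  }

  public static void main(String[] args) {
    System.out.println(readMessage(createExecutor(false)));
  }
}
```

The design choice is to move the enabled/disabled decision into one factory method, so that every call site can invoke the hook unconditionally.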




[jira] [Commented] (GEODE-8671) Two threads calling get and retrieve the same PdxInstance, resulting in corruption



[ 
https://issues.apache.org/jira/browse/GEODE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295553#comment-17295553
 ] 

ASF subversion and git services commented on GEODE-8671:


Commit 6f5e9efb486b0f002b29b3ca40ec6441dfeed34a in geode's branch 
refs/heads/support/1.12 from Jianxia Chen
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=6f5e9ef ]

GEODE-8671: Two threads calling get and retrieve the same PdxInstance, 
resulting in corruption (#5925)

For PdxInstance, return a new reference in LocalRegion.optimizedGetObject(), 
instead of using the value in the Future. This is to avoid Pdx corruption when 
multiple threads share the same reference of PdxInstance.

(cherry picked from commit dabb610b74bb0b27603d7803ec3cdd1cbb16c43f)
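
To illustrate why sharing one reference is a problem, here is a small, deterministic sketch. It does not use Geode's PdxInstance or ByteSource; `CursorValue` is a hypothetical class whose reads advance an internal position, which is the property the issue description attributes to the ByteSource. Two callers that share one instance disturb each other's reads, while callers that each receive their own view over the same immutable bytes do not.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; CursorValue is not a Geode class.
public class SharedCursorSketch {

  /** Immutable bytes plus a mutable read position. */
  static final class CursorValue {
    private final ByteBuffer bytes;

    CursorValue(ByteBuffer bytes) {
      this.bytes = bytes;
    }

    /** Returns a new view over the same bytes with its own position, reset to 0. */
    CursorValue copy() {
      ByteBuffer view = bytes.duplicate();
      view.rewind();
      return new CursorValue(view);
    }

    /** Reads one byte and advances this instance's position. */
    char next() {
      return (char) bytes.get();
    }
  }

  /** Both "callers" use the same instance, so the second read is shifted. */
  static char[] firstReadsWhenShared(byte[] data) {
    CursorValue shared = new CursorValue(ByteBuffer.wrap(data));
    char callerA = shared.next();
    char callerB = shared.next();
    return new char[] {callerA, callerB};
  }

  /** Each "caller" gets its own view, so both start at the first byte. */
  static char[] firstReadsWithCopies(byte[] data) {
    CursorValue original = new CursorValue(ByteBuffer.wrap(data));
    char callerA = original.copy().next();
    char callerB = original.copy().next();
    return new char[] {callerA, callerB};
  }

  public static void main(String[] args) {
    byte[] data = "abcd".getBytes(StandardCharsets.US_ASCII);
    System.out.println(new String(firstReadsWhenShared(data)));
    System.out.println(new String(firstReadsWithCopies(data)));
  }
}
```

With real threads the interleaving is nondeterministic, which is why the sketch simulates the two callers sequentially.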


> Two threads calling get and retrieve the same PdxInstance, resulting in 
> corruption
> --
>
> Key: GEODE-8671
> URL: https://issues.apache.org/jira/browse/GEODE-8671
> Project: Geode
>  Issue Type: Improvement
>  Components: regions
>Reporter: Dan Smith
>Assignee: Jianxia Chen
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.13.2, 1.14.0, 1.15.0
>
>
> Even if copy-on-read is set to true, two threads calling get on a partitioned 
> region can end up with the same PdxInstance object.
> This is problematic because some PdxInstance methods are not thread safe. 
> Although the underlying bytes are immutable, the PdxInstance has a 
> ByteSource with a position field that changes. That means two threads doing 
> serialization or calling toString on the PdxInstance could result in one or 
> more threads getting a corrupt read.
> It looks like they are ending up with the same instance because of the 
> behavior in LocalRegion.optimizedGetObject. We use futures to make sure there 
> is only 1 get that goes through, and both threads receive the same value.
> 
> Ending up in optimizedGetObject requires a race with the put, because if the 
> value was in the cache at the beginning of the get it would be returned 
> earlier in the get process.
> I put a test that reproduces this issue here -   
> https://github.com/upthewaterspout/geode/pull/new/feature/pdx-instances-shared




[jira] [Commented] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295558#comment-17295558
 ] 

ASF GitHub Bot commented on GEODE-8979:
---

bschuchardt merged pull request #6079:
URL: https://github.com/apache/geode/pull/6079


   





> CI Failure: SSLSocketHostNameVerificationIntegrationTest
> 
>
> Key: GEODE-8979
> URL: https://issues.apache.org/jira/browse/GEODE-8979
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
>
> This test failed in a CI IntegrationTest run with this exception:
> {noformat}
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
> nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
> FAILED
> org.apache.geode.GemFireIOException: exception closing SSL session
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
> at 
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)
> Caused by:
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
> ... 1 more
> {noformat}
> It looks like the test needs to have a try/catch for IOException when closing 
> the NioSslEngine.




[jira] [Commented] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest



[ 
https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295559#comment-17295559
 ] 

ASF subversion and git services commented on GEODE-8979:


Commit 52e74112317bdc0b25904eba5b1308e0c1691306 in geode's branch 
refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=52e7411 ]

GEODE-8979: CI Failure: SSLSocketHostNameVerificationIntegrationTest (#6079)

This test was closing a client socket before ensuring that a thread it
had created was finished.  If the socket is closed quickly enough it
could cause that thread to get an IOException and cause the test to
fail.

The fix is to ensure that the thread is finished before closing the
client socket.
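
The ordering described in this commit message can be sketched with plain JDK sockets. This is not the code from SSLSocketHostNameVerificationIntegrationTest. The class name, the one-byte exchange, and the 10-second join timeout are assumptions chosen to keep the example small. The point is only that the client joins the server thread before its socket is closed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of "join the helper thread, then close the socket".
public class JoinBeforeCloseSketch {

  /** Runs one exchange and returns any failure seen by the server thread, or null. */
  static Throwable runOnce() {
    AtomicReference<Throwable> serverFailure = new AtomicReference<>();
    try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      Thread server = new Thread(() -> {
        try (Socket accepted = listener.accept()) {
          OutputStream out = accepted.getOutputStream();
          out.write(42);
          out.flush();
        } catch (IOException e) {
          // If the peer had closed early, the failure would be recorded here.
          serverFailure.set(e);
        }
      });
      server.start();

      try (Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {
        client.getInputStream().read();
        // Wait for the server thread to finish its work on the connection...
        server.join(10_000);
      } // ...and only then is the client socket closed.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return serverFailure.get();
  }

  public static void main(String[] args) {
    System.out.println("server failure: " + runOnce());
  }
}
```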

> CI Failure: SSLSocketHostNameVerificationIntegrationTest
> 
>
> Key: GEODE-8979
> URL: https://issues.apache.org/jira/browse/GEODE-8979
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
>
> This test failed in a CI IntegrationTest run with this exception:
> {noformat}
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
> nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
> FAILED
> org.apache.geode.GemFireIOException: exception closing SSL session
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
> at 
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)
> Caused by:
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
> ... 1 more
> {noformat}
> It looks like the test needs to have a try/catch for IOException when closing 
> the NioSslEngine.




[jira] [Resolved] (GEODE-8907) CI Failure: ClientServerTransactionCCEDUnitTest.testTxRemoveAll

2021-03-04 Thread Louis R. Jacome (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Louis R. Jacome resolved GEODE-8907.

Fix Version/s: 1.15.0
   Resolution: Fixed

> CI Failure: ClientServerTransactionCCEDUnitTest.testTxRemoveAll
> ---
>
> Key: GEODE-8907
> URL: https://issues.apache.org/jira/browse/GEODE-8907
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Louis R. Jacome
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/794]
>  reported a test failure 
>  
> {noformat}
> org.apache.geode.internal.cache.ClientServerTransactionCCEDUnitTest > 
> testTxRemoveAll FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.RemoteTransactionDUnitTest$34.call in VM 2 
> running on Host c44a375b0bbd with 4 VMs
> Caused by:
> java.lang.IllegalStateException: Thread does not have an active 
> transaction
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.RemoteTransactionDUnitTest$2.call in VM 0 
> running on Host c44a375b0bbd with 4 VMs
> Caused by:
> java.lang.AssertionError: Event never occurred after 3 ms:  
> {noformat}
>  
> Logs available at
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0655/test-results/distributedTest/1612232803/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0655/test-artifacts/1612232803/distributedtestfiles-OpenJDK8-1.14.0-build.0655.tgz
>  {noformat}




[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered

2021-03-04 Thread Geode Integration (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295545#comment-17295545
 ] 

Geode Integration commented on GEODE-8886:
--

Seen in [UpgradeTestOpenJDK8 
#46|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UpgradeTestOpenJDK8/builds/46]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0029/test-results/upgradeTest/1614826822/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.15.0-build.0029/test-artifacts/1614826822/upgradetestfiles-OpenJDK8-1.15.0-build.0029.tgz].

> When performing a rolling upgrade to 1.14 from an older version some messages 
> are not delivered
> ---
>
> Key: GEODE-8886
> URL: https://issues.apache.org/jira/browse/GEODE-8886
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> We are running a test that performs a rolling upgrade from an older version 
> to 1.14.0, and we are finding that, for whatever reason, not all "updates" 
> are being passed to all servers and clients while using gateway senders. It 
> is not clear that gateway senders are the cause; at this point, it is just an 
> interesting observation. 




[jira] [Commented] (GEODE-8886) When performing a rolling upgrade to 1.14 from an older version some messages are not delivered



[ 
https://issues.apache.org/jira/browse/GEODE-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295535#comment-17295535
 ] 

ASF GitHub Bot commented on GEODE-8886:
---

mhansonp opened a new pull request #6091:
URL: https://github.com/apache/geode/pull/6091


   Reverting my test out due to presumed interactions with tests in the test 
class. Working on fixing and re-adding at a later date





> When performing a rolling upgrade to 1.14 from an older version some messages 
> are not delivered
> ---
>
> Key: GEODE-8886
> URL: https://issues.apache.org/jira/browse/GEODE-8886
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> We are running a test that performs a rolling upgrade from an older version 
> to 1.14.0, and we are finding that, for whatever reason, not all "updates" 
> are being passed to all servers and clients while using gateway senders. It 
> is not clear that gateway senders are the cause; at this point, it is just an 
> interesting observation. 




[jira] [Updated] (GEODE-9002) Add Statistic for /proc/schedstat

2021-03-04 Thread Bill Burcham (Jira)

[
https://issues.apache.org/jira/browse/GEODE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bill Burcham updated GEODE-9002:

Description:
Linux performance icon Brendan Gregg advocates the
[USE|http://www.brendangregg.com/usemethod.html] method of performance
analysis: Utilization, Saturation, and Errors.

When it comes to CPU, Geode captures a number of _utilization_ statistics. Some
are direct like LinuxSystemStats cpuIdle and cpuActive. Others are indirect
like:
* DistributionStats
** heartbeatsSent: you may see a gap in the every-five-seconds heartbeats
* StatSampler
** delayDuration: you may see a rise when CPU is scarce
** sampleCount: you may see an interruption in the regular once-per-second
sampling
* (G1GC collector)
** (various memory utilization statistics may indicate memory pressure which
in turn can give rise to long GC pauses)
* LinuxSystemStats
** cpuSteal: indicating that the virtualization environment has not given the
VM its share of CPU

But utilization statistics alone can't tell you when a resource (like CPU) is
_saturated_, i.e. when demand is higher than the servicing ability. If you're
just looking at utilization metrics, then a saturated system might look a lot
like a system just below saturation. In order to tell the difference,
saturation metrics are needed.

In the case of CPU, there is a conceptual queue in front of each processor.
Tasks (operating system threads) that are ready to run, enter a queue, and
after some delay, are given a time slice by an actual physical CPU.

You might think that Geode's LinuxSystemStats loadAverage1 and 5 and 15 might
fit this bill. Those statistics do provide some saturation information. The
problem is, they conflate CPU with I/O and other things (see [Linux Load
Averages: Solving the
Mystery|http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html]).

A better, more specific measure of CPU saturation is available through
statistics exposed via the /proc/schedstat virtual file.

When this ticket is complete, there will be a new statistic type called
LinuxThreadScheduler, with four associated statistics gathered directly from
/proc/schedstat or derived from data gathered from it:
* runningTimeNanos: sum of all time spent running by tasks on this processor
in nanoseconds
* queuedTimeNanos: sum of all time spent waiting to run by tasks on this
processor in nanoseconds
* tasksScheduledCount: # of tasks (not necessarily unique) given to the
processor
* meanTaskQueuedTimeNanos: average time that a ready-to-run task waited for a
CPU, since the last sample, in nanoseconds

One "statistic" will be gathered for each CPU. So a Geode process running on a
two-CPU system will capture two statistics, called "cpu0", "cpu1", each of this
new type.

By default Geode will not gather these new statistics. A TBD Java system
property will be used to enable gathering the new LinuxThreadScheduler
statistic.

was:
Linux performance icon Brendan Gregg advocates the
[USE|http://www.brendangregg.com/usemethod.html] method of performance
analysis: Utilization Saturation and Errors.

When it comes to CPU, Geode captures a number of _utilization_ statistics. Some
are direct like LinuxSystemStats cpuIdle and cpuActive. Others are indirect
like:

A better, more specific measure of CPU saturation is available through
statistics exposed via the /proc/schedstat virtual file.

[jira] [Updated] (GEODE-9002) Add Statistic for /proc/schedstat



 [ 
https://issues.apache.org/jira/browse/GEODE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-9002:
--
Labels: pull-request-available  (was: )

> Add Statistic for /proc/schedstat
> -
>
> Key: GEODE-9002
> URL: https://issues.apache.org/jira/browse/GEODE-9002
> Project: Geode
>  Issue Type: New Feature
>  Components: statistics
>Reporter: Bill Burcham
>Assignee: Bill Burcham
>Priority: Major
>  Labels: pull-request-available
>
> Linux performance icon Brendan Gregg advocates the 
> [USE|http://www.brendangregg.com/usemethod.html] method of performance 
> analysis: Utilization, Saturation, and Errors.
> When it comes to CPU, Geode captures a number of _utilization_ statistics. 
> Some are direct like LinuxSystemStats cpuIdle and cpuActive. Others are 
> indirect like:
>  
> But utilization statistics alone can't tell you when a resource (like CPU) is 
> _saturated_, i.e. when  demand is higher than the servicing ability. If 
> you're just looking at utilization metrics, then a saturated system might 
> look a lot like a system just below saturation. In order to tell the 
> difference, saturation metrics are needed.
> In the case of CPU, there is a conceptual queue in front of each processor. 
> Tasks (operating system threads) that are ready to run, enter a queue, and 
> after some delay, are given a time slice by an actual physical CPU.
> You might think that Geode's LinuxSystemStats loadAverage1 and 5 and 15 
> might fit this bill. Those statistics do provide some saturation information. 
> The problem is, they conflate CPU with I/O and other things (see [Linux Load 
> Averages: Solving the 
> Mystery|http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html]).
> A better, more specific measure of CPU saturation is available through 
> statistics exposed via the /proc/schedstat virtual file.
> When this ticket is complete, there will be a new statistic type called 
> LinuxThreadScheduler, with four associated statistics gathered directly from 
> /proc/schedstat or derived from data gathered from it:
>  * runningTimeNanos: sum of all time spent running by tasks on this processor 
> in nanoseconds
>  * queuedTimeNanos: sum of all time spent waiting to run by tasks on this 
> processor in nanoseconds
>  * tasksScheduledCount: # of tasks (not necessarily unique) given to the 
> processor
>  * meanTaskQueuedTimeNanos: average time that a ready-to-run task waited for 
> a CPU, since the last sample, in nanoseconds
> One "statistic" will be gathered for each CPU. So a Geode process running on 
> a two-CPU system will capture two statistics, called "cpu0", "cpu1", each of 
> this new type.
> By default Geode will not gather these new statistics. A TBD Java system 
> property will be used to enable gathering the new LinuxThreadScheduler 
> statistic.
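
As a rough sketch of how the statistics listed above could be derived, the code below parses one `cpuN` line of /proc/schedstat and computes a mean queued time between two samples. It is not the implementation from the pull request. It assumes that the last three numeric fields of a `cpuN` line are, in order, the running time, the waiting time, and the count of tasks given to the processor, as the ticket's wording suggests. The class and method names are hypothetical.

```java
// Hypothetical sketch; field layout is an assumption stated in the text above.
public class SchedstatSketch {

  /** One sample of the three raw per-CPU counters. */
  static final class CpuSample {
    final long runningTimeNanos;
    final long queuedTimeNanos;
    final long tasksScheduledCount;

    CpuSample(long runningTimeNanos, long queuedTimeNanos, long tasksScheduledCount) {
      this.runningTimeNanos = runningTimeNanos;
      this.queuedTimeNanos = queuedTimeNanos;
      this.tasksScheduledCount = tasksScheduledCount;
    }
  }

  /** Parses a line such as "cpu0 0 0 0 0 0 0 1000 500 10" using its last three fields. */
  static CpuSample parseCpuLine(String line) {
    String[] fields = line.trim().split("\\s+");
    if (fields.length < 4 || !fields[0].startsWith("cpu")) {
      throw new IllegalArgumentException("not a cpu line: " + line);
    }
    int n = fields.length;
    return new CpuSample(
        Long.parseLong(fields[n - 3]),
        Long.parseLong(fields[n - 2]),
        Long.parseLong(fields[n - 1]));
  }

  /** Average wait per scheduled task between two samples; 0 if no tasks were scheduled. */
  static long meanTaskQueuedTimeNanos(CpuSample previous, CpuSample current) {
    long tasks = current.tasksScheduledCount - previous.tasksScheduledCount;
    if (tasks <= 0) {
      return 0;
    }
    return (current.queuedTimeNanos - previous.queuedTimeNanos) / tasks;
  }

  public static void main(String[] args) {
    CpuSample before = parseCpuLine("cpu0 0 0 0 0 0 0 1000 500 10");
    CpuSample after = parseCpuLine("cpu0 0 0 0 0 0 0 3000 2500 20");
    System.out.println(meanTaskQueuedTimeNanos(before, after));
  }
}
```

In the example, the queued time grows by 2000 ns while 10 more tasks are scheduled, so the derived mean is 200 ns per task for that interval.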




[jira] [Commented] (GEODE-9002) Add Statistic for /proc/schedstat

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295516#comment-17295516
 ] 

ASF GitHub Bot commented on GEODE-9002:
---

Bill opened a new pull request #6090:
URL: https://github.com/apache/geode/pull/6090


   [GEODE-9002](https://issues.apache.org/jira/browse/GEODE-9002)
   
   See ticket for details.
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced in 
the commit message?
   
   - [x] Has your PR been rebased against the latest commit within the target 
branch (typically `develop`)?
   
   - [x] Is your initial contribution a single, squashed commit?
   
   - [x] Does `gradlew build` run cleanly?
   
   - [x] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   





> Add Statistic for /proc/schedstat
> -
>
> Key: GEODE-9002
> URL: https://issues.apache.org/jira/browse/GEODE-9002
> Project: Geode
>  Issue Type: New Feature
>  Components: statistics
>Reporter: Bill Burcham
>Assignee: Bill Burcham
>Priority: Major
>
> Linux performance icon Brendan Gregg advocates the 
> [USE|http://www.brendangregg.com/usemethod.html] method of performance 
> analysis: Utilization, Saturation, and Errors.
> When it comes to CPU, Geode captures a number of _utilization_ statistics. 
> Some are direct like LinuxSystemStats cpuIdle and cpuActive. Others are 
> indirect like:
>  
> But utilization statistics alone can't tell you when a resource (like CPU) is 
> _saturated_, i.e. when  demand is higher than the servicing ability. If 
> you're just looking at utilization metrics, then a saturated system might 
> look a lot like a system just below saturation. In order to tell the 
> difference, saturation metrics are needed.
> In the case of CPU, there is a conceptual queue in front of each processor. 
> Tasks (operating system threads) that are ready to run, enter a queue, and 
> after some delay, are given a time slice by an actual physical CPU.
> You might think that Geode's LinuxSystemStats loadAverage1 and 5 and 15 
> might fit this bill. Those statistics do provide some saturation information. 
> The problem is, they conflate CPU with I/O and other things (see [Linux Load 
> Averages: Solving the 
> Mystery|http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html]).
> A better, more specific measure of CPU saturation is available through 
> statistics exposed via the /proc/schedstat virtual file.
> When this ticket is complete, there will be a new statistic type called 
> LinuxThreadScheduler, with three associated statistics gathered directly from 
> /proc/schedstat or derived from data gathered from it:
>  * runningTimeNanos: sum of all time spent running by tasks on this processor 
> in nanoseconds
>  * queuedTimeNanos: sum of all time spent waiting to run by tasks on this 
> processor in nanoseconds
>  * tasksScheduledCount: # of tasks (not necessarily unique) given to the 
> processor
>  * meanTaskQueuedTimeNanos: average time that a ready-to-run task waited for 
> a CPU, since the last sample, in nanoseconds
> One "statistic" will be gathered for each CPU. So a Geode process running on 
> a two-CPU system will capture two statistics, called "cpu0", "cpu1", each of 
> this new type.
> By default Geode will not gather these new statistics. A TBD Java system 
> property will be used to enable gathering the new LinuxThreadScheduler 
> statistic.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Assigned] (GEODE-9002) Add Statistic for /proc/schedstat

2021-03-04 Thread Bill Burcham (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham reassigned GEODE-9002:
---

Assignee: Bill Burcham

> Add Statistic for /proc/schedstat
> -
>
> Key: GEODE-9002
> URL: https://issues.apache.org/jira/browse/GEODE-9002
> Project: Geode
>  Issue Type: New Feature
>  Components: statistics
>Reporter: Bill Burcham
>Assignee: Bill Burcham
>Priority: Major
>

[jira] [Created] (GEODE-9002) Add Statistic for /proc/schedstat

2021-03-04 Thread Bill Burcham (Jira)

Bill Burcham created GEODE-9002:
---

 Summary: Add Statistic for /proc/schedstat
 Key: GEODE-9002
 URL: https://issues.apache.org/jira/browse/GEODE-9002
 Project: Geode
  Issue Type: New Feature
  Components: statistics
Reporter: Bill Burcham


Linux performance icon Brendan Gregg advocates the 
[USE|http://www.brendangregg.com/usemethod.html] method of performance 
analysis: Utilization, Saturation, and Errors.

When it comes to CPU, Geode captures a number of _utilization_ statistics. Some 
are direct, like LinuxSystemStats cpuIdle and cpuActive. Others are indirect 
like:

 

But utilization statistics alone can't tell you when a resource (like CPU) is 
_saturated_, i.e. when demand is higher than the servicing ability. If you're 
just looking at utilization metrics, then a saturated system might look a lot 
like a system just below saturation. In order to tell the difference, 
saturation metrics are needed.

In the case of CPU, there is a conceptual queue in front of each processor. 
Tasks (operating system threads) that are ready to run enter a queue and, 
after some delay, are given a time slice by an actual physical CPU.

You might think that Geode's LinuxSystemStats loadAverage1, loadAverage5, and 
loadAverage15 fit the bill. Those statistics do provide some saturation 
information. The problem is that they conflate CPU with I/O and other things 
(see [Linux Load Averages: Solving the 
Mystery|http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html]).

A better, more specific measure of CPU saturation is available through 
statistics exposed via the /proc/schedstat virtual file.

When this ticket is complete, there will be a new statistic type called 
LinuxThreadScheduler, with four associated statistics gathered directly from 
/proc/schedstat or derived from data gathered from it:
 * runningTimeNanos: sum of all time spent running by tasks on this processor 
in nanoseconds
 * queuedTimeNanos: sum of all time spent waiting to run by tasks on this 
processor in nanoseconds
 * tasksScheduledCount: # of tasks (not necessarily unique) given to the 
processor
 * meanTaskQueuedTimeNanos: average time that a ready-to-run task waited for a 
CPU, since the last sample, in nanoseconds

One "statistic" will be gathered for each CPU. So a Geode process running on a 
two-CPU system will capture two statistics, called "cpu0" and "cpu1", each of 
this new type.

By default Geode will not gather these new statistics. A TBD Java system 
property will be used to enable gathering the new LinuxThreadScheduler 
statistic.
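
As a rough illustration of how the three raw counters could be read and the 
derived mean computed, here is a minimal Java sketch. It assumes the version 15 
layout of /proc/schedstat, in which each "cpuN" line carries nine numeric 
fields and the last three are running time, queued time, and the number of 
timeslices. The class and method names (CpuSchedSample, SchedStatSketch) are 
hypothetical and are not taken from Geode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the three raw per-CPU counters described above. */
final class CpuSchedSample {
  final String cpuName;
  final long runningTimeNanos;
  final long queuedTimeNanos;
  final long tasksScheduledCount;

  CpuSchedSample(String cpuName, long runningTimeNanos, long queuedTimeNanos,
      long tasksScheduledCount) {
    this.cpuName = cpuName;
    this.runningTimeNanos = runningTimeNanos;
    this.queuedTimeNanos = queuedTimeNanos;
    this.tasksScheduledCount = tasksScheduledCount;
  }
}

/** Sketch only: assumes the version 15 /proc/schedstat layout. */
final class SchedStatSketch {

  /** Keeps the "cpuN" lines and skips version, timestamp, and domain lines. */
  static List<CpuSchedSample> parse(List<String> lines) {
    List<CpuSchedSample> samples = new ArrayList<>();
    for (String line : lines) {
      String[] fields = line.trim().split("\\s+");
      if (fields.length < 10 || !fields[0].startsWith("cpu")) {
        continue;
      }
      // fields[0] is the CPU name; fields[7..9] are the last three counters.
      samples.add(new CpuSchedSample(fields[0], Long.parseLong(fields[7]),
          Long.parseLong(fields[8]), Long.parseLong(fields[9])));
    }
    return samples;
  }

  /** Mean wait per scheduled task between two samples of the same CPU. */
  static long meanTaskQueuedTimeNanos(CpuSchedSample previous, CpuSchedSample current) {
    long tasks = current.tasksScheduledCount - previous.tasksScheduledCount;
    if (tasks <= 0) {
      return 0L; // nothing was scheduled since the last sample
    }
    return (current.queuedTimeNanos - previous.queuedTimeNanos) / tasks;
  }
}
```

In a real sampler the input lines would come from reading /proc/schedstat once 
per sample interval (for example with java.nio.file.Files.readAllLines), and 
the previous sample for each CPU would be retained to compute the mean.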




[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295499#comment-17295499
 ] 

ASF subversion and git services commented on GEODE-8864:


Commit b2b31cec3301d976ef8fd4d647e5cffd5d11f176 in geode's branch 
refs/heads/develop from John Hutchison
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=b2b31ce ]

GEODE-8864:finish implementation of Redis HScan Command (#5954)

 -add test for hscan not returning value removed from hash
 -refactor some tests to better reflect use cases
 -create HScan DunitTest
 -Fix Concurrency issue for Hscan
 -add UUID to client
 -make hscan on redisHash store/retrieve  snapshot of entryset for each client
 -add Unit tests for Redishash.HscanSnapshot

Authored-by: john Hutchison 

> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>





[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command

2021-03-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295497#comment-17295497
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jdeppe-pivotal merged pull request #5954:
URL: https://github.com/apache/geode/pull/5954


   





> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>





[jira] [Commented] (GEODE-8963) separate client/server compatibility from server/server version compatibility



[ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295494#comment-17295494
 ] 

ASF subversion and git services commented on GEODE-8963:


Commit ad5f3d12181fb17993e9cd4cd2c22b50f1784252 in geode's branch 
refs/heads/develop from Owen Nichols
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=ad5f3d1 ]

GEODE-8963: fix release scripts to maintain previous client serialization 
version when adding new minor


> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> A client's version is used for deserializing data received from the client 
> and for serializing data sent to the client. It is also used to locate the 
> map of Commands used to process client requests. Every time we cut a new 
> release we bump this version in KnownVersions and create a new map of 
> Commands, even though client/server communications protocols rarely change.
> We should have each KnownVersion hold a client/server compatibility number 
> that is used to identify clients rather than the KnownVersion's ordinal.
> For instance,
> {code:java}
> public static final KnownVersion GEODE_1_15_0 =
>     new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
>         /* server/server version */ GEODE_1_15_0_ORDINAL,
>         /* client/server version */ GEODE_1_15_0_ORDINAL);
>
> public static final KnownVersion GEODE_1_16_0 =
>     new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
>         /* server/server version */ GEODE_1_16_0_ORDINAL,
>         /* client/server version */ GEODE_1_15_0_ORDINAL);
>
> public static final KnownVersion GEODE_1_17_0 =
>     new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
>         /* server/server version */ GEODE_1_17_0_ORDINAL,
>         /* client/server version */ GEODE_1_15_0_ORDINAL);
> {code}
> In the above KnownVersions, the client/server serialization is known not to 
> have changed since v1.15.0, so there is no need to use a newer KnownVersion 
> for clients.
> Client handshake code will need to be changed to use the client/server 
> ordinal when identifying clients and servers.
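
To make the proposal concrete, the following is a small Java sketch of how a 
command map could be looked up by the client/server ordinal instead of the 
version's own ordinal. Everything here is hypothetical: VersionSketch and 
CommandMapsSketch are simplified stand-ins rather than Geode classes, the 
ordinal values are made up for illustration, and commands are represented as 
plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for KnownVersion. */
final class VersionSketch {
  final String name;
  final short ordinal; // server/server version
  final short clientServerOrdinal; // client/server compatibility number

  VersionSketch(String name, short ordinal, short clientServerOrdinal) {
    this.name = name;
    this.ordinal = ordinal;
    this.clientServerOrdinal = clientServerOrdinal;
  }
}

/** Hypothetical registry of command maps keyed by the client/server ordinal. */
final class CommandMapsSketch {
  private final Map<Short, Map<String, String>> byClientServerOrdinal = new HashMap<>();

  /** Registers a command map only if this client/server ordinal has none yet. */
  void register(VersionSketch version, Map<String, String> commands) {
    byClientServerOrdinal.putIfAbsent(version.clientServerOrdinal, commands);
  }

  /** Looks up commands by the client's compatibility number, not its ordinal. */
  Map<String, String> commandsFor(VersionSketch clientVersion) {
    return byClientServerOrdinal.get(clientVersion.clientServerOrdinal);
  }

  int size() {
    return byClientServerOrdinal.size();
  }
}
```

With this shape, releases that do not change the client/server protocol share 
one command map, so cutting a new release would not require creating another.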




[jira] [Commented] (GEODE-8963) separate client/server compatibility from server/server version compatibility



[ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295493#comment-17295493
 ] 

ASF GitHub Bot commented on GEODE-8963:
---

onichols-pivotal merged pull request #6081:
URL: https://github.com/apache/geode/pull/6081


   





> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>

[jira] [Commented] (GEODE-8907) CI Failure: ClientServerTransactionCCEDUnitTest.testTxRemoveAll



[ 
https://issues.apache.org/jira/browse/GEODE-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295457#comment-17295457
 ] 

ASF GitHub Bot commented on GEODE-8907:
---

pivotal-eshu merged pull request #6054:
URL: https://github.com/apache/geode/pull/6054


   





> CI Failure: ClientServerTransactionCCEDUnitTest.testTxRemoveAll
> ---
>
> Key: GEODE-8907
> URL: https://issues.apache.org/jira/browse/GEODE-8907
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.14.0
>Reporter: Mark Hanson
>Assignee: Louis R. Jacome
>Priority: Major
>  Labels: pull-request-available
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/794]
>  reported a test failure 
>  
> {noformat}
> org.apache.geode.internal.cache.ClientServerTransactionCCEDUnitTest > 
> testTxRemoveAll FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.RemoteTransactionDUnitTest$34.call in VM 2 
> running on Host c44a375b0bbd with 4 VMs
> Caused by:
> java.lang.IllegalStateException: Thread does not have an active 
> transaction
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.RemoteTransactionDUnitTest$2.call in VM 0 
> running on Host c44a375b0bbd with 4 VMs
> Caused by:
> java.lang.AssertionError: Event never occurred after 3 ms:  
> {noformat}
>  
> Logs available at
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0655/test-results/distributedTest/1612232803/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0655/test-artifacts/1612232803/distributedtestfiles-OpenJDK8-1.14.0-build.0655.tgz
>  {noformat}




[jira] [Updated] (GEODE-8972) remove shunnedMembers collection from GMSMembership



 [ 
https://issues.apache.org/jira/browse/GEODE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8972:
--
Labels: pull-request-available  (was: )

> remove shunnedMembers collection from GMSMembership
> ---
>
> Key: GEODE-8972
> URL: https://issues.apache.org/jira/browse/GEODE-8972
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Kamilla Aslami
>Priority: Major
>  Labels: pull-request-available
>
> GMSMembership has a _shunnedMembers_ collection that is used to track the IDs 
> of nodes that are no longer part of the cluster.  This collection is no 
> longer needed since we can tell if a node is old by comparing the view ID in 
> its identifier to that of the current view (called _latestView_ in that 
> class).  Checks like this are already in place in some parts of the code.
> All uses of _shunnedMembers_ should be replaced with this check:
> MembershipView view = latestView;
> boolean shunned = memberId.getVmViewId() <= view.getViewId() && !view.contains(memberId);
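
The check quoted in the description can be exercised in isolation with a small 
Java sketch. MemberIdSketch, ViewSketch, and ShunCheckSketch are hypothetical, 
simplified stand-ins for the membership classes; only the comparison itself 
comes from the ticket.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical member identifier carrying the view ID in which it joined. */
final class MemberIdSketch {
  final String name;
  final int vmViewId;

  MemberIdSketch(String name, int vmViewId) {
    this.name = name;
    this.vmViewId = vmViewId;
  }

  @Override
  public boolean equals(Object other) {
    if (!(other instanceof MemberIdSketch)) {
      return false;
    }
    MemberIdSketch that = (MemberIdSketch) other;
    return vmViewId == that.vmViewId && name.equals(that.name);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, vmViewId);
  }
}

/** Hypothetical membership view: an ID plus the current set of members. */
final class ViewSketch {
  final int viewId;
  private final Set<MemberIdSketch> members;

  ViewSketch(int viewId, MemberIdSketch... members) {
    this.viewId = viewId;
    this.members = new HashSet<>(Arrays.asList(members));
  }

  boolean contains(MemberIdSketch memberId) {
    return members.contains(memberId);
  }
}

final class ShunCheckSketch {
  /** A member is old if it joined in or before this view yet is absent from it. */
  static boolean isShunned(MemberIdSketch memberId, ViewSketch latestView) {
    ViewSketch view = latestView; // read the reference once, as in the ticket
    return memberId.vmViewId <= view.viewId && !view.contains(memberId);
  }
}
```

Note that a member whose view ID is newer than the current view is not treated 
as shunned by this check, since it may simply not have been installed yet.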




[jira] [Commented] (GEODE-8972) remove shunnedMembers collection from GMSMembership



[ 
https://issues.apache.org/jira/browse/GEODE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295424#comment-17295424
 ] 

ASF GitHub Bot commented on GEODE-8972:
---

kamilla1201 opened a new pull request #6089:
URL: https://github.com/apache/geode/pull/6089


   (cherry picked from commit 83e2ee1f167e62721bc4998f834776b49b946b31)
   


> remove shunnedMembers collection from GMSMembership
> ---
>
> Key: GEODE-8972
> URL: https://issues.apache.org/jira/browse/GEODE-8972
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Kamilla Aslami
>Priority: Major
>

[jira] [Updated] (GEODE-9000) NPE During Reconnect After Network Split



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-9000:

Labels: blocks-1.14.0  (was: )

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>  Labels: blocks-1.14.0
>
> During a full network split, when all members are shut down by a partition, 
> one of the servers continually fails to reconnect due to a 
> {{NullPointerException}}. When using persistent regions, this also prevents 
> the remaining members from starting up correctly, as they might be waiting 
> for the stuck member to recover the latest data.
> The issue was introduced by the fix for GEODE-8901: the new 
> implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} doesn't 
> have a {{currentView}} installed during the reconnect phase ({{getView() == 
> null}}), and the following is shown in the logs:
> {noformat}
> [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected exception while booting membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2315)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1239)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$forceDisconnect$0(GMSMembership.java:1951)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> [error 2021/03/04 03:32:02.747 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected problem starting up membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
>

[jira] [Created] (GEODE-9001) Update documents related to compatable-with-redis component for 1.14 release

2021-03-04 Thread John Hutchison (Jira)

John Hutchison created GEODE-9001:
-

 Summary: Update documents related to compatable-with-redis 
component  for 1.14 release
 Key: GEODE-9001
 URL: https://issues.apache.org/jira/browse/GEODE-9001
 Project: Geode
  Issue Type: Task
  Components: redis
Reporter: John Hutchison







[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295389#comment-17295389
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jhutchison commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587614479



##
File path: 
geode-redis/src/main/java/org/apache/geode/redis/internal/data/RedisHash.java
##
@@ -46,19 +53,88 @@
 public class RedisHash extends AbstractRedisData {
   public static final RedisHash NULL_REDIS_HASH = new NullRedisHash();
   private HashMap hash;
+  private ConcurrentHashMap> hScanSnapShots;
+  private ConcurrentHashMap hScanSnapShotCreationTimes;
+  private ScheduledExecutorService HSCANSnapshotExpirationExecutor = null;
+
+  private static int default_hscan_snapshots_expire_check_frequency =
+  Integer.getInteger("redis.hscan-snapshot-cleanup-interval", 3);
+
+  private static int default_hscan_snapshots_milliseconds_to_live =
+  Integer.getInteger("redis.hscan-snapshot-expiry", 3);
+
+  private int HSCAN_SNAPSHOTS_EXPIRE_CHECK_FREQUENCY_MILLISECONDS;
+  private int MINIMUM_MILLISECONDS_FOR_HSCAN_SNAPSHOTS_TO_LIVE;
+
+  @VisibleForTesting
+  public RedisHash(List fieldsToSet, int hscanSnapShotExpirationCheckFrequency,
+  int minimumLifeForHscanSnaphot) {
+this();
+
+this.HSCAN_SNAPSHOTS_EXPIRE_CHECK_FREQUENCY_MILLISECONDS =
+hscanSnapShotExpirationCheckFrequency;
+this.MINIMUM_MILLISECONDS_FOR_HSCAN_SNAPSHOTS_TO_LIVE = minimumLifeForHscanSnaphot;
 
-  public RedisHash(List fieldsToSet) {
-hash = new HashMap<>();
 Iterator iterator = fieldsToSet.iterator();
 while (iterator.hasNext()) {
   hashPut(iterator.next(), iterator.next());
 }
   }
 
+  public RedisHash(List fieldsToSet) {
+this(fieldsToSet,
+default_hscan_snapshots_expire_check_frequency,
+default_hscan_snapshots_milliseconds_to_live);
+  }
+
+  // for serialization
   public RedisHash() {
-// for serialization
+this.hash = new HashMap<>();
+this.hScanSnapShots = new ConcurrentHashMap<>();
+this.hScanSnapShotCreationTimes = new ConcurrentHashMap<>();
+
+this.HSCAN_SNAPSHOTS_EXPIRE_CHECK_FREQUENCY_MILLISECONDS =
+this.default_hscan_snapshots_expire_check_frequency;
+
+this.MINIMUM_MILLISECONDS_FOR_HSCAN_SNAPSHOTS_TO_LIVE =
+this.default_hscan_snapshots_milliseconds_to_live;
   }
 
+
+  private void expireHScanSnapshots() {
+
+this.hScanSnapShotCreationTimes.entrySet().forEach(entry -> {

Review comment:
   yeah sorry, posted responses on the wrong comments a couple of times :(
   







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>





[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295379#comment-17295379
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jhutchison commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587609738



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
 return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+int largestCursorValue = Integer.MAX_VALUE;
+
+BigInteger tooBigCursor =
+new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+.hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+int smallestCursorValue = Integer.MIN_VALUE;
+
+BigInteger tooSmallCursor =
+new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+.hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {

Review comment:
   oops.  thanks







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>





[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295378#comment-17295378
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

sabbey37 commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587608110



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
 return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+int largestCursorValue = Integer.MAX_VALUE;
+
+BigInteger tooBigCursor =
+new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+.hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+int smallestCursorValue = Integer.MIN_VALUE;
+
+BigInteger tooSmallCursor =
+new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+.hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {
 Map entryMap = new HashMap<>();
-for (int i = 0; i < 10; i++) {
-  entryMap.put(String.valueOf(i), String.valueOf(i));
-}
-jedis.hmset("a", entryMap);
+entryMap.put("1", "yellow");
+entryMap.put("2", "green");
+entryMap.put("3", "orange");
+jedis.hmset("colors", entryMap);
+
+int COUNT_PARAM = 2;
 
 ScanParams scanParams = new ScanParams();
-scanParams.count(5);
-ScanResult> result = jedis.hscan("a", "0", 
scanParams);
-assertThat(result.isCompleteIteration()).isFalse();
+scanParams.count(COUNT_PARAM);
+ScanResult> result;
+
+List> allEntries = new ArrayList<>();
+String cursor = "0";
 
-result = jedis.hscan("a", "100");
+result = jedis.hscan("colors", cursor, scanParams);
+allEntries.addAll(result.getResult());
 
-assertThat(result.getResult()).hasSize(10);
-assertThat(new 
HashSet<>(result.getResult())).isEqualTo(entryMap.entrySet());
+List> allDistinctEntries =
+allEntries
+.stream()
+.distinct()
+.collect(Collectors.toList());
+
+assertThat(allDistinctEntries.size()).isEqualTo(COUNT_PARAM);

Review comment:
   If this were running against native Redis, it would return all entries 
because the hash is so small.  In that case, removing duplicates would have no 
effect: all elements would be returned at once, without duplicates, so the 
number of elements returned would always equal the size of the complete entry 
map rather than the count parameter.  For more on this, read `Why SCAN may 
return all the items of an aggregate data type in a single call?` in the 
[SCAN documentation](https://redis.io/commands/scan) or look at the Redis code.
   
   However, this test is only run against Geode's Redis compatibility API.  I 
thought we moved it to `HScanIntegrationTest.java` specifically so that our 
handling of the count param (which returns exactly the number of elements 
specified by count) could be tested.
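
To make the exact-count behaviour described above concrete, here is a minimal, self-contained sketch. It is only a model under stated assumptions: `ExactCountScanSketch`, `Page`, and `scan` are hypothetical names, the cursor is treated as a plain offset, and none of this is Geode's actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a server-side hash scan; not Geode's real code.
final class ExactCountScanSketch {

  /** One page of results plus the cursor for the next call ("0" means done). */
  static final class Page {
    final String nextCursor;
    final List<Map.Entry<String, String>> entries;

    Page(String nextCursor, List<Map.Entry<String, String>> entries) {
      this.nextCursor = nextCursor;
      this.entries = entries;
    }
  }

  /** Returns at most {@code count} entries, starting at the offset encoded in {@code cursor}. */
  static Page scan(Map<String, String> hash, String cursor, int count) {
    List<Map.Entry<String, String>> all = new ArrayList<>(hash.entrySet());
    int start = Math.min(Integer.parseInt(cursor), all.size());
    int end = Math.min(start + count, all.size());
    // Assumption: the cursor is simply the next offset, and "0" signals completion.
    String next = end >= all.size() ? "0" : String.valueOf(end);
    return new Page(next, new ArrayList<>(all.subList(start, end)));
  }

  public static void main(String[] args) {
    Map<String, String> colors = new LinkedHashMap<>();
    colors.put("1", "yellow");
    colors.put("2", "green");
    colors.put("3", "orange");

    Page first = scan(colors, "0", 2);
    System.out.println(first.entries.size() + " entries, next cursor " + first.nextCursor);
  }
}
```

With three entries and a count of 2, the first page in this model holds exactly two entries, which is the property the test under review asserts.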







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-9000) NPE During Reconnect After Network Split

2021-03-04 Thread Bruce J Schuchardt (Jira)



[ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295377#comment-17295377
 ] 

Bruce J Schuchardt commented on GEODE-9000:
---

The server was reconnecting and emptying out messages queued during quorum 
checks:

{noformat}
logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.595 
GMT gemfire-cluster-server-0  tid=0x8c] Delivering 22 messages 
queued by quorum checker

logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.596 
GMT gemfire-cluster-server-0  tid=0x8c] received suspect 
message from 10.4.2.34(:locator):41000 for 
10.4.3.19(gemfire-cluster-locator-0:1:locator):41000: Member isn't 
responding to heartbeat requests

[fatal 2021/03/04 10:30:28.596 GMT gemfire-cluster-server-0  
tid=0x8c] Unexpected exception while booting membership services
java.lang.NullPointerException
at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
{noformat}

The network-partition message was delivered during this time and was likely 
intended for the previous Membership service.  Adding a check for "isJoined" or 
a null currentView and ignoring the message is probably the right way to fix 
this problem.
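
A minimal sketch of the kind of guard suggested above, assuming a stripped-down stand-in class. The field and method names here are illustrative only and do not reflect the real `GMSJoinLeave` fields or signatures.

```java
// Hypothetical miniature of a membership service, for illustration only.
final class PartitionMessageGuardSketch {

  private volatile Object currentView; // stays null until a view has been installed
  private volatile boolean isJoined;
  private int processed;

  /** Simulates the point at which the service has joined and installed a view. */
  void installView(Object view) {
    currentView = view;
    isJoined = true;
  }

  /** Returns true if the message was handled, false if it was ignored. */
  boolean processNetworkPartitionMessage(String message) {
    // A message that arrives before joining was likely meant for the previous
    // membership service, so it is dropped instead of dereferencing a null view.
    if (!isJoined || currentView == null) {
      return false;
    }
    processed++;
    return true;
  }

  int processedCount() {
    return processed;
  }
}
```

In this sketch a message delivered during reconnect is ignored, and the same message is handled once a view exists.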

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>
> During a full network split, when all members get shut down by a partition, one 
> of the servers continually fails to reconnect due to a 
> {{NullPointerException}}. When using persistent regions, this also prevents 
> the remaining members from starting up correctly, as they might be waiting for 
> the stuck member to recover the latest data.
> The issue was introduced by the fix for GEODE-8901: the new 
> implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} doesn't 
> have a {{currentView}} installed during the reconnect phase ({{getView() == 
> null}}), and the following is shown in the logs:
> {noformat}
> [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected exception while booting membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
>   at 
>

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295376#comment-17295376
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jhutchison commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587599646



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
     return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+    int largestCursorValue = Integer.MAX_VALUE;
+
+    BigInteger tooBigCursor =
+        new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+    int smallestCursorValue = Integer.MIN_VALUE;
+
+    BigInteger tooSmallCursor =
+        new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {
     Map<String, String> entryMap = new HashMap<>();
-    for (int i = 0; i < 10; i++) {
-      entryMap.put(String.valueOf(i), String.valueOf(i));
-    }
-    jedis.hmset("a", entryMap);
+    entryMap.put("1", "yellow");
+    entryMap.put("2", "green");
+    entryMap.put("3", "orange");
+    jedis.hmset("colors", entryMap);
+
+    int COUNT_PARAM = 2;
 
     ScanParams scanParams = new ScanParams();
-    scanParams.count(5);
-    ScanResult<Map.Entry<String, String>> result = jedis.hscan("a", "0", scanParams);
-    assertThat(result.isCompleteIteration()).isFalse();
+    scanParams.count(COUNT_PARAM);
+    ScanResult<Map.Entry<String, String>> result;
+
+    List<Map.Entry<String, String>> allEntries = new ArrayList<>();
+    String cursor = "0";
 
-    result = jedis.hscan("a", "100");
+    result = jedis.hscan("colors", cursor, scanParams);
+    allEntries.addAll(result.getResult());
 
-    assertThat(result.getResult()).hasSize(10);
-    assertThat(new HashSet<>(result.getResult())).isEqualTo(entryMap.entrySet());
+    List<Map.Entry<String, String>> allDistinctEntries =
+        allEntries
+            .stream()
+            .distinct()
+            .collect(Collectors.toList());
+
+    assertThat(allDistinctEntries.size()).isEqualTo(COUNT_PARAM);

Review comment:
   Yeah, I'm hesitant to read too much into the exactness of what the count 
parameter does.  As per the docs (as I'm sure you've read): "Basically with COUNT 
the user specified the amount of work that should be done at every call in 
order to retrieve elements from the collection. This is just a hint for the 
implementation, however generally speaking this is what you could expect most 
of the times from the implementation."  That, I think, is what makes it a 
little confusing to know how to test.  I'll take out the duplicate removal, 
though.
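
Because the docs describe COUNT as a hint, one way to test without depending on the exact batch size is to iterate until the cursor comes back as "0" and assert on the merged result. The sketch below assumes a hypothetical `HashScanner` interface and `Batch` type in place of a real client call; the fake scanner in `main` is invented purely for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FullIterationSketch {

  /** Hypothetical result shape: a batch of entries and the cursor for the next call. */
  static final class Batch {
    final String cursor;
    final List<Map.Entry<String, String>> entries;

    Batch(String cursor, List<Map.Entry<String, String>> entries) {
      this.cursor = cursor;
      this.entries = entries;
    }
  }

  /** Hypothetical scan call; a real client invocation would sit behind this. */
  interface HashScanner {
    Batch scan(String cursor, int countHint);
  }

  /** Calls scan until the returned cursor is "0" and merges every batch into one map. */
  static Map<String, String> collectAll(HashScanner scanner, int countHint) {
    Map<String, String> seen = new HashMap<>();
    String cursor = "0";
    do {
      Batch batch = scanner.scan(cursor, countHint);
      for (Map.Entry<String, String> entry : batch.entries) {
        seen.put(entry.getKey(), entry.getValue()); // repeated entries collapse here
      }
      cursor = batch.cursor;
    } while (!"0".equals(cursor));
    return seen;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> first = new ArrayList<>();
    first.add(new SimpleEntry<>("1", "yellow"));
    first.add(new SimpleEntry<>("2", "green"));
    List<Map.Entry<String, String>> second = new ArrayList<>();
    second.add(new SimpleEntry<>("2", "green")); // repeated on purpose
    second.add(new SimpleEntry<>("3", "orange"));

    // Fake scanner: ignores the hint and returns two fixed batches.
    HashScanner fake = (cursor, countHint) ->
        "0".equals(cursor) ? new Batch("7", first) : new Batch("0", second);

    System.out.println(collectAll(fake, 2).size());
  }
}
```

An assertion on the merged map holds regardless of how many entries each call returns, which sidesteps the question of how literally COUNT is honoured.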







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295373#comment-17295373
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jhutchison commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587599646



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
     return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+    int largestCursorValue = Integer.MAX_VALUE;
+
+    BigInteger tooBigCursor =
+        new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+    int smallestCursorValue = Integer.MIN_VALUE;
+
+    BigInteger tooSmallCursor =
+        new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {
     Map<String, String> entryMap = new HashMap<>();
-    for (int i = 0; i < 10; i++) {
-      entryMap.put(String.valueOf(i), String.valueOf(i));
-    }
-    jedis.hmset("a", entryMap);
+    entryMap.put("1", "yellow");
+    entryMap.put("2", "green");
+    entryMap.put("3", "orange");
+    jedis.hmset("colors", entryMap);
+
+    int COUNT_PARAM = 2;
 
     ScanParams scanParams = new ScanParams();
-    scanParams.count(5);
-    ScanResult<Map.Entry<String, String>> result = jedis.hscan("a", "0", scanParams);
-    assertThat(result.isCompleteIteration()).isFalse();
+    scanParams.count(COUNT_PARAM);
+    ScanResult<Map.Entry<String, String>> result;
+
+    List<Map.Entry<String, String>> allEntries = new ArrayList<>();
+    String cursor = "0";
 
-    result = jedis.hscan("a", "100");
+    result = jedis.hscan("colors", cursor, scanParams);
+    allEntries.addAll(result.getResult());
 
-    assertThat(result.getResult()).hasSize(10);
-    assertThat(new HashSet<>(result.getResult())).isEqualTo(entryMap.entrySet());
+    List<Map.Entry<String, String>> allDistinctEntries =
+        allEntries
+            .stream()
+            .distinct()
+            .collect(Collectors.toList());
+
+    assertThat(allDistinctEntries.size()).isEqualTo(COUNT_PARAM);

Review comment:
   Yeah, I'm hesitant to read too much into the exactness of what the count 
parameter does.  As per the docs (as I'm sure you've read): "Basically with COUNT 
the user specified the amount of work that should be done at every call in 
order to retrieve elements from the collection. This is just a hint for the 
implementation, however generally speaking this is what you could expect most 
of the times from the implementation."  I'll take out the duplicate removal, 
though.







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (GEODE-8924) Add VM restart tests for Redis and Spring sessions

2021-03-04 Thread Jens Deppe (Jira)



 [ 
https://issues.apache.org/jira/browse/GEODE-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jens Deppe resolved GEODE-8924.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> Add VM restart tests for Redis and Spring sessions
> --
>
> Key: GEODE-8924
> URL: https://issues.apache.org/jira/browse/GEODE-8924
> Project: Geode
>  Issue Type: Test
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Adding tests similar to some of our closed-source tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295366#comment-17295366
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

jhutchison commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587588110



##
File path: 
geode-redis/src/main/java/org/apache/geode/redis/internal/executor/CommandFunction.java
##
@@ -257,11 +258,13 @@ protected Object compute(ByteArrayWrapper key, Object[] args) {
       case HSCAN: {
         Pattern pattern = (Pattern) args[1];
         int count = (int) args[2];
-        BigInteger cursor = (BigInteger) args[3];
-        return hashCommands.hscan(key, pattern, count, cursor);
+        int cursor = Integer.valueOf(args[3].toString());

Review comment:
   geez, I guess you're right-  looked at that too quickly.  thanks







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295364#comment-17295364
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

sabbey37 commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587553014



##
File path: 
geode-redis/src/main/java/org/apache/geode/redis/internal/executor/CommandFunction.java
##
@@ -257,11 +258,13 @@ protected Object compute(ByteArrayWrapper key, Object[] args) {
       case HSCAN: {
         Pattern pattern = (Pattern) args[1];
         int count = (int) args[2];
-        BigInteger cursor = (BigInteger) args[3];
-        return hashCommands.hscan(key, pattern, count, cursor);
+        int cursor = Integer.valueOf(args[3].toString());

Review comment:
   `args` is an array of Objects (`Object[]`), not strings.  `args[3]` can 
be cast in the same way that we cast `args[2]` for the `count` variable on the 
line above it and throughout the `CommandFunction` class:
   ```
   int cursor = (int) args[3];
   ```
   
   (This argument was already verified and passed down as an `int` from 
`HScanExecutor`)
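
As a small standalone illustration of why the direct cast works, the sketch below stores a boxed `Integer` in an `Object[]` and reads it back with `(int)`. The class and method names are hypothetical and the array contents are made up; only the cast itself mirrors the suggestion above.

```java
import java.math.BigInteger;

final class ArgCastSketch {

  /** Reads an int that was stored (boxed as Integer) in slot 3 of an Object[]. */
  static int cursorFrom(Object[] args) {
    return (int) args[3]; // checked cast to Integer, followed by unboxing
  }

  public static void main(String[] unused) {
    Object[] args = {"key", null, 10, 42};
    System.out.println(cursorFrom(args));

    // If the slot holds some other type, the cast fails fast at runtime.
    Object[] wrongType = {"key", null, 10, BigInteger.valueOf(42)};
    try {
      cursorFrom(wrongType);
    } catch (ClassCastException e) {
      System.out.println("slot 3 is not an Integer");
    }
  }
}
```

The cast avoids a round trip through `toString()` and parsing, and it surfaces a type mismatch as a `ClassCastException` rather than silently accepting any object whose string form happens to look numeric.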







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295346#comment-17295346
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

sabbey37 commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587566195



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
     return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+    int largestCursorValue = Integer.MAX_VALUE;
+
+    BigInteger tooBigCursor =
+        new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+    int smallestCursorValue = Integer.MIN_VALUE;
+
+    BigInteger tooSmallCursor =
+        new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {
     Map<String, String> entryMap = new HashMap<>();
-    for (int i = 0; i < 10; i++) {
-      entryMap.put(String.valueOf(i), String.valueOf(i));
-    }
-    jedis.hmset("a", entryMap);
+    entryMap.put("1", "yellow");
+    entryMap.put("2", "green");
+    entryMap.put("3", "orange");
+    jedis.hmset("colors", entryMap);
+
+    int COUNT_PARAM = 2;
 
     ScanParams scanParams = new ScanParams();
-    scanParams.count(5);
-    ScanResult<Map.Entry<String, String>> result = jedis.hscan("a", "0", scanParams);
-    assertThat(result.isCompleteIteration()).isFalse();
+    scanParams.count(COUNT_PARAM);
+    ScanResult<Map.Entry<String, String>> result;
+
+    List<Map.Entry<String, String>> allEntries = new ArrayList<>();
+    String cursor = "0";
 
-    result = jedis.hscan("a", "100");
+    result = jedis.hscan("colors", cursor, scanParams);
+    allEntries.addAll(result.getResult());
 
-    assertThat(result.getResult()).hasSize(10);
-    assertThat(new HashSet<>(result.getResult())).isEqualTo(entryMap.entrySet());
+    List<Map.Entry<String, String>> allDistinctEntries =
+        allEntries
+            .stream()
+            .distinct()
+            .collect(Collectors.toList());
+
+    assertThat(allDistinctEntries.size()).isEqualTo(COUNT_PARAM);

Review comment:
   Thanks for thinking of this test!  Because we are making sure that we 
return the same number of entries as the `COUNT_PARAM`, it doesn't make sense 
to take out duplicates (we should return the same number of entries as the 
`COUNT_PARAM`, including duplicates). In that case, lines 90-96 could be 
removed and we could just assert the following:
   `assertThat(result.getResult().size()).isEqualTo(COUNT_PARAM)`







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295344#comment-17295344
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

sabbey37 commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587561626



##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
     return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+    int largestCursorValue = Integer.MAX_VALUE;
+
+    BigInteger tooBigCursor =
+        new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+    int smallestCursorValue = Integer.MIN_VALUE;
+
+    BigInteger tooSmallCursor =
+        new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {

Review comment:
   If you end up making other changes, the grammar here could be corrected: 
either `givenCount_shouldReturnExpectedNumberOfEntries` or, more consistent 
with the tests above it, `givenCount_returnsExpectedNumberOfEntries`.

##
File path: 
geode-redis/src/integrationTest/java/org/apache/geode/redis/internal/executor/hash/HScanIntegrationTest.java
##
@@ -37,22 +44,58 @@ public int getPort() {
     return server.getPort();
   }
 
+
+  // Note: these tests will not pass native redis, so included here in concrete test class
+  @Test
+  public void givenCursorGreaterThanIntMaxValue_returnsCursorError() {
+    int largestCursorValue = Integer.MAX_VALUE;
+
+    BigInteger tooBigCursor =
+        new BigInteger(String.valueOf(largestCursorValue)).add(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooBigCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+  @Test
+  public void givenCursorLessThanIntMinValue_returnsCursorError() {
+    int smallestCursorValue = Integer.MIN_VALUE;
+
+    BigInteger tooSmallCursor =
+        new BigInteger(String.valueOf(smallestCursorValue)).subtract(BigInteger.valueOf(1));
+
+    assertThatThrownBy(() -> jedis.hscan("a", tooSmallCursor.toString()))
+        .hasMessageContaining(ERROR_CURSOR);
+  }
+
+
   @Test
-  public void givenDifferentCursorThanSpecifiedByPreviousHscan_returnsAllEntries() {
+  public void givenCount_shouldReturnsExpectedNumberOfEntries() {
     Map<String, String> entryMap = new HashMap<>();
-    for (int i = 0; i < 10; i++) {
-      entryMap.put(String.valueOf(i), String.valueOf(i));
-    }
-    jedis.hmset("a", entryMap);
+    entryMap.put("1", "yellow");
+    entryMap.put("2", "green");
+    entryMap.put("3", "orange");
+    jedis.hmset("colors", entryMap);
+
+    int COUNT_PARAM = 2;
 
     ScanParams scanParams = new ScanParams();
-    scanParams.count(5);
-    ScanResult<Map.Entry<String, String>> result = jedis.hscan("a", "0", scanParams);
-    assertThat(result.isCompleteIteration()).isFalse();
+    scanParams.count(COUNT_PARAM);
+    ScanResult<Map.Entry<String, String>> result;
+
+    List<Map.Entry<String, String>> allEntries = new ArrayList<>();
+    String cursor = "0";
 
-    result = jedis.hscan("a", "100");
+    result = jedis.hscan("colors", cursor, scanParams);
+    allEntries.addAll(result.getResult());
 
-    assertThat(result.getResult()).hasSize(10);
-    assertThat(new HashSet<>(result.getResult())).isEqualTo(entryMap.entrySet());
+    List<Map.Entry<String, String>> allDistinctEntries =
+        allEntries
+            .stream()
+            .distinct()
+            .collect(Collectors.toList());
+
+    assertThat(allDistinctEntries.size()).isEqualTo(COUNT_PARAM);

Review comment:
   I like this test. Thanks for thinking of it!  Because we are making sure 
that we return the same number of entries as the `COUNT_PARAM`, it doesn't make 
sense to take out duplicates (we should return the same number of entries as 
the `COUNT_PARAM`, including duplicates). In that case, lines 90-96 could be 
removed and we could just assert the following:
   `assertThat(result.getResult().size()).isEqualTo(COUNT_PARAM)`






[jira] [Commented] (GEODE-8864) finish implementation of Redis HScan Command



[ 
https://issues.apache.org/jira/browse/GEODE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295331#comment-17295331
 ] 

ASF GitHub Bot commented on GEODE-8864:
---

sabbey37 commented on a change in pull request #5954:
URL: https://github.com/apache/geode/pull/5954#discussion_r587553014



##
File path: 
geode-redis/src/main/java/org/apache/geode/redis/internal/executor/CommandFunction.java
##
@@ -257,11 +258,13 @@ protected Object compute(ByteArrayWrapper key, Object[] args) {
       case HSCAN: {
         Pattern pattern = (Pattern) args[1];
         int count = (int) args[2];
-        BigInteger cursor = (BigInteger) args[3];
-        return hashCommands.hscan(key, pattern, count, cursor);
+        int cursor = Integer.valueOf(args[3].toString());

Review comment:
   `args` is an array of Objects (`Object[]`), not strings.  `args[3]` can 
be cast in the same way that we cast `args[2]` for the `count` variable on the 
line above it and throughout the `CommandFunction` class:
   ```
   int cursor = (int) args[3];
   ```







> finish implementation of Redis HScan Command
> 
>
> Key: GEODE-8864
> URL: https://issues.apache.org/jira/browse/GEODE-8864
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: John Hutchison
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-8877) Configurable membership bind address



[ 
https://issues.apache.org/jira/browse/GEODE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295321#comment-17295321
 ] 

ASF GitHub Bot commented on GEODE-8877:
---

alb3rtobr commented on a change in pull request #5970:
URL: https://github.com/apache/geode/pull/5970#discussion_r587539499



##
File path: 
geode-core/src/main/java/org/apache/geode/distributed/internal/direct/DirectChannel.java
##
@@ -142,7 +143,23 @@ public DirectChannel(Membership 
mgr,
   props.setProperty("membership_port_range_start", "" + range[0]);
   props.setProperty("membership_port_range_end", "" + range[1]);
 
-  this.conduit = new TCPConduit(mgr, port, address, isBindAddress, this, bufferPool, props);
+  InetAddress conduitAddress = address;
+  if (!dc.getMembershipBindAddress().isEmpty()) {
+try {
+  if (dc.getMembershipBindAddress().equals("*")) {

Review comment:
   I received some feedback from @Bill and he thinks that changing the 
default value introduces a security hole, so I have closed the PR where I was 
testing that.







> Configurable membership bind address
> 
>
> Key: GEODE-8877
> URL: https://issues.apache.org/jira/browse/GEODE-8877
> Project: Geode
>  Issue Type: Improvement
>Reporter: Alberto Bustamante Reyes
>Assignee: Alberto Bustamante Reyes
>Priority: Major
>  Labels: pull-request-available
>
> Geode binds the locator and server traffic port by default to 0.0.0.0, but 
> the membership ports are bound to the local address.
> There is a use case that needs this binding to be configurable ([link to the 
> conversation in the dev list|http://markmail.org/thread/7dwtygtgfcitboy3]):
> We would like to use Istio with Geode. For that, a sidecar container (Envoy) 
> has to be added in each Geode pod. That sidecar container intercepts and 
> handles all incoming and outgoing traffic for that pod. One of the 
> requirements set by Istio towards applications trying to integrate with it is 
> that the application listening ports need to be bound to either localhost or 
> 0.0.0.0 address (which listens on all interfaces).
>  
> Geode binds the locator and server traffic port by default to 0.0.0.0, but 
> the membership ports are bound to the pod IP.
>  And with Envoy listening on the pod IP for incoming traffic and proxying 
> everything towards localhost, applications binding to pod IPs won't receive 
> any traffic.
>  We have tried using the "bind-address" parameter, but that doesn't work for 
> our case. Geode binds the listening ports to the configured address, but it 
> also shares that same address to other members in the system as the address 
> to be used to reach it. If we configure that address to localhost, it just 
> won't work.
>  
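The distinction the description draws between the wildcard address and a specific address can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `BindAddressSketch` and `resolve` are hypothetical names, not Geode APIs, and the treatment of "*" as "all interfaces" is inferred from the `equals("*")` check in the diff under review.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; it is not part of Geode.
final class BindAddressSketch {

  // Assumption: "*" requests the wildcard address (0.0.0.0), and an empty
  // value keeps whatever address the caller would have used anyway.
  static InetAddress resolve(String configured, InetAddress fallback) {
    if (configured == null || configured.isEmpty()) {
      return fallback;
    }
    if (configured.equals("*")) {
      // InetSocketAddress(int) is documented to use the wildcard address.
      return new InetSocketAddress(0).getAddress();
    }
    try {
      return InetAddress.getByName(configured);
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Cannot resolve bind address: " + configured, e);
    }
  }

  public static void main(String[] args) {
    InetAddress local = InetAddress.getLoopbackAddress();
    // Print whether "*" resolved to the wildcard address.
    System.out.println(resolve("*", local).isAnyLocalAddress());
    // Print whether an empty setting fell back to the supplied address.
    System.out.println(resolve("", local).equals(local));
  }
}
```

The sketch only decides which address to listen on. The address a member advertises to its peers is a separate concern, which is the reason the description gives for "bind-address" not fitting this use case.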



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (GEODE-7309) Upgrade Lucene from 6.6.2 to 8.2.0



[ 
https://issues.apache.org/jira/browse/GEODE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295258#comment-17295258
 ] 

ASF GitHub Bot commented on GEODE-7309:
---

mkevo commented on a change in pull request #6076:
URL: https://github.com/apache/geode/pull/6076#discussion_r587439661



##
File path: 
geode-lucene/src/upgradeTest/java/org/apache/geode/cache/lucene/LuceneQueryWithDifferentVersions.java
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.cache.lucene;
+
+import static org.apache.geode.test.awaitility.GeodeAwaitility.await;
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.assertj.core.api.Assertions.catchThrowable;
+import static org.junit.Assert.assertTrue;
+
+import org.junit.Test;
+
+import org.apache.geode.cache.RegionShortcut;
+import org.apache.geode.distributed.internal.InternalLocator;
+import org.apache.geode.internal.AvailablePortHelper;
+import org.apache.geode.test.dunit.DistributedTestUtils;
+import org.apache.geode.test.dunit.Host;
+import org.apache.geode.test.dunit.NetworkUtils;
+import org.apache.geode.test.dunit.VM;
+
+public class LuceneQueryWithDifferentVersions
+    extends LuceneSearchWithRollingUpgradeDUnit {
+
+  // 2 locator, 2 servers
+  @Test
+  public void luceneQueryCannotBeExecuted()
+      throws Exception {
+    final Host host = Host.getHost(0);
+    VM locator1 = host.getVM(oldVersion, 0);
+    VM locator2 = host.getVM(oldVersion, 1);
+    VM server1 = host.getVM(oldVersion, 2);
+    VM server2 = host.getVM(oldVersion, 3);
+
+    final String regionName = "aRegion";
+    RegionShortcut shortcut = RegionShortcut.PARTITION_REDUNDANT;
+    String regionType = "partitionedRedundant";
+
+    int[] locatorPorts = AvailablePortHelper.getRandomAvailableTCPPorts(2);
+    locator1.invoke(() -> DistributedTestUtils.deleteLocatorStateFile(locatorPorts));
+    locator2.invoke(() -> DistributedTestUtils.deleteLocatorStateFile(locatorPorts));
+
+    String hostName = NetworkUtils.getServerHostName(host);
+    String locatorString = getLocatorString(locatorPorts);
+    try {
+      locator1.invoke(
+          invokeStartLocator(hostName, locatorPorts[0], getLocatorPropertiesPre91(locatorString)));
+      locator2.invoke(
+          invokeStartLocator(hostName, locatorPorts[1], getLocatorPropertiesPre91(locatorString)));
+      invokeRunnableInVMs(invokeCreateCache(getSystemProperties(locatorPorts)), server1, server2);
+
+      // Locators before 1.4 handled configuration asynchronously.
+      // We must wait for configuration configuration to be ready, or confirm that it is disabled.
+      locator1.invoke(
+          () -> await()

Review comment:
   Thanks!







> Upgrade Lucene from 6.6.2 to 8.2.0
> --
>
> Key: GEODE-7309
> URL: https://issues.apache.org/jira/browse/GEODE-7309
> Project: Geode
>  Issue Type: Sub-task
>Reporter: Mario Kevo
>Assignee: Mario Kevo
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>





[jira] [Commented] (GEODE-7309) Upgrade Lucene from 6.6.2 to 8.2.0



[ 
https://issues.apache.org/jira/browse/GEODE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295255#comment-17295255
 ] 

ASF GitHub Bot commented on GEODE-7309:
---

mkevo commented on a change in pull request #6076:
URL: https://github.com/apache/geode/pull/6076#discussion_r587433351



##
File path: geode-docs/tools_modules/lucene_integration.html.md.erb
##
@@ -334,6 +334,7 @@ Lucene indexes created for all members.
 # Requirements and Caveats
 
 - Join queries between regions are not supported.
+- Executing queries is possible after all members are at the same version.

Review comment:
   Yes, you are right! Thanks!






[jira] [Updated] (GEODE-9000) NPE During Reconnect After Network Split



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Ramos updated GEODE-9000:
--
Labels:   (was: blocks-1.14.0)

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>
> During a full network split, when all members are shut down by a partition, 
> one of the servers continually fails to reconnect due to a 
> {{NullPointerException}}. When using persistent regions, this also prevents 
> the remaining members from starting up correctly, as they might be waiting 
> for the stuck member to recover the latest data.
> The issue was introduced by the fix for GEODE-8901: the new 
> implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} does not 
> have a {{currentView}} installed during the reconnect phase ({{getView() == 
> null}}), and the following is shown in the logs:
> {noformat}
> [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected exception while booting membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2315)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1239)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$forceDisconnect$0(GMSMembership.java:1951)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> [error 2021/03/04 03:32:02.747 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected problem starting up membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at

[jira] [Updated] (GEODE-9000) NPE During Reconnect After Network Split



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Ramos updated GEODE-9000:
--
Priority: Blocker  (was: Major)

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Blocker
>

[jira] [Updated] (GEODE-9000) NPE During Reconnect After Network Split



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Ramos updated GEODE-9000:
--
Labels: blocks-1.14.0  (was: )

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>  Labels: blocks-1.14.0
>

[jira] [Updated] (GEODE-9000) NPE During Reconnect After Network Split



 [ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Ramos updated GEODE-9000:
--
Priority: Major  (was: Blocker)

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>

[jira] [Created] (GEODE-9000) NPE During Reconnect After Network Split

Juan Ramos created GEODE-9000:
-

 Summary: NPE During Reconnect After Network Split
 Key: GEODE-9000
 URL: https://issues.apache.org/jira/browse/GEODE-9000
 Project: Geode
  Issue Type: Bug
  Components: membership
Affects Versions: 1.14.0
Reporter: Juan Ramos


During a full network split, when all members are shut down by a partition, one 
of the servers continually fails to reconnect due to a 
{{NullPointerException}}. When using persistent regions, this also prevents the 
remaining members from starting up correctly, as they might be waiting for the 
stuck member to recover the latest data.
The issue was introduced by the fix for GEODE-8901: the new 
implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} does not have 
a {{currentView}} installed during the reconnect phase ({{getView() == null}}), 
and the following is shown in the logs:

{noformat}
[fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
tid=0x8a] Unexpected exception while booting membership services
java.lang.NullPointerException
at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
at 
org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
at 
org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
at 
org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2315)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1239)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$forceDisconnect$0(GMSMembership.java:1951)
at java.base/java.lang.Thread.run(Thread.java:834)

[error 2021/03/04 03:32:02.747 GMT gemfire-cluster-server-0  
tid=0x8a] Unexpected problem starting up membership services
java.lang.NullPointerException
at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
at 
org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
at 
org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
at 
org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
at
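
The failure mode described in this report, a message handler running before any membership view has been installed, can be illustrated with a small standalone sketch. All names below (`ViewGuardSketch`, `View`, `installView`) are hypothetical stand-ins rather than Geode classes, and the early return is only one possible shape for a guard; the actual fix in Geode may look quite different.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a membership component; not Geode code.
final class ViewGuardSketch {

  // Minimal stand-in for a membership view: just a list of member names.
  static final class View {
    final List<String> members;

    View(List<String> members) {
      this.members = members;
    }
  }

  // Stays null until a view is installed, e.g. while a reconnect is in progress.
  private volatile View currentView;

  void installView(View view) {
    currentView = view;
  }

  // Returns true only if a view is installed and the sender belongs to it.
  // Reading the field once into a local avoids dereferencing a null view.
  boolean processNetworkPartitionMessage(String sender) {
    View view = currentView;
    if (view == null) {
      return false; // no view yet: ignore the message instead of throwing
    }
    return view.members.contains(sender);
  }

  public static void main(String[] args) {
    ViewGuardSketch sketch = new ViewGuardSketch();
    // Before a view is installed the message is ignored rather than failing.
    System.out.println(sketch.processNetworkPartitionMessage("server-0"));
    sketch.installView(new View(Arrays.asList("server-0", "server-1")));
    System.out.println(sketch.processNetworkPartitionMessage("server-0"));
  }
}
```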

[jira] [Commented] (GEODE-8877) Configurable membership bind address



[ 
https://issues.apache.org/jira/browse/GEODE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295133#comment-17295133
 ] 

ASF GitHub Bot commented on GEODE-8877:
---

alb3rtobr closed pull request #6066:
URL: https://github.com/apache/geode/pull/6066


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Configurable membership bind address
> 
>
> Key: GEODE-8877
> URL: https://issues.apache.org/jira/browse/GEODE-8877
> Project: Geode
>  Issue Type: Improvement
>Reporter: Alberto Bustamante Reyes
>Assignee: Alberto Bustamante Reyes
>Priority: Major
>  Labels: pull-request-available
>
> Geode binds the locator and server traffic port by default to 0.0.0.0, but 
> the membership ports are bound to the local address.
> There is a use case that needs this binding to be configurable ([link to the 
> conversation in the dev list|http://markmail.org/thread/7dwtygtgfcitboy3]):
> We would like to use Istio with Geode. For that, a sidecar container (Envoy) 
> has to be added in each Geode pod. That sidecar container intercepts and 
> handles all incoming and outgoing traffic for that pod. One of the 
> requirements set by Istio towards applications trying to integrate with it is 
> that the application listening ports need to be bound to either localhost or 
> 0.0.0.0 address (which listens on all interfaces).
>  
> Geode binds the locator and server traffic port by default to 0.0.0.0, but 
> the membership ports are bound to the pod IP.
>  And with Envoy listening on the pod IP for incoming traffic and proxying 
> everything towards localhost, applications binding to pod IPs won't receive 
> any traffic.
>  We have tried using the "bind-address" parameter, but that doesn't work for 
> our case. Geode binds the listening ports to the configured address, but it 
> also shares that same address to other members in the system as the address 
> to be used to reach it. If we configure that address to localhost, it just 
> won't work.
>  
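
The distinction the description draws, between the address a port listens on and the address given to other members, can be sketched as follows. This is only an illustration under stated assumptions: the Endpoint class, the method names, the port 7800 and the address 10.1.2.3 are hypothetical and are not Geode configuration or API.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

final class MembershipBindSketch {
    // Hypothetical pair: where a port listens versus what peers are told to dial.
    static final class Endpoint {
        final InetSocketAddress listenOn;
        final String advertisedHost;

        Endpoint(InetSocketAddress listenOn, String advertisedHost) {
            this.listenOn = listenOn;
            this.advertisedHost = advertisedHost;
        }
    }

    // Listens on the wildcard address (0.0.0.0) while advertising a separate host.
    static Endpoint wildcardBind(String advertisedHost, int port) {
        return new Endpoint(new InetSocketAddress(port), advertisedHost);
    }

    // Listens on loopback and advertises loopback too, which remote peers cannot reach.
    static Endpoint loopbackOnly(int port) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        return new Endpoint(new InetSocketAddress(loopback, port), loopback.getHostAddress());
    }

    public static void main(String[] args) {
        Endpoint viaSidecar = wildcardBind("10.1.2.3", 7800);
        System.out.println("listen on " + viaSidecar.listenOn
            + ", advertise " + viaSidecar.advertisedHost);
        Endpoint local = loopbackOnly(7800);
        System.out.println("listen on " + local.listenOn
            + ", advertise " + local.advertisedHost);
    }
}
```

The second case mirrors the problem reported with "bind-address": when one setting drives both values, choosing localhost also advertises localhost.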



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (GEODE-8933) Report max memory setting in Geode Redis statistics



 [ 
https://issues.apache.org/jira/browse/GEODE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8933:

Fix Version/s: 1.15.0

> Report max memory setting in Geode Redis statistics
> ---
>
> Key: GEODE-8933
> URL: https://issues.apache.org/jira/browse/GEODE-8933
> Project: Geode
>  Issue Type: New Feature
>  Components: redis
>Reporter: Raymond Ingles
>Assignee: Raymond Ingles
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> In order to implement eviction for Redis data, the INFO command will need to 
> report at least one new field, "maxmemory".




[jira] [Updated] (GEODE-8865) Create additional dunit and integration tests for Redis HMGET



 [ 
https://issues.apache.org/jira/browse/GEODE-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8865:

Fix Version/s: 1.15.0

> Create additional dunit and integration tests for Redis HMGET
> -
>
> Key: GEODE-8865
> URL: https://issues.apache.org/jira/browse/GEODE-8865
> Project: Geode
>  Issue Type: Test
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> Write integration tests for the following command:
>  * HMGET
> *A.C.*
>  * NativeRedisAcceptanceTest file present to run our tests against native 
> Redis
>  * Tests are passing, _or_
>  * Stories in the backlog to fix the identified issues (with JIRA tickets) 
> and problem tests ignored




[jira] [Updated] (GEODE-8894) Allow individual deltas to trigger bucket size recalculation



 [ 
https://issues.apache.org/jira/browse/GEODE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8894:

Fix Version/s: 1.15.0

> Allow individual deltas to trigger bucket size recalculation
> 
>
> Key: GEODE-8894
> URL: https://issues.apache.org/jira/browse/GEODE-8894
> Project: Geode
>  Issue Type: New Feature
>  Components: core, serialization
>Affects Versions: 1.14.0
>Reporter: Raymond Ingles
>Assignee: Raymond Ingles
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> The Redis subsystem uses Deltas heavily, but by default deltas do not trigger 
> an update to the size of their buckets. This leads to incorrect memory usage 
> accounting over the long term, especially with the use of Redis commands like 
> "APPEND".
> It is possible to set the system property "DELTAS_RECALCULATE_SIZE", but this 
> is a global value that would affect the processing of all deltas, including 
> non-Redis operations.
> Instead, we will add a new default method to the Delta interface, that can be 
> overridden by individual Delta implementations (such as Redis). This will 
> trigger the same behavior as DELTAS_RECALCULATE_SIZE, but on a per-delta 
> basis. Thus, other Geode operations will not force bucket size recalculations 
> unless the global property is set, but Redis statistics will be correct.
> Other types of delta operations may find this useful in the future.
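
A minimal sketch of the per-delta approach described above, assuming a simplified model. SketchDelta, Bucket and every method name here are hypothetical stand-ins, not Geode's actual Delta interface or bucket implementation; the boolean field stands in for the global DELTAS_RECALCULATE_SIZE property.

```java
final class PerDeltaRecalculation {
    // Hypothetical stand-in for a delta interface; not Geode's actual API.
    interface SketchDelta {
        int sizeChange();

        // Default: do not force a bucket size recalculation.
        default boolean forceRecalculateSize() {
            return false;
        }
    }

    // Hypothetical bucket that tracks its size in bytes.
    static final class Bucket {
        private final boolean globalRecalculate; // stands in for DELTAS_RECALCULATE_SIZE
        private long trackedBytes;

        Bucket(boolean globalRecalculate, long initialBytes) {
            this.globalRecalculate = globalRecalculate;
            this.trackedBytes = initialBytes;
        }

        // Updates the tracked size only if the global flag or the delta asks for it.
        void apply(SketchDelta delta) {
            if (globalRecalculate || delta.forceRecalculateSize()) {
                trackedBytes += delta.sizeChange();
            }
        }

        long trackedBytes() {
            return trackedBytes;
        }
    }

    // A delta that keeps the default and so never forces recalculation.
    static SketchDelta plainDelta(int bytes) {
        return () -> bytes;
    }

    // A delta that opts in, as a Redis APPEND-style delta might.
    static SketchDelta appendDelta(int bytes) {
        return new SketchDelta() {
            @Override
            public int sizeChange() {
                return bytes;
            }

            @Override
            public boolean forceRecalculateSize() {
                return true;
            }
        };
    }

    public static void main(String[] args) {
        Bucket bucket = new Bucket(false, 100);
        bucket.apply(plainDelta(10));
        System.out.println(bucket.trackedBytes());
        bucket.apply(appendDelta(10));
        System.out.println(bucket.trackedBytes());
    }
}
```

Because the new method has a default body, existing delta implementations compile unchanged and keep the old behavior; only implementations that override it pay for the recalculation.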




[jira] [Updated] (GEODE-8624) Improve INCRBYFLOAT accuracy for very large values



 [ 
https://issues.apache.org/jira/browse/GEODE-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8624:

Fix Version/s: 1.15.0

> Improve INCRBYFLOAT accuracy for very large values
> --
>
> Key: GEODE-8624
> URL: https://issues.apache.org/jira/browse/GEODE-8624
> Project: Geode
>  Issue Type: Improvement
>  Components: redis
>Reporter: Jens Deppe
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> Currently native Redis appears to be able to apply {{INCRBYFLOAT}} on values 
> that are below the max of unsigned long long (18446744073709551615). However, 
> since we're treating numbers as {{double}}s we can lose precision for very 
> large values. For example:
> {noformat}
> set val 18446744073709551614
> incrbyfloat val 1{noformat}
> incorrectly returns {{18446744073709552000}}
> Native Redis produces a correct result.
> We should consider switching to using {{BigInteger}} for all commands which 
> perform calculations: {{INCR, INCRBY, INCRBYFLOAT, HINCRBY, HINCRBYFLOAT}}.
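
The precision loss can be reproduced in plain Java without Redis or Geode. The sketch below is an illustration, not Geode's implementation: the class and method names are hypothetical, and it uses BigDecimal rather than the BigInteger named in the issue, on the assumption that INCRBYFLOAT increments may be fractional.

```java
import java.math.BigDecimal;

final class IncrByFloatPrecision {
    // Adds using double arithmetic; values near 2^64 cannot all be represented.
    static String incrementAsDouble(String value, String increment) {
        double sum = Double.parseDouble(value) + Double.parseDouble(increment);
        // Print the exact value the double holds, not its shortest decimal form.
        return new BigDecimal(sum).toPlainString();
    }

    // Adds using arbitrary-precision decimal arithmetic.
    static String incrementAsBigDecimal(String value, String increment) {
        return new BigDecimal(value).add(new BigDecimal(increment)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(incrementAsDouble("18446744073709551614", "1"));
        System.out.println(incrementAsBigDecimal("18446744073709551614", "1"));
    }
}
```

The BigDecimal path yields 18446744073709551615, while the double path cannot, because adjacent doubles at this magnitude are more than 1 apart.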




[jira] [Updated] (GEODE-8097) ClientClusterManagementServiceTest expects 1 callback but gets 295



 [ 
https://issues.apache.org/jira/browse/GEODE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols updated GEODE-8097:

Fix Version/s: 1.14.0

> ClientClusterManagementServiceTest expects 1 callback but gets 295
> --
>
> Key: GEODE-8097
> URL: https://issues.apache.org/jira/browse/GEODE-8097
> Project: Geode
>  Issue Type: Bug
>  Components: management
>Reporter: Bill Burcham
>Assignee: Jinmei Liao
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> CI test failed here: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/WindowsUnitTestOpenJDK8/builds/138#A
> {code}
> org.apache.geode.management.internal.ClientClusterManagementServiceTest > 
> getOperationCallsSubmitMessageAndReturnsFuture FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.management.internal.ClientClusterManagementServiceTest 
> clusterManagementServiceTransport.submitMessageForGetOperation(
> 
> same(org.apache.geode.management.operation.RebalanceOperation@6d6b2db0),
> same("opId")
> );
> Wanted 1 time:
> -> at 
> org.apache.geode.management.internal.ClientClusterManagementServiceTest.lambda$getOperationCallsSubmitMessageAndReturnsFuture$1(ClientClusterManagementServiceTest.java:170)
> But was 295 times:
> -> at 
> org.apache.geode.management.internal.ClientClusterManagementService.get(ClientClusterManagementService.java:114)
> -> at 
> org.apache.geode.management.internal.ClientClusterManagementService.get(ClientClusterManagementService.java:114)
> -> at 
> org.apache.geode.management.internal.ClientClusterManagementService.get(ClientClusterManagementService.java:114)
> …
> {code}
> Ran it 100 times in IntelliJ with no failures.




[jira] [Resolved] (GEODE-8671) Two threads calling get and retrieve the same PdxInstance, resulting in corruption



 [ 
https://issues.apache.org/jira/browse/GEODE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols resolved GEODE-8671.
-
Fix Version/s: 1.15.0, 1.14.0, 1.13.2
Resolution: Fixed

> Two threads calling get and retrieve the same PdxInstance, resulting in 
> corruption
> --
>
> Key: GEODE-8671
> URL: https://issues.apache.org/jira/browse/GEODE-8671
> Project: Geode
>  Issue Type: Improvement
>  Components: regions
>Reporter: Dan Smith
>Assignee: Jianxia Chen
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.13.2, 1.14.0, 1.15.0
>
>
> Even if copy-on-read is set to true, two threads calling get on a partitioned 
> region can end up with the same PdxInstance object.
> This is problematic because some PdxInstance methods are not thread safe. 
> Although the underlying bytes are immutable, the PdxInstance has a 
> ByteSource with a position field that changes. That means two threads doing 
> serialization or calling toString on the PdxInstance could result in one or 
> more threads getting a corrupt read.
> It looks like they are ending up with the same instance because of the 
> behavior in LocalRegion.optimizedGetObject. We use futures to make sure there 
> is only 1 get that goes through, and both threads receive the same value.
> 
> Ending up in optimizedGetObject requires a race with the put, because if the 
> value was in the cache at the beginning of the get it would be returned 
> earlier in the get process.
> I put a test that reproduces this issue here -   
> https://github.com/upthewaterspout/geode/pull/new/feature/pdx-instances-shared
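
The hazard of a shared position field over immutable bytes can be shown with java.nio.ByteBuffer as an analogy. This sketch does not use PdxInstance or ByteSource, and it interleaves the two readers deterministically in one thread instead of racing real threads; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SharedCursorSketch {
    // Alternates single-byte reads between two readers and returns what each saw.
    static String interleave(ByteBuffer readerA, ByteBuffer readerB, int steps) {
        StringBuilder seenByA = new StringBuilder();
        StringBuilder seenByB = new StringBuilder();
        for (int i = 0; i < steps; i++) {
            seenByA.append((char) readerA.get());
            seenByB.append((char) readerB.get());
        }
        return seenByA + "|" + seenByB;
    }

    // Both readers use the same buffer, so they advance one shared position.
    static String sharedView(byte[] bytes) {
        ByteBuffer shared = ByteBuffer.wrap(bytes);
        return interleave(shared, shared, bytes.length / 2);
    }

    // Each reader gets its own position over the same underlying bytes.
    static String separateViews(byte[] bytes) {
        ByteBuffer base = ByteBuffer.wrap(bytes);
        return interleave(base.duplicate(), base.duplicate(), bytes.length);
    }

    public static void main(String[] args) {
        byte[] bytes = "abcd".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sharedView(bytes));    // ac|bd
        System.out.println(separateViews(bytes)); // abcd|abcd
    }
}
```

With a shared position each reader sees only a scrambled half of the data, which is the kind of corrupt read the description warns about; giving each caller its own view avoids it without copying the bytes.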




[jira] [Resolved] (GEODE-8958) Tombstone expiration: in the event that a version timestamp is too far in the future, expire the tombstone



 [ 
https://issues.apache.org/jira/browse/GEODE-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen Nichols resolved GEODE-8958.
-
Fix Version/s: 1.15.0, 1.14.0, 1.13.2, 1.12.2
Resolution: Fixed

> Tombstone expiration: in the event that a version timestamp is too far in the 
> future, expire the tombstone
> --
>
> Key: GEODE-8958
> URL: https://issues.apache.org/jira/browse/GEODE-8958
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: blocks-1.14.0, pull-request-available
> Fix For: 1.12.2, 1.13.2, 1.14.0, 1.15.0
>
>
> We are seeing a bug where for some reason, the version timestamp on the 
> tombstone is too far into the future to be realistic.
>  
> In such a case, we are going to expire the tombstone as soon as we see it.
>  
>  
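
One way to read the rule above is sketched below. It is an assumption-laden illustration, not the actual Geode change: the method name, the millisecond units, the timeout and the "too far in the future" threshold are all hypothetical, since the issue does not state them.

```java
final class TombstoneExpirySketch {
    // Hypothetical rule: expire once the timeout has elapsed, or immediately
    // when the version timestamp is implausibly far ahead of the local clock.
    static boolean shouldExpire(long versionTimestampMs, long nowMs,
                                long timeoutMs, long maxFutureSkewMs) {
        if (versionTimestampMs - nowMs > maxFutureSkewMs) {
            return true; // timestamp is not realistic, do not wait for it
        }
        return nowMs - versionTimestampMs >= timeoutMs;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        long timeout = 600_000L;  // assumed tombstone timeout
        long maxSkew = 60_000L;   // assumed tolerance for clock differences
        System.out.println(shouldExpire(now + 3_600_000L, now, timeout, maxSkew));
        System.out.println(shouldExpire(now - 100_000L, now, timeout, maxSkew));
    }
}
```

Without the first check, a timestamp far in the future would fail the age test until the clock caught up, so the tombstone would presumably linger for that whole period.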




[jira] [Resolved] (GEODE-8996) Rebalance GFSH commands and restore redundancy commands were not backward compatible