[jira] [Resolved] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called

2022-05-17 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10306.
-
Fix Version/s: 1.16.0
   Resolution: Fixed

Moved the acceptor close up so that it happens earlier in stop().

> CacheServerImpl should stop the acceptor immediately after stop is called
> -
>
> Key: GEODE-10306
> URL: https://issues.apache.org/jira/browse/GEODE-10306
> Project: Geode
>  Issue Type: Bug
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, after cache server stop is called, it takes a while for the 
> acceptor to stop taking new data, which can be a problem because the bigger 
> the window of time, the greater the risk of data loss. 
>  
> {noformat}
> public synchronized void stop() {
>   if (!isRunning()) {
>     return;
>   }
>   RuntimeException firstException = null;
>   try {
>     if (loadMonitor != null) {
>       loadMonitor.stop();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing load monitor", e);
>     firstException = e;
>   }
>   try {
>     if (advisor != null) {
>       advisor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing advisor", e);
>     firstException = e;
>   }
> PROBLEM ->  try {
>     if (acceptor != null) {
>       acceptor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing acceptor monitor", e);
>     if (firstException != null) {
>       firstException = e;
>     }
>   }
> {noformat}
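
The excerpt above closes the acceptor only after the load monitor and the advisor. A minimal sketch of the reordering, assuming the fix simply closes the acceptor first, could look like the following. `StopOrderSketch` and its `Runnable` fields are hypothetical stand-ins, not the actual `CacheServerImpl` code, and keeping and rethrowing the first exception is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a cache server; not the actual Geode class.
final class StopOrderSketch {
  // Records the order in which the components were shut down.
  private final List<String> closeOrder = new ArrayList<>();
  private final Runnable acceptorClose;
  private final Runnable loadMonitorStop;
  private final Runnable advisorClose;
  private boolean running = true;

  StopOrderSketch(boolean acceptorFails) {
    acceptorClose = () -> {
      closeOrder.add("acceptor");
      if (acceptorFails) {
        throw new IllegalStateException("acceptor close failed");
      }
    };
    loadMonitorStop = () -> closeOrder.add("loadMonitor");
    advisorClose = () -> closeOrder.add("advisor");
  }

  synchronized void stop() {
    if (!running) {
      return;
    }
    running = false;
    RuntimeException firstException = null;
    // The acceptor goes first so that no new client data is accepted
    // while the remaining components shut down.
    for (Runnable step : Arrays.asList(acceptorClose, loadMonitorStop, advisorClose)) {
      try {
        step.run();
      } catch (RuntimeException e) {
        // Assumption of this sketch: remember only the first failure.
        if (firstException == null) {
          firstException = e;
        }
      }
    }
    if (firstException != null) {
      throw firstException;
    }
  }

  List<String> closeOrder() {
    return closeOrder;
  }

  public static void main(String[] args) {
    StopOrderSketch server = new StopOrderSketch(false);
    server.stop();
    System.out.println(server.closeOrder());
  }
}
```

Even when the acceptor close throws, the remaining steps still run in this sketch, so a failure early in shutdown does not leave the other components open.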



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called

2022-05-13 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-10306:
---

Assignee: Mark Hanson

> CacheServerImpl should stop the acceptor immediately after stop is called
> -
>
> Key: GEODE-10306
> URL: https://issues.apache.org/jira/browse/GEODE-10306
> Project: Geode
>  Issue Type: Bug
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>
> Currently, after cache server stop is called, it takes a while for the 
> acceptor to stop taking new data, which can be a problem because the bigger 
> the window of time, the greater the risk of data loss. 
>  
> {noformat}
> public synchronized void stop() {
>   if (!isRunning()) {
>     return;
>   }
>   RuntimeException firstException = null;
>   try {
>     if (loadMonitor != null) {
>       loadMonitor.stop();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing load monitor", e);
>     firstException = e;
>   }
>   try {
>     if (advisor != null) {
>       advisor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing advisor", e);
>     firstException = e;
>   }
> PROBLEM ->  try {
>     if (acceptor != null) {
>       acceptor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing acceptor monitor", e);
>     if (firstException != null) {
>       firstException = e;
>     }
>   }
> {noformat}





[jira] [Created] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called

2022-05-12 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10306:
---

 Summary: CacheServerImpl should stop the acceptor immediately 
after stop is called
 Key: GEODE-10306
 URL: https://issues.apache.org/jira/browse/GEODE-10306
 Project: Geode
  Issue Type: Bug
Reporter: Mark Hanson


Currently, after cache server stop is called, it takes a while for the acceptor 
to stop taking new data, which can be a problem because the bigger the window 
of time, the greater the risk of data loss. 

 
{noformat}
public synchronized void stop() {
  if (!isRunning()) {
    return;
  }

  RuntimeException firstException = null;

  try {
    if (loadMonitor != null) {
      loadMonitor.stop();
    }
  } catch (RuntimeException e) {
    logger.warn("CacheServer - Error closing load monitor", e);
    firstException = e;
  }

  try {
    if (advisor != null) {
      advisor.close();
    }
  } catch (RuntimeException e) {
    logger.warn("CacheServer - Error closing advisor", e);
    firstException = e;
  }

PROBLEM ->  try {
    if (acceptor != null) {
      acceptor.close();
    }
  } catch (RuntimeException e) {
    logger.warn("CacheServer - Error closing acceptor monitor", e);
    if (firstException != null) {
      firstException = e;
    }
  }
{noformat}





[jira] [Updated] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.

2022-04-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-10265:

Description: 
This test uses a static cache.xml that contains a hardcoded server port. As a 
result, a second instance of the test started in parallel fails with a bind 
error because the port is already in use. We should consider generating the 
file rather than using a static one.

 

Stress-new-test failure.
[https://concourse.apachegeode-ci.info/builds/48751343]

 

This issue was discovered as part of the stress-new-test of GEODE-10228's PR
{noformat}


The Problem >  < The Problem





org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil$ControlListener


 {noformat}
 
{noformat}
DurableClientSimpleDUnitTest > 
testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
    org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple 
Failures (2 failures)
    org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call
 in VM 0 running on Host 
heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal 
with 4 VMs
    java.lang.AssertionError: Suspicious strings were written to the log 
during this run.
    Fix the strings or use IgnoredException.addIgnoredException to ignore.
    ---
    Found suspect string in 'dunit_suspect-vm0.log' at line 450


    [error 2022/04/28 00:39:54.901 UTC  
tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = 
false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = 
true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
    org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=10188 client subscription config policy=entry client 
subscription config capacity=1000 client subscription config overflow 
directory=.
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801)
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600)
    at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at 
org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253)
    at 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at 

[jira] [Updated] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.

2022-04-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-10265:

Description: 
This test uses a static cache.xml that contains a hardcoded server port. As a 
result, a second instance of the test started in parallel fails with a bind 
error because the port is already in use. We should consider generating the 
file rather than using a static one.

 

Stress-new-test failure.
[https://concourse.apachegeode-ci.info/builds/48751343]

 

This issue was discovered as part of the stress-new-test of GEODE-10228's PR
{noformat}


The Problem >  < The Problem





org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil$ControlListener


 {noformat}
 
{noformat}
DurableClientSimpleDUnitTest > 
testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
    org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple 
Failures (2 failures)
    org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call
 in VM 0 running on Host 
heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal 
with 4 VMs
    java.lang.AssertionError: Suspicious strings were written to the log 
during this run.
    Fix the strings or use IgnoredException.addIgnoredException to ignore.
    ---
    Found suspect string in 'dunit_suspect-vm0.log' at line 450


    [error 2022/04/28 00:39:54.901 UTC  
tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = 
false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = 
true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
    org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=10188 client subscription config policy=entry client 
subscription config capacity=1000 client subscription config overflow 
directory=.
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801)
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600)
    at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at 
org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253)
    at 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at 

[jira] [Updated] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.

2022-04-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-10265:

Description: 
This test uses a static cache.xml that contains a hardcoded server port. As a 
result, a second instance of the test started in parallel fails with a bind 
error because the port is already in use.

 

Stress-new-test failure.
[https://concourse.apachegeode-ci.info/builds/48751343]

 

This issue was discovered as part of the stress-new-test of GEODE-10228's PR

 
{noformat}
DurableClientSimpleDUnitTest > 
testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
    org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple 
Failures (2 failures)
    org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call
 in VM 0 running on Host 
heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal 
with 4 VMs
    java.lang.AssertionError: Suspicious strings were written to the log 
during this run.
    Fix the strings or use IgnoredException.addIgnoredException to ignore.
    ---
    Found suspect string in 'dunit_suspect-vm0.log' at line 450


    [error 2022/04/28 00:39:54.901 UTC  
tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = 
false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = 
true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
    org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=10188 client subscription config policy=entry client 
subscription config capacity=1000 client subscription config overflow 
directory=.
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801)
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600)
    at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at 
org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253)
    at 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at 

[jira] [Created] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.

2022-04-28 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10265:
---

 Summary: 
DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML
 cannot be run in parallel with itself.
 Key: GEODE-10265
 URL: https://issues.apache.org/jira/browse/GEODE-10265
 Project: Geode
  Issue Type: Bug
  Components: tests
Reporter: Mark Hanson


This test uses a static cache.xml that contains a hardcoded server port. As a 
result, a second instance of the test started in parallel fails with a bind 
error because the port is already in use.
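
The bind failure described above can be reproduced with the JDK alone. The sketch below is only an illustration and is not taken from the Geode test. `PortConflictSketch` and `secondBindFails` are hypothetical names, and an OS-assigned port stands in for the hardcoded one.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical demo class; it does not use any Geode API.
final class PortConflictSketch {
  /** Returns true if binding a second socket to an in-use port fails. */
  static boolean secondBindFails() throws IOException {
    // Port 0 asks the operating system for any free port; this plays the
    // role of the first test instance holding the hardcoded port.
    try (ServerSocket first = new ServerSocket(0)) {
      int port = first.getLocalPort();
      // The second instance tries the same port while the first still owns it.
      try (ServerSocket second = new ServerSocket(port)) {
        return false;
      } catch (BindException e) {
        return true;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("second bind failed: " + secondBindFails());
  }
}
```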

 

Stress-new-test failure.
https://concourse.apachegeode-ci.info/builds/48751343

 
{noformat}
DurableClientSimpleDUnitTest > 
testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
    org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple 
Failures (2 failures)
    org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call
 in VM 0 running on Host 
heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal 
with 4 VMs
    java.lang.AssertionError: Suspicious strings were written to the log 
during this run.
    Fix the strings or use IgnoredException.addIgnoredException to ignore.
    ---
    Found suspect string in 'dunit_suspect-vm0.log' at line 450


    [error 2022/04/28 00:39:54.901 UTC  
tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = 
false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = 
true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
    org.apache.geode.GemFireIOException: While starting cache server 
CacheServer on port=10188 client subscription config policy=entry client 
subscription config capacity=1000 client subscription config overflow 
directory=.
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801)
    at 
org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600)
    at 
org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620)
    at 
org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191)
    at 
org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158)
    at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142)
    at 
org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253)
    at 
org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at 
org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
    at 

[jira] [Commented] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover

2022-04-28 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529520#comment-17529520
 ] 

Mark Hanson commented on GEODE-10228:
-

Tracked down a new issue found during stress-new-test of the PR for GEODE-10228. 
The basic problem is that this test uses a hardcoded port in its cache.xml. 
That means the test cannot be run in parallel with itself, which is what 
stress-new-test was doing. If we want to fix this test (I suggest it should be 
low priority), we should stop using a static cache.xml and shift to a 
dynamically generated one. I am submitting a new bug for that particular 
issue and merging the PR, as the test in question is not new.
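
As a rough illustration of what a dynamically generated cache.xml could look like, the sketch below asks the operating system for a free port and writes a small XML file that uses it. Everything here is an assumption made for illustration: `DynamicCacheXmlSketch` and its methods are hypothetical, and the XML is a simplified fragment, not the cache.xml the test actually uses.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of Geode or its test framework.
final class DynamicCacheXmlSketch {
  /** Asks the operating system for a currently free TCP port. */
  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  /** Builds a simplified, illustrative cache.xml body for the given port. */
  static String cacheXml(int port) {
    return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<cache xmlns=\"http://geode.apache.org/schema/cache\" version=\"1.0\">\n"
        + "  <cache-server port=\"" + port + "\"/>\n"
        + "</cache>\n";
  }

  /** Writes the generated XML to a fresh temporary file and returns its path. */
  static Path writeCacheXml(int port) throws IOException {
    Path file = Files.createTempFile("cache-", ".xml");
    Files.write(file, cacheXml(port).getBytes(StandardCharsets.UTF_8));
    return file;
  }

  public static void main(String[] args) throws IOException {
    int port = findFreePort();
    System.out.println("wrote " + writeCacheXml(port) + " for port " + port);
  }
}
```

Probing for a free port and using it later leaves a small window in which another process can take the port, so this reduces collisions between parallel runs but does not rule them out.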

> CI Failure: DurableClientTestCase > testDurableHAFailover times out in await 
> for failover
> -
>
> Key: GEODE-10228
> URL: https://issues.apache.org/jira/browse/GEODE-10228
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, tests
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> {{testDurableHAFailover}} has a history of flakiness, though the stacks do 
> seem to have changed some since the older versions of the bug were resolved.
> {noformat}
> DurableClientTestCase > testDurableHAFailover FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.internal.IdentifiableRunnable.run in VM 2 running 
> on Host 
> heavy-lifter-7bbf0b58-8bc0-5ca8-840d-7bcf83293b6d.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:435)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.durableFailover(DurableClientTestCase.java:520)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.testDurableHAFailover(DurableClientTestCase.java:439)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase 
> expected: null
>  but was: "0"="0" within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$durableFailover$3f73998b$1(DurableClientTestCase.java:521)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> expected: null
>  but was: "0"="0"
> at 
> sun.reflect.GeneratedConstructorAccessor199.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$null$2(DurableClientTestCase.java:525)
> {noformat}





[jira] [Commented] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover

2022-04-26 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528380#comment-17528380
 ] 

Mark Hanson commented on GEODE-10228:
-

Awaiting PR review approvals at this point. The initial code change is in, 
along with a number of changes addressing reviewer comments.

> CI Failure: DurableClientTestCase > testDurableHAFailover times out in await 
> for failover
> -
>
> Key: GEODE-10228
> URL: https://issues.apache.org/jira/browse/GEODE-10228
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, tests
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> {{testDurableHAFailover}} has a history of flakiness, though the stacks do 
> seem to have changed some since the older versions of the bug were resolved.
> {noformat}
> DurableClientTestCase > testDurableHAFailover FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.internal.IdentifiableRunnable.run in VM 2 running 
> on Host 
> heavy-lifter-7bbf0b58-8bc0-5ca8-840d-7bcf83293b6d.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:435)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.durableFailover(DurableClientTestCase.java:520)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.testDurableHAFailover(DurableClientTestCase.java:439)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase 
> expected: null
>  but was: "0"="0" within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$durableFailover$3f73998b$1(DurableClientTestCase.java:521)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> expected: null
>  but was: "0"="0"
> at 
> sun.reflect.GeneratedConstructorAccessor199.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$null$2(DurableClientTestCase.java:525)
> {noformat}





[jira] [Resolved] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string

2022-04-26 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10248.
-
Fix Version/s: 1.15.0
   Resolution: Fixed

> CI: DeployToMultiGroupDUnitTest encountered suspect string
> --
>
> Key: GEODE-10248
> URL: https://issues.apache.org/jira/browse/GEODE-10248
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> > Task :geode-assembly:distributedTest
> DeployToMultiGroupDUnitTest > executionError FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 571
> 
> $?? 
> ???PK???L?Tk??6??Class1.classPK???L?T{6}?
> ?timestampPK??u?
> ---YMBX204KTK7fmoVc8vVmUZOfJOmATtYGRLlAK
> Content-Disposition: form-data; name="config"
> Content-Type: application/json
> ---
> Found suspect string in 'dunit_suspect-vm0.log' at line 592
> 
> $?? 
> ???PK???L?Tk??6??Class1.classPK???L?T{6}?
> ?timestampPK??u?
> --w3iZZ1eYF3P3Eh2pe2x4sTm2w24zOxfn2XIcRWX1
> Content-Disposition: form-data; name="config"
> Content-Type: application/json
> at org.junit.Assert.fail(Assert.java:89)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:422)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:438)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:183)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141)
> at 
> org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
> at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
> at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
> at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
> at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
> at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
> at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> 

[jira] [Commented] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string

2022-04-26 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528379#comment-17528379
 ] 

Mark Hanson commented on GEODE-10248:
-

The core problem was that the test was logging a string that the suspicious 
string parser rejected. I added a special case for "Management Request: " 
followed by packet data, which is what that log statement outputs.

 

I think we should not be logging like this at the info level.
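
The special case described above might look roughly like the following. This is a minimal sketch, not Geode's actual suspect-string checker: the class name SuspectLineFilter, the marker pattern, and the single-line matching are assumptions made for illustration. Only the "Management Request: " prefix comes from the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the checker used by Geode's DUnit framework.
final class SuspectLineFilter {
  // Assumed set of markers that make a log line suspect (illustrative only).
  private static final Pattern SUSPECT =
      Pattern.compile("\\b(severe|fatal|error)\\b", Pattern.CASE_INSENSITIVE);

  // Prefix of the log statement named in the comment above.
  private static final String MANAGEMENT_REQUEST_PREFIX = "Management Request: ";

  /** Returns true if the line should be reported as suspect. */
  static boolean isSuspect(String line) {
    // Special case: request dumps may carry arbitrary packet data, so skip them.
    if (line.contains(MANAGEMENT_REQUEST_PREFIX)) {
      return false;
    }
    return SUSPECT.matcher(line).find();
  }

  /** Collects every suspect line, preserving the input order. */
  static List<String> findSuspects(List<String> lines) {
    List<String> suspects = new ArrayList<>();
    for (String line : lines) {
      if (isSuspect(line)) {
        suspects.add(line);
      }
    }
    return suspects;
  }
}
```

A payload that spans several log lines would need the exception to cover the continuation lines as well, which this sketch does not attempt.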

> CI: DeployToMultiGroupDUnitTest encountered suspect string
> --
>
> Key: GEODE-10248
> URL: https://issues.apache.org/jira/browse/GEODE-10248
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage, pull-request-available

[jira] [Commented] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string

2022-04-20 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525157#comment-17525157
 ] 

Mark Hanson commented on GEODE-10248:
-

Not sure why this is failing now. This is a log statement that has been in the 
code for a while. There is nothing wrong here. 

 

I have put the log statement behind a property, "geode.management.request.logging", 
and with the new default the log statement goes away.
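
Gating a log statement behind a property can be sketched as below. Only the property name comes from the comment above. The class name ManagementRequestLogging, the choice of a boolean JVM system property, and the off-by-default behavior are assumptions, since the comment does not state how the property is read or what its default is.

```java
import java.util.Optional;

// Hypothetical sketch of gating a log statement behind a system property.
final class ManagementRequestLogging {
  // Property name taken from the comment above.
  static final String PROPERTY = "geode.management.request.logging";

  /** Assumption: the property is a boolean flag that is off unless set to "true". */
  static boolean isEnabled() {
    return Boolean.getBoolean(PROPERTY);
  }

  /** Returns the message that would be logged, or empty when logging is disabled. */
  static Optional<String> messageFor(String request) {
    if (!isEnabled()) {
      return Optional.empty();
    }
    return Optional.of("Management Request: " + request);
  }
}
```

Under these assumptions, starting the JVM with -Dgeode.management.request.logging=true would turn the statement back on.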

 

> CI: DeployToMultiGroupDUnitTest encountered suspect string
> --
>
> Key: GEODE-10248
> URL: https://issues.apache.org/jira/browse/GEODE-10248
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage

[jira] [Assigned] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string

2022-04-20 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-10248:
---

Assignee: Mark Hanson

> CI: DeployToMultiGroupDUnitTest encountered suspect string
> --
>
> Key: GEODE-10248
> URL: https://issues.apache.org/jira/browse/GEODE-10248
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Xiaojian Zhou
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage

[jira] [Assigned] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover

2022-04-12 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-10228:
---

Assignee: Mark Hanson

> CI Failure: DurableClientTestCase > testDurableHAFailover times out in await 
> for failover
> -
>
> Key: GEODE-10228
> URL: https://issues.apache.org/jira/browse/GEODE-10228
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, tests
>Reporter: Kirk Lund
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
> Fix For: 1.15.0
>
>
> {{testDurableHAFailover}} has a history of flakiness, though the stacks do 
> seem to have changed some since the older versions of the bug were resolved.
> {noformat}
> DurableClientTestCase > testDurableHAFailover FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.internal.IdentifiableRunnable.run in VM 2 running 
> on Host 
> heavy-lifter-7bbf0b58-8bc0-5ca8-840d-7bcf83293b6d.c.apachegeode-ci.internal 
> with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:435)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.durableFailover(DurableClientTestCase.java:520)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.testDurableHAFailover(DurableClientTestCase.java:439)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase 
> expected: null
>  but was: "0"="0" within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$durableFailover$3f73998b$1(DurableClientTestCase.java:521)
> Caused by:
> org.opentest4j.AssertionFailedError: 
> expected: null
>  but was: "0"="0"
> at 
> sun.reflect.GeneratedConstructorAccessor199.newInstance(Unknown Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$null$2(DurableClientTestCase.java:525)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-04-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9704.

Fix Version/s: 1.15.0
   Resolution: Fixed

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available
> Fix For: 1.15.0
>
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When a durable client recovers, a queueTimer thread runs the 
> `QueueManagerImpl.recoverPrimary` method, which:
>  - makes a new connection to the server
>  - sends readyForEvents (which causes the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> Because it sends readyForEvents before it clears the region of keys of 
> interest, any events the server sends for those keys in between are cleared 
> as well, so it appears to the user that the client region doesn't have those 
> keys. 
>  
> To reproduce, run the geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient()
> with the InterestResultPolicy changed to NONE; the test then fails 
> occasionally. Adding sleep code in QueueManagerImpl.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` makes the test fail more 
> consistently.
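
The ordering problem described above can be modelled without any Geode classes. The sketch below is a deterministic illustration of the worst-case interleaving. The class RecoveryOrderSketch and its method names are hypothetical, and the second method is shown only as a contrast, not as a description of the fix that was actually committed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the ordering; it does not use any Geode classes.
final class RecoveryOrderSketch {
  /** Order described in the issue: queued events arrive, then keys of interest are cleared. */
  static Map<String, String> readyForEventsThenClear(
      Map<String, String> queuedEvents, Set<String> keysOfInterest) {
    Map<String, String> region = new HashMap<>();
    // The server replays its queue once it receives readyForEvents.
    region.putAll(queuedEvents);
    // Interest recovery then clears the keys of interest, dropping those events.
    region.keySet().removeAll(keysOfInterest);
    return region;
  }

  /** Contrasting order: clear first, then let the queued events arrive. */
  static Map<String, String> clearThenReadyForEvents(
      Map<String, String> queuedEvents, Set<String> keysOfInterest) {
    Map<String, String> region = new HashMap<>();
    // Clearing an empty region removes nothing.
    region.keySet().removeAll(keysOfInterest);
    // Events that arrive afterwards survive.
    region.putAll(queuedEvents);
    return region;
  }
}
```

With a single queued event for a key of interest, the first method returns an empty region and the second keeps the entry, which matches the symptom that the client region appears to be missing keys.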





[jira] [Created] (GEODE-10195) MicrometerBinderTest > processorMetricsBinderExists FAILED

2022-03-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10195:
---

 Summary: MicrometerBinderTest > processorMetricsBinderExists FAILED
 Key: GEODE-10195
 URL: https://issues.apache.org/jira/browse/GEODE-10195
 Project: Geode
  Issue Type: Bug
  Components: core
Reporter: Mark Hanson


windows-acceptance-test-openjdk11 failed with the following error.

 
{noformat}
MicrometerBinderTest > processorMetricsBinderExists FAILED
org.apache.geode.cache.client.ServerOperationException: remote server on 
heavy-lifter-ceacbfa8-6147-51ca-affd-b497cd16e2ef(4420:loner):54545:7074b0d7: 
Function named CheckIfMeterExistsFunction is not registered to FunctionService
at 
org.apache.geode.cache.client.internal.ExecuteFunctionOp$ExecuteFunctionOpImpl.processResponse(ExecuteFunctionOp.java:394)
at 
org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:234)
at 
org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209)
at 
org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394)
at 
org.apache.geode.cache.client.internal.AbstractOpWithTimeout.attempt(AbstractOpWithTimeout.java:45)
at 
org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
at 
org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:358)
at 
org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760)
at 
org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:151)
at 
org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:820)
at 
org.apache.geode.cache.client.internal.ExecuteFunctionOp.execute(ExecuteFunctionOp.java:100)
at 
org.apache.geode.internal.cache.execute.ServerFunctionExecutor.executeOnServer(ServerFunctionExecutor.java:217)
at 
org.apache.geode.internal.cache.execute.ServerFunctionExecutor.executeFunction(ServerFunctionExecutor.java:104)
at 
org.apache.geode.internal.cache.execute.ServerFunctionExecutor.execute(ServerFunctionExecutor.java:368)
at 
org.apache.geode.internal.cache.execute.ServerFunctionExecutor.execute(ServerFunctionExecutor.java:377)
at 
org.apache.geode.metrics.MicrometerBinderTest.processorMetricsBinderExists(MicrometerBinderTest.java:152)
 {noformat}





[jira] [Resolved] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset

2022-03-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10153.
-
Resolution: Duplicate

> Benchmarks: PartitionedPutAllBenchmark:  
> net.schmizz.sshj.transport.TransportException: Connection reset
> 
>
> Key: GEODE-10153
> URL: https://issues.apache.org/jira/browse/GEODE-10153
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> This looks like GEODE-10147
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223]
>  
> {noformat}
> 2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching 
> /usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath 
> .geode-performance/lib/SERVER-2/* 
> -Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 
> -DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 
> -server -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 
> -Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true 
> -Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g 
> -Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly 
> -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark 
> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC 
> -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 
> -Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= 
> org.apache.geode.perftest.jvms.rmi.ChildJVM on 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed.
> 21:26:18net.schmizz.sshj.transport.TransportException: Connection reset
> 21:26:18  at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
> 21:26:18  at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
> 21:26:18  at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
> 21:26:18  at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
> 21:26:18  at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86)
> 21:26:18  at 
> org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68)
> 21:26:18Caused by: java.net.SocketException: Connection reset
> 21:26:18  at java.net.SocketInputStream.read(SocketInputStream.java:210)
> 21:26:18  at java.net.SocketInputStream.read(SocketInputStream.java:141)
> 21:26:18  at java.net.SocketInputStream.read(SocketInputStream.java:224)
> 21:26:18  at 
> net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211)
> 21:26:18  at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
> 21:26:18  ... 5 more
> 21:31:18
> 21:31:18PartitionedPutAllBenchmark > run() FAILED
> 21:31:18java.lang.IllegalStateException: Workers failed to start in 5 
> minute
> 21:31:18at 
> org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133)
> 21:31:18at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97)
> 21:31:18at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65)
> 21:31:18at 
> org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52)
>  {noformat}





[jira] [Resolved] (GEODE-10173) CI failure: P2pPartitionedGetBenchmark > run()

2022-03-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10173.
-
Resolution: Duplicate

> CI failure: P2pPartitionedGetBenchmark > run()
> --
>
> Key: GEODE-10173
> URL: https://issues.apache.org/jira/browse/GEODE-10173
> Project: Geode
>  Issue Type: Bug
>Reporter: Jianxia Chen
>Priority: Major
>  Labels: needsTriage
>
> {code:java}
> org.apache.geode.benchmark.tests.P2pPartitionedGetBenchmark > run() FAILED
> 15:49:10net.schmizz.sshj.transport.TransportException: Connection reset
> 15:49:10at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:181)
> 15:49:10at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:771)
> 15:49:10at 
> net.schmizz.sshj.SocketClient.connect(SocketClient.java:150)
> 15:49:10at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:75)
> 15:49:10at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.copyFromNode(SshInfrastructure.java:186)
> 15:49:10at 
> org.apache.geode.perftest.jvms.RemoteJVMs.copyResults(RemoteJVMs.java:87)
> 15:49:10at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:136)
> 15:49:10at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:68)
> 15:49:10at 
> org.apache.geode.benchmark.tests.P2pPartitionedGetBenchmark.run(P2pPartitionedGetBenchmark.java:44)
> 15:49:10
> 15:49:10Caused by:
> 15:49:10java.net.SocketException: Connection reset
> 15:49:10at 
> java.net.SocketInputStream.read(SocketInputStream.java:210)
> 15:49:10at 
> java.net.SocketInputStream.read(SocketInputStream.java:141)
> 15:49:10at 
> java.net.SocketInputStream.read(SocketInputStream.java:224)
> 15:49:10at 
> net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:198)
> 15:49:10at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:174)
> 15:49:10... 8 more {code}





[jira] [Resolved] (GEODE-10178) CI Failure: PartitionedGetLongBenchmark > run()

2022-03-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10178.
-
Resolution: Duplicate

> CI Failure: PartitionedGetLongBenchmark > run()
> ---
>
> Key: GEODE-10178
> URL: https://issues.apache.org/jira/browse/GEODE-10178
> Project: Geode
>  Issue Type: Bug
>Reporter: Jianxia Chen
>Priority: Major
>  Labels: needsTriage
>
> {code:java}
> PartitionedGetLongBenchmark > run() FAILED
> 01:08:25net.schmizz.sshj.transport.TransportException: Connection reset
> 01:08:25at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
> 01:08:25at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
> 01:08:25at 
> net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
> 01:08:25at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
> 01:08:25at 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.copyFromNode(SshInfrastructure.java:185)
> 01:08:25at 
> org.apache.geode.perftest.jvms.RemoteJVMs.copyResults(RemoteJVMs.java:87)
> 01:08:25at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:112)
> 01:08:25at 
> org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65)
> 01:08:25at 
> org.apache.geode.benchmark.tests.PartitionedGetLongBenchmark.run(PartitionedGetLongBenchmark.java:45)
> 01:08:25
> 01:08:25Caused by:
> 01:08:25java.net.SocketException: Connection reset
> 01:08:25at 
> java.net.SocketInputStream.read(SocketInputStream.java:210)
> 01:08:25at 
> java.net.SocketInputStream.read(SocketInputStream.java:141)
> 01:08:25at 
> java.net.SocketInputStream.read(SocketInputStream.java:224)
> 01:08:25at 
> net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211)
> 01:08:25at 
> net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
> 01:08:25... 8 more {code}





[jira] [Resolved] (GEODE-10154) Benchmarks: PartitionedIndexedQueryBenchmark: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange

2022-03-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-10154.
-
Resolution: Fixed

This was a benchmark job path issue that has been fixed in Concourse.

> Benchmarks: PartitionedIndexedQueryBenchmark: 
> net.schmizz.sshj.transport.TransportException: Server closed connection 
> during identification exchange
> 
>
> Key: GEODE-10154
> URL: https://issues.apache.org/jira/browse/GEODE-10154
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-with-security-manager/builds/224]
> This same framework is reporting an error in GEODE-10153 and GEODE-10147
> {noformat}
> 02:36:47 PartitionedIndexedQueryBenchmark > run() FAILED
> 02:36:47 java.util.concurrent.CompletionException: java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
> 02:36:47    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
> 02:36:47    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
> 02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643)
> 02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1632)
> 02:36:47    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> 02:36:47    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> 02:36:47    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> 02:36:47    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> 02:36:47
> 02:36:47 Caused by:
> 02:36:47 java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
> 02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:176)
> 02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
> 02:36:47    ... 5 more
> 02:36:47
> 02:36:47 Caused by:
> 02:36:47 net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
> 02:36:47    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
> 02:36:47    at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
> 02:36:47    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
> 02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
> 02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:158)
> 02:36:47    ... 6 more
> 02:36:47
> 02:36:47 Caused by:
> 02:36:47 net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
> 02:36:47    at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:214)
> 02:36:47    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
> 02:36:47    ... 10 more {noformat}





[jira] [Assigned] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset

2022-03-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-10153:
---

Assignee: Mark Hanson

> Benchmarks: PartitionedPutAllBenchmark:  
> net.schmizz.sshj.transport.TransportException: Connection reset
> 
>
> Key: GEODE-10153
> URL: https://issues.apache.org/jira/browse/GEODE-10153
> Project: Geode
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> This looks like GEODE-10147
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223]
>  
> {noformat}
> 2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching 
> /usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath 
> .geode-performance/lib/SERVER-2/* 
> -Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 
> -DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 
> -server -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 
> -Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true 
> -Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g 
> -Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly 
> -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark 
> -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC 
> -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 
> -Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= 
> org.apache.geode.perftest.jvms.rmi.ChildJVM on 
> org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed.
> 21:26:18 net.schmizz.sshj.transport.TransportException: Connection reset
> 21:26:18    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
> 21:26:18    at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
> 21:26:18    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
> 21:26:18    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
> 21:26:18    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86)
> 21:26:18    at org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68)
> 21:26:18 Caused by: java.net.SocketException: Connection reset
> 21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:210)
> 21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:141)
> 21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:224)
> 21:26:18    at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211)
> 21:26:18    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
> 21:26:18    ... 5 more
> 21:31:18
> 21:31:18 PartitionedPutAllBenchmark > run() FAILED
> 21:31:18 java.lang.IllegalStateException: Workers failed to start in 5 minute
> 21:31:18    at org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133)
> 21:31:18    at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97)
> 21:31:18    at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65)
> 21:31:18    at org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52)
>  {noformat}





[jira] [Commented] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-03-25 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512627#comment-17512627
 ] 

Mark Hanson commented on GEODE-9704:


PR 7442 is available. I have made changes to fix the behavior that was causing 
the problems. The core of the problem was that registerInterest should be 
called before readyForEvents. The order was effectively reversed, and that has 
been corrected. 

LocalRegionUpdateTest.java was created to house two unit tests for the new code.

A test by Jinmei in AuthExpirationDUnitTest has been uncommented. It would 
typically be flaky, but with this fix it no longer fails.

I believe this bug is done with the exception of the review phase of the PR and 
associated changes.
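
To illustrate why the order matters, here is a small self-contained toy model. It is not Geode code and does not reflect the contents of PR 7442; the class RecoveryOrderSketch, its methods, and the key "k1" are made up for this sketch. It assumes the server delivers one queued event for a key of interest as soon as readyForEvents is sent, and that interest is re-registered with InterestResultPolicy.NONE, so no values are fetched on registration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of a durable client's local region during primary recovery (hypothetical names). */
public class RecoveryOrderSketch {

  /**
   * Replays one recovery.
   *
   * @param readyForEventsFirst true models the old order, false the corrected order
   * @return the contents of the client's local region after recovery
   */
  public static Map<String, String> recover(boolean readyForEventsFirst) {
    Map<String, String> localRegion = new HashMap<>();
    localRegion.put("k1", "stale");
    Set<String> keysOfInterest = Collections.singleton("k1");

    if (readyForEventsFirst) {
      // Old order: the queued event arrives first, then the clear wipes it out.
      deliverQueuedEvent(localRegion);
      recoverInterest(localRegion, keysOfInterest);
    } else {
      // Corrected order: clear and re-register first, then accept queued events.
      recoverInterest(localRegion, keysOfInterest);
      deliverQueuedEvent(localRegion);
    }
    return localRegion;
  }

  /** Stands in for the server draining its queue after readyForEvents. */
  static void deliverQueuedEvent(Map<String, String> region) {
    region.put("k1", "fresh");
  }

  /** Clears the keys of interest; with policy NONE nothing is fetched on re-registration. */
  static void recoverInterest(Map<String, String> region, Set<String> keysOfInterest) {
    region.keySet().removeAll(keysOfInterest);
  }

  public static void main(String[] args) {
    System.out.println("readyForEvents first:   " + recover(true));
    System.out.println("registerInterest first: " + recover(false));
  }
}
```

In this model the old order leaves the local region without "k1", which matches the symptom in the description; the corrected order keeps the value delivered from the queue.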

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When durable clients recovers, there is a queueTimer thread that runs 
> `QueueManagerImp.recoverPrimary` method,  it 
>  * makes new connection to server
>  - sends readyForEvents (which will cause the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> It sends readyForEvents before it clears region of keys of interest, if 
> server sends some events of those keys in between, it will clear them, thus 
> it seems to the user that the client region doesn't have those keys. 
>  
> Run geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
>  change the InterestResultPolicy to NONE, you would see the test would fail 
> occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` would make the test fail more 
> consistently.





[jira] [Updated] (GEODE-5564) Flaky test ConcurrentIndexInitOnOverflowRegionDUnitTest > testIndexUpdateWithRegionClear

2022-03-23 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-5564:
---
Affects Version/s: 1.14.0

> Flaky test ConcurrentIndexInitOnOverflowRegionDUnitTest > 
> testIndexUpdateWithRegionClear
> 
>
> Key: GEODE-5564
> URL: https://issues.apache.org/jira/browse/GEODE-5564
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.8.0, 1.14.0
>Reporter: Jacob Barrett
>Priority: Major
>  Labels: flaky
>
> {noformat}
> org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest > testIndexUpdateWithRegionClear FAILED
>     org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest$12.run in VM 0 running on Host 92f89c21d1b0 with 4 VMs
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:443)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:412)
>         at org.apache.geode.test.dunit.VM.invoke(VM.java:355)
>         at org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest.testIndexUpdateWithRegionClear(ConcurrentIndexInitOnOverflowRegionDUnitTest.java:411)
>         Caused by:
>         java.lang.AssertionError: After clear region size is supposed to be zero as all index updates are blocked. Current region size is: 13
>             at org.junit.Assert.fail(Assert.java:88)
>             at org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest$12.run2(ConcurrentIndexInitOnOverflowRegionDUnitTest.java:430)
> {noformat}
> Failing: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/pr-develop/jobs/DistributedTest/builds/556
> Passing: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/pr-develop/jobs/DistributedTest/builds/547





[jira] [Created] (GEODE-10154) Benchmarks: PartitionedIndexedQueryBenchmark: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange

2022-03-23 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10154:
---

 Summary: Benchmarks: PartitionedIndexedQueryBenchmark: 
net.schmizz.sshj.transport.TransportException: Server closed connection during 
identification exchange
 Key: GEODE-10154
 URL: https://issues.apache.org/jira/browse/GEODE-10154
 Project: Geode
  Issue Type: Bug
  Components: benchmarks
Affects Versions: 1.15.0
Reporter: Mark Hanson


[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-with-security-manager/builds/224]

This same framework is reporting an error in GEODE-10153 and GEODE-10147
{noformat}
02:36:47 PartitionedIndexedQueryBenchmark > run() FAILED
02:36:47 java.util.concurrent.CompletionException: java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
02:36:47    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
02:36:47    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643)
02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1632)
02:36:47    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
02:36:47    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
02:36:47    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
02:36:47    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
02:36:47
02:36:47 Caused by:
02:36:47 java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:176)
02:36:47    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
02:36:47    ... 5 more
02:36:47
02:36:47 Caused by:
02:36:47 net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
02:36:47    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
02:36:47    at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
02:36:47    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
02:36:47    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:158)
02:36:47    ... 6 more
02:36:47
02:36:47 Caused by:
02:36:47 net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
02:36:47    at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:214)
02:36:47    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
02:36:47    ... 10 more {noformat}





[jira] [Created] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset

2022-03-23 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10153:
---

 Summary: Benchmarks: PartitionedPutAllBenchmark:  
net.schmizz.sshj.transport.TransportException: Connection reset
 Key: GEODE-10153
 URL: https://issues.apache.org/jira/browse/GEODE-10153
 Project: Geode
  Issue Type: Bug
  Components: benchmarks
Affects Versions: 1.15.0
Reporter: Mark Hanson


This looks like GEODE-10147

[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223]

 
{noformat}
2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching 
/usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath 
.geode-performance/lib/SERVER-2/* 
-Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 
-DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 -server 
-Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 
-Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true 
-Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g 
-Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly 
-XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark 
-XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC 
-XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 
-Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= 
org.apache.geode.perftest.jvms.rmi.ChildJVM on 
org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed.
21:26:18 net.schmizz.sshj.transport.TransportException: Connection reset
21:26:18    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194)
21:26:18    at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793)
21:26:18    at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178)
21:26:18    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74)
21:26:18    at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86)
21:26:18    at org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68)
21:26:18 Caused by: java.net.SocketException: Connection reset
21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:210)
21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:141)
21:26:18    at java.net.SocketInputStream.read(SocketInputStream.java:224)
21:26:18    at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211)
21:26:18    at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187)
21:26:18    ... 5 more
21:31:18
21:31:18 PartitionedPutAllBenchmark > run() FAILED
21:31:18 java.lang.IllegalStateException: Workers failed to start in 5 minute
21:31:18    at org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133)
21:31:18    at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97)
21:31:18    at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65)
21:31:18    at org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52)
 {noformat}





[jira] [Reopened] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-03-22 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reopened GEODE-9704:


> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When durable clients recovers, there is a queueTimer thread that runs 
> `QueueManagerImp.recoverPrimary` method,  it 
>  * makes new connection to server
>  - sends readyForEvents (which will cause the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> It sends readyForEvents before it clears region of keys of interest, if 
> server sends some events of those keys in between, it will clear them, thus 
> it seems to the user that the client region doesn't have those keys. 
>  
> Run geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
>  change the InterestResultPolicy to NONE, you would see the test would fail 
> occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` would make the test fail more 
> consistently.





[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-03-22 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9704:
---
Labels: GeodeOperationAPI blocks-1.15.1 pull-request-available  (was: 
GeodeOperationAPI blocks-1.15.0 pull-request-available)

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When durable clients recovers, there is a queueTimer thread that runs 
> `QueueManagerImp.recoverPrimary` method,  it 
>  * makes new connection to server
>  - sends readyForEvents (which will cause the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> It sends readyForEvents before it clears region of keys of interest, if 
> server sends some events of those keys in between, it will clear them, thus 
> it seems to the user that the client region doesn't have those keys. 
>  
> Run geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
>  change the InterestResultPolicy to NONE, you would see the test would fail 
> occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` would make the test fail more 
> consistently.





[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-03-22 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9704:
---
Labels: GeodeOperationAPI blocks-1.15.0 pull-request-available  (was: 
GeodeOperationAPI blocks-1.15.1 pull-request-available)

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.0, pull-request-available
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When durable clients recovers, there is a queueTimer thread that runs 
> `QueueManagerImp.recoverPrimary` method,  it 
>  * makes new connection to server
>  - sends readyForEvents (which will cause the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> It sends readyForEvents before it clears region of keys of interest, if 
> server sends some events of those keys in between, it will clear them, thus 
> it seems to the user that the client region doesn't have those keys. 
>  
> Run geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
>  change the InterestResultPolicy to NONE, you would see the test would fail 
> occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` would make the test fail more 
> consistently.





[jira] [Created] (GEODE-10039) BucketProfiles can be stale in rare cases.

2022-02-10 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-10039:
---

 Summary: BucketProfiles can be stale in rare cases.
 Key: GEODE-10039
 URL: https://issues.apache.org/jira/browse/GEODE-10039
 Project: Geode
  Issue Type: Bug
  Components: core
Affects Versions: 1.15.0
Reporter: Mark Hanson


When a server starts as a member of a partitioned region during a rebalance, 
the starting server may not receive the profile removal for a bucket that has 
been relocated.





[jira] [Resolved] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2022-01-12 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9815.

Fix Version/s: 1.15.0
   Resolution: Fixed

The solution to this was to change the logic to deal with the situation where 
there were multiple copies in the same redundancy zone by deleting the extra 
copy in the zone, but also ensure a new copy gets made in another zone.

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> --
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.0​, needsTriage, 
> pull-request-available
> Fix For: 1.15.0
>
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each.  Bucket 0 is on 
> member A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones
> Steps
> 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
> and B1
> 2. Shutdown A1
> 3. Rebalance -  this will create Bucket 0 on C1
> 4. Startup A1 - this will recreate bucket 0
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different than prevent 
> removing the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
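
The removal rule proposed in steps 1 and 2 above can be sketched as follows. This is only an illustration under stated assumptions, not the actual PartitionRegionLoadModel code: the class ZoneAwareRedundancySketch and its method signatures are hypothetical, a bucket is modeled as a map from member name to redundancy zone, and redundantCopies means the number of copies in addition to the first one.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of zone-aware removal for a single bucket (hypothetical names). */
public class ZoneAwareRedundancySketch {

  /** Counts how many copies of the bucket each redundancy zone holds. */
  static Map<String, Integer> copiesPerZone(Map<String, String> hosts) {
    Map<String, Integer> counts = new HashMap<>();
    for (String zone : hosts.values()) {
      counts.merge(zone, 1, Integer::sum);
    }
    return counts;
  }

  /** True if the bucket is over redundancy or has two copies in one zone. */
  public static boolean needsRemoval(Map<String, String> hosts, int redundantCopies) {
    boolean overRedundancy = hosts.size() > redundantCopies + 1;
    boolean duplicateInZone = copiesPerZone(hosts).values().stream().anyMatch(c -> c > 1);
    return overRedundancy || duplicateInZone;
  }

  /** Picks the member to remove the bucket from, preferring a zone that holds a duplicate. */
  public static Optional<String> findBestRemove(Map<String, String> hosts, int redundantCopies) {
    if (!needsRemoval(hosts, redundantCopies)) {
      return Optional.empty();
    }
    Map<String, Integer> counts = copiesPerZone(hosts);
    for (Map.Entry<String, String> host : hosts.entrySet()) {
      if (counts.get(host.getValue()) > 1) {
        return Optional.of(host.getKey());
      }
    }
    // Over redundancy with one copy per zone: the last copy in a zone may be removed.
    return hosts.keySet().stream().findFirst();
  }

  public static void main(String[] args) {
    Map<String, String> problem1 = new LinkedHashMap<>();
    problem1.put("A1", "A");
    problem1.put("A2", "A");
    System.out.println("Problem 1 removes from: " + findBestRemove(problem1, 1));

    Map<String, String> problem2 = new LinkedHashMap<>();
    problem2.put("A1", "A");
    problem2.put("B1", "B");
    problem2.put("C1", "C");
    System.out.println("Problem 2 removes from: " + findBestRemove(problem2, 1));
  }
}
```

The sketch covers only the choice of which copy to remove. In the Problem 1 layout it removes one of the two copies in zone A, which leaves a single copy, so a real fix also has to make sure a new copy is created in another zone.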





[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-01-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9704:
---
Description: 
This is the old Geode behavior, but may or may not be the correct behavior.

When durable clients recovers, there is a queueTimer thread that runs 
`QueueManagerImp.recoverPrimary` method,  it 
 * makes new connection to server

 - sends readyForEvents (which will cause the server to start sending the 
queued events)
 - recovers interest
  - clears the region of keys of interest
  - re-registers interest

It sends readyForEvents before it clears region of keys of interest, if server 
sends some events of those keys in between, it will clear them, thus it seems 
to the user that the client region doesn't have those keys. 

 

Run geode-core distributedTest 
AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
 change the InterestResultPolicy to NONE, you would see the test would fail 
occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
`createNewPrimary` and `recoverInterest` would make the test fail more 
consistently.

  was:
This is the old Geode behavior, but may or may not be the correct behavior.

When durable clients recovers, there is a queueTimer thread that runs 
`QueueManagerImp.recoverPrimary` method,  it 
 * makes new connection to server

 - sends readyForEvents (which will cause the server to start sending the 
queued events)
 - recovers interest
   - clears the region of keys of interest
   - re-registers interest

It sends readyForEvents before it clears region of keys of interest, if server 
sends some events of those keys in between, it will clear them, thus it seems 
to the user that the client region doesn't have those keys. 

 

Run geode-core distributedTest 
AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient(),
 change the InterestResultPolicy to NONE, you would see the test would fail 
occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
`createNewPrimary` and `recoverInterest` would make the test fail more 
consistently.


> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When durable clients recovers, there is a queueTimer thread that runs 
> `QueueManagerImp.recoverPrimary` method,  it 
>  * makes new connection to server
>  - sends readyForEvents (which will cause the server to start sending the 
> queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> It sends readyForEvents before it clears region of keys of interest, if 
> server sends some events of those keys in between, it will clear them, thus 
> it seems to the user that the client region doesn't have those keys. 
>  
> Run geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(),
>  change the InterestResultPolicy to NONE, you would see the test would fail 
> occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between 
> `createNewPrimary` and `recoverInterest` would make the test fail more 
> consistently.





[jira] [Assigned] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

2022-01-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9704:
--

Assignee: Mark Hanson  (was: Kirk Lund)

> When durable clients recovers, it sends "ready for event" signal before 
> register for interest, this might cause problem for caching_proxy regions
> -
>
> Key: GEODE-9704
> URL: https://issues.apache.org/jira/browse/GEODE-9704
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Jinmei Liao
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, blocks-1.15.1
>
> This is the old Geode behavior, but it may or may not be the correct behavior.
> When a durable client recovers, a queueTimer thread runs the 
> `QueueManagerImpl.recoverPrimary` method, which:
>  - makes a new connection to the server
>  - sends readyForEvents (which causes the server to start sending the 
> queued events)
>  - recovers interest
>    - clears the region of keys of interest
>    - re-registers interest
> Because it sends readyForEvents before it clears the region of keys of 
> interest, any events the server sends for those keys in between are cleared, 
> so it appears to the user that the client region doesn't have those keys. 
>  
> To reproduce, run the geode-core distributedTest 
> AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient()
>  with the InterestResultPolicy changed to NONE; the test then fails 
> occasionally. Adding sleep code in `QueueManagerImpl.recoverPrimary` between 
> `createNewPrimary` and `recoverInterest` makes the test fail more 
> consistently.





[jira] [Resolved] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-06 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9920.

Resolution: Won't Fix

This is a resource issue: we saw a 10-second delay on wakeup for stats, which 
indicates that we ran out of CPU. There really isn't anything to fix. We 
could reduce the number of concurrent tests, but for 1.12 there is no point.
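
For readers unfamiliar with the symptom, a wakeup delay is simply the gap
between when a sleeping thread was supposed to resume and when it actually ran
again. The sketch below shows one way to measure that gap with plain JDK calls.
It is not Geode's statistics sampler, and the names WakeupDelaySketch,
wakeupDelayMillis and measureOnce are hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only; this is not how Geode's stat sampler is implemented. */
final class WakeupDelaySketch {

  /** Milliseconds by which the actual wakeup lagged the scheduled one (never negative). */
  static long wakeupDelayMillis(long scheduledNanos, long actualNanos) {
    return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(actualNanos - scheduledNanos));
  }

  /** Sleeps once and reports how late the thread resumed. */
  static long measureOnce(long sleepMillis) {
    long start = System.nanoTime();
    try {
      Thread.sleep(sleepMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      throw new IllegalStateException("interrupted while measuring", e);
    }
    long scheduled = start + TimeUnit.MILLISECONDS.toNanos(sleepMillis);
    return wakeupDelayMillis(scheduled, System.nanoTime());
  }

  public static void main(String[] args) {
    // On an idle machine this is typically a few milliseconds at most; a value
    // in the seconds range would point at CPU starvation or a long pause.
    System.out.println("wakeup delay ms: " + measureOnce(100));
  }
}
```

A delay measured this way cannot distinguish CPU starvation from other causes
of a stalled thread, such as a long GC pause, so it is a hint rather than a
diagnosis.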

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
>  
> Updated: This is part of the long-standing problem with port binding and the 
> imperfect handling of default ports in tests, in this case port 41000.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> 

[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-06 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470138#comment-17470138
 ] 

Mark Hanson commented on GEODE-9920:


Our initial diagnosis was that this was a port issue. That was incorrect. 
Further investigation pointed to a system load issue. As an aside, 1.12 still 
uses Docker, which means port collisions are very unlikely.  

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
>  
> Updated: This is part of the long-standing problem with port binding and the 
> imperfect handling of default ports in tests, in this case port 41000.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>  

[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-05 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469595#comment-17469595
 ] 

Mark Hanson commented on GEODE-9920:


I closed it, but after some discussion we started looking at the problem again.

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
>  
> Updated: This is part of the long-standing problem with port binding and the 
> imperfect handling of default ports in tests, in this case port 41000.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
>   at 

[jira] [Closed] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-05 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson closed GEODE-9920.
--

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
>  
> Updated: This is part of the long-standing problem with port binding and the 
> imperfect handling of default ports in tests, in this case port 41000.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
>   at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
>   at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
>   

[jira] [Resolved] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-05 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9920.

Resolution: Won't Fix

This port issue has been fixed by Dale's changes on develop, which are now in 
1.14 and 1.15. It was decided that we would not backport that change 
to 1.12.

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
>  
> Updated: This is part of the long-standing problem with port binding and the 
> imperfect handling of default ports in tests, in this case port 41000.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>   at 
> 

[jira] [Updated] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-05 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9920:
---
Description: 
StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
failed with a suspicious string reporting a failure to respond to heartbeats. 
Both are in the same CI run, so this looks like a port conflict in which the 
two tests overlap as one is shutting down and the other is starting up.

Updated: This is part of the long-standing problem with port binding and the 
imperfect handling of default ports in tests, in this case port 41000.
{code:java}
org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest > 
testWithInvalidMemberID FAILED
java.lang.AssertionError: 
Expecting:
 <"Member Count : 1
  Name| Id
- | --
locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
">
to contain:
 <"locatorToStop"> 
at 
org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
at 
org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
{code}
{code:java}
org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
testLimitedAccess FAILED
org.apache.geode.test.dunit.RMIException: While invoking 
org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
Host 07d663f91562 with 4 VMs

Caused by:
org.apache.geode.distributed.DistributedSystemDisconnectedException: 
This connection to a distributed system has been disconnected., caused by 
org.apache.geode.ForcedDisconnectException: Member isn't responding to 
heartbeat requests

Caused by:
org.apache.geode.ForcedDisconnectException: Member isn't responding 
to heartbeat requests

java.lang.AssertionError: Suspicious strings were written to the log during 
this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in log4j at line 1125

[fatal 2022/01/04 01:04:33.305 GMT  
tid=100] Membership service failure: Member isn't responding to heartbeat 
requests

org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
 Member isn't responding to heartbeat requests
  at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
  at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
  at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
  at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
  at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
  at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
  at org.jgroups.JChannel.up(JChannel.java:741)
  at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
  at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
  at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
  at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
  at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
  at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
  at 
org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
  at 
org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
  at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
  at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
  at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
  at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
  at org.jgroups.protocols.TP.receive(TP.java:1714)
  at 
org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:160)
  at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
  at java.lang.Thread.run(Thread.java:748)

---
Found suspect string in log4j at line 1191

[error 2022/01/04 01:04:34.715 GMT  
tid=33] Cache initialization for GemFireCache[id = 1852143676; isClosing = 
false; isShutDownAll = false; created = Tue Jan 04 01:04:20 GMT 2022; server = 
false; copyOnRead = false; lockLease = 120; lockTimeout = 

[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-05 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469554#comment-17469554
 ] 

Mark Hanson commented on GEODE-9920:


1.12 does not have [~demery]'s changes to make our port reservation system less 
likely to hit bind issues. This looks like an interaction between two tests 
using the same port.

I tend to think we should just close this as Won't Fix. Thoughts?
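
As background on the kind of collision suspected here, the sketch below uses
only standard JDK socket calls to show that a second listener on a port that is
already bound is rejected, and that asking the OS for port 0 avoids picking a
fixed number such as 41000. It is not the port reservation system mentioned
above; PortBindSketch and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Illustrative only; unrelated to Geode's own port reservation code. */
final class PortBindSketch {

  /** Binds to port 0 on loopback so the OS picks any currently free port. */
  static ServerSocket bindEphemeral() throws IOException {
    return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
  }

  /** Returns true if a second listener on the given loopback port is rejected. */
  static boolean secondBindFails(int port) throws IOException {
    try (ServerSocket second = new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
      return false; // the port was free, so no collision occurred
    } catch (BindException e) {
      return true; // address already in use
    }
  }

  /** Holds one listener open and reports whether a second bind collides with it. */
  static boolean collisionDetected() {
    try (ServerSocket first = bindEphemeral()) {
      return secondBindFails(first.getLocalPort());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("second bind rejected: " + collisionDetected());
  }
}
```

Binding to port 0 only helps when the chosen port is held open or handed
directly to the component that needs it; releasing it and binding again later
reopens a window in which another process can take the same port.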

 

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
> Key: GEODE-9920
> URL: https://issues.apache.org/jira/browse/GEODE-9920
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.12.8
>Reporter: Hale Bales
>Assignee: Mark Hanson
>Priority: Major
>  Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both are in the same CI run, so this looks like a port conflict in which the 
> two tests overlap as one is shutting down and the other is 
> starting up.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)

[jira] [Assigned] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict

2022-01-04 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9920:
--

Assignee: Mark Hanson

> CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and 
> RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port 
> conflict
> ---
>
>                 Key: GEODE-9920
>                 URL: https://issues.apache.org/jira/browse/GEODE-9920
>             Project: Geode
>          Issue Type: Bug
>          Components: tests
>    Affects Versions: 1.12.8
>            Reporter: Hale Bales
>            Assignee: Mark Hanson
>            Priority: Major
>              Labels: CI, needsTriage
>
> StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with an 
> AssertionError, and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess 
> failed with a suspicious string reporting a failure to respond to heartbeats. 
> Both failures occurred in the same CI run, so this looks like a port conflict: 
> the two tests overlapped, with one shutting down while the other was 
> starting up.
> {code:java}
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest 
> > testWithInvalidMemberID FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"Member Count : 1
>   Name| Id
> - | --
> locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator]
> ">
> to contain:
>  <"locatorToStop"> 
> at 
> org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87)
> at 
> org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240)
> {code}
> {code:java}
> org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > 
> testLimitedAccess FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on 
> Host 07d663f91562 with 4 VMs
> Caused by:
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> This connection to a distributed system has been disconnected., caused by 
> org.apache.geode.ForcedDisconnectException: Member isn't responding to 
> heartbeat requests
> Caused by:
> org.apache.geode.ForcedDisconnectException: Member isn't 
> responding to heartbeat requests
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 1125
> [fatal 2022/01/04 01:04:33.305 GMT  
> tid=100] Membership service failure: Member isn't responding to heartbeat 
> requests
> 
> org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException:
>  Member isn't responding to heartbeat requests
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083)
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264)
>   at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
>   at org.jgroups.JChannel.up(JChannel.java:741)
>   at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
>   at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
>   at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
>   at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
>   at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
>   at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70)
>   at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
>   at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
>   at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
>   at 

[jira] [Commented] (GEODE-9885) StringsDUnitTest.givenBucketsMoveDuringAppend_thenDataIsNotLost fails with duplicated append

2022-01-04 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468855#comment-17468855
 ] 

Mark Hanson commented on GEODE-9885:


I just took a look at the test and this doesn't look like a test problem. We 
should probably take a deeper look at this.

> StringsDUnitTest.givenBucketsMoveDuringAppend_thenDataIsNotLost fails with 
> duplicated append
> 
>
>                 Key: GEODE-9885
>                 URL: https://issues.apache.org/jira/browse/GEODE-9885
>             Project: Geode
>          Issue Type: Bug
>          Components: redis
>    Affects Versions: 1.15.0
>            Reporter: Ray Ingles
>            Priority: Major
>              Labels: needsTriage
>
> The test appends a lot of strings to a key. It wound up adding (at least one) 
> extra string to the stored string:
>  
> {{java.util.concurrent.ExecutionException: java.lang.AssertionError: 
> unexpected -\{append0}-key-3-27680- at index 27681 iterationCount=61995 in 
> string}}
>  
> The string "\{append0}-key-3-27680-" appeared twice in sequence.
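
A check of the kind that would surface this failure can be sketched as follows. This is a minimal sketch, not the code in StringsDUnitTest: the class AppendSequenceCheck, the method firstMismatch, and the token layout (a prefix, a counter, and a trailing dash) are assumptions inferred from the error message above. It walks the stored string token by token and reports the first iteration at which the expected token is missing, which is where a duplicated append becomes visible.

```java
/** Hypothetical helper; the names and the token layout are assumptions. */
final class AppendSequenceCheck {

  /**
   * Checks that stored is exactly token(0) + token(1) + ... + token(count - 1),
   * where token(i) is prefix + i + "-".
   *
   * @return -1 if the stored string matches; otherwise the iteration index at
   *         which the expected token was not found, or count if extra
   *         characters follow the last expected token
   */
  static int firstMismatch(String stored, String prefix, int count) {
    int offset = 0;
    for (int i = 0; i < count; i++) {
      String expected = prefix + i + "-";
      // A duplicated or missing append means the expected token is not here.
      if (!stored.startsWith(expected, offset)) {
        return i;
      }
      offset += expected.length();
    }
    return offset == stored.length() ? -1 : count;
  }
}
```

With the tokens k-0-, k-1-, and k-2-, a stored value in which k-1- was appended twice is reported at iteration 2, one past the duplicated token. That is the same shape as the report above, where the token ending in 27680 was found at index 27681.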



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2022-01-04 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468755#comment-17468755
 ] 

Mark Hanson commented on GEODE-9815:


I have addressed all of the concerns with a solution that I am happy with. 
There is a PR out for review.

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> --
>
>                 Key: GEODE-9815
>                 URL: https://issues.apache.org/jira/browse/GEODE-9815
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>    Affects Versions: 1.15.0
>            Reporter: Dan Smith
>            Assignee: Mark Hanson
>            Priority: Major
>              Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases.
> With a slight tweak to the scenario fixed by GEODE-9554, we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on 
> members A1 and B1.
> 2. Shut down member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shut down B1. Revoke its disk store and delete the data.
> 5. Start up A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket.
> The fix for GEODE-9554 introduces a new problem if we have more than two 
> redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, and C with one member each. Bucket 0 is 
> on A1 and B1.
> 2. Shut down A1.
> 3. Rebalance - this will create bucket 0 on C1.
> 4. Start up A1 - this will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something different than prevent 
> removing the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
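
The three proposed changes can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Geode's PartitionRegionLoadModel: the class ZoneRedundancySketch, the Copy type, and the method signatures are hypothetical, and the choice of which duplicate to drop is arbitrary here. It treats a bucket as needing a removal when it is over redundancy or has two copies in one zone, prefers removing a same-zone duplicate, and has no guard protecting the last copy in a zone.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical, simplified model; not Geode's PartitionRegionLoadModel. */
final class ZoneRedundancySketch {

  /** One copy of a bucket: the hosting member and that member's redundancy zone. */
  static final class Copy {
    final String member;
    final String zone;

    Copy(String member, String zone) {
      this.member = member;
      this.zone = zone;
    }
  }

  /** Returns the first copy whose zone already holds an earlier copy, if any. */
  static Optional<Copy> sameZoneDuplicate(List<Copy> copies) {
    Set<String> zonesSeen = new HashSet<>();
    for (Copy copy : copies) {
      if (!zonesSeen.add(copy.zone)) {
        return Optional.of(copy);
      }
    }
    return Optional.empty();
  }

  /** Step 1: over redundancy OR two copies in the same zone. */
  static boolean needsRemoval(List<Copy> copies, int redundantCopies) {
    return copies.size() > redundantCopies + 1 || sameZoneDuplicate(copies).isPresent();
  }

  /** Step 2: remove a same-zone duplicate first, then any surplus copy. */
  static Optional<Copy> findBestRemove(List<Copy> copies, int redundantCopies) {
    Optional<Copy> duplicate = sameZoneDuplicate(copies);
    if (duplicate.isPresent()) {
      return duplicate;
    }
    if (copies.size() > redundantCopies + 1) {
      // Step 3: no "last copy in a zone" guard, so any surplus copy may go.
      return Optional.of(copies.get(copies.size() - 1));
    }
    return Optional.empty();
  }
}
```

With one redundant copy configured, the Problem 1 layout (A1 and A2, both in zone A) yields A2 as the removal candidate, and the Problem 2 layout (A1, B1, and C1) yields C1. In this model, removing a same-zone duplicate can leave the bucket under redundancy, so a later redundancy-restoring pass is assumed to create the replacement copy in another zone.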





[jira] [Created] (GEODE-9878) PostgresJdbcLoaderIntegrationTest. initializationError failed

2021-12-07 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9878:
--

             Summary: PostgresJdbcLoaderIntegrationTest. initializationError failed
                 Key: GEODE-9878
                 URL: https://issues.apache.org/jira/browse/GEODE-9878
             Project: Geode
          Issue Type: Bug
          Components: jdbc, tests
    Affects Versions: 1.15.0
            Reporter: Mark Hanson


[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk11/builds/42.1]
 failed with the stack trace shown below for PostgresJdbcLoaderIntegrationTest. 
initializationError
{noformat}
org.testcontainers.containers.ContainerLaunchException: Container startup failed
at 
org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:330)
at 
org.testcontainers.containers.GenericContainer.start(GenericContainer.java:311)
at 
org.testcontainers.containers.ContainerisedDockerCompose.invoke(DockerComposeContainer.java:646)
at 
org.testcontainers.containers.DockerComposeContainer.runWithCompose(DockerComposeContainer.java:309)
at 
org.testcontainers.containers.DockerComposeContainer.createServices(DockerComposeContainer.java:233)
at 
org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:177)
at 
org.apache.geode.connectors.jdbc.test.junit.rules.SqlDatabaseConnectionRule$1.evaluate(SqlDatabaseConnectionRule.java:57)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.util.Iterator.forEachRemaining(Iterator.java:133)
at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:566)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 

[jira] [Created] (GEODE-9877) GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse failed

2021-12-07 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9877:
--

             Summary: GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse failed
                 Key: GEODE-9877
                 URL: https://issues.apache.org/jira/browse/GEODE-9877
             Project: Geode
          Issue Type: Bug
          Components: redis
    Affects Versions: 1.15.0
            Reporter: Mark Hanson


[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/43]
 failed with 

GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse
{noformat}
java.net.BindException: Address already in use (Bind failed)
at java.net.PlainSocketImpl.socketBind(Native Method)
at 
java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
at java.net.Socket.bind(Socket.java:662)
at 
org.apache.geode.redis.GeodeRedisServerStartupDUnitTest.startupFailsGivenPortAlreadyInUse(GeodeRedisServerStartupDUnitTest.java:115)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:138)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at 
org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
at 

[jira] [Commented] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED

2021-12-07 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454911#comment-17454911
 ] 

Mark Hanson commented on GEODE-9876:


Hi Mario,

This looks like an error related to GEODE-8202. Can you take a look?

Thanks,

Mark

> SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > 
> testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
> -
>
>                 Key: GEODE-9876
>                 URL: https://issues.apache.org/jira/browse/GEODE-9876
>             Project: Geode
>          Issue Type: Bug
>          Components: wan
>    Affects Versions: 1.15.0
>            Reporter: Mark Hanson
>            Assignee: Mario Kevo
>            Priority: Major
>
>  
> {noformat}
> java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' 
> column header not found
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertNotNull(Assert.java:713)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Iterator.forEachRemaining(Iterator.java:116)
>   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at 

[jira] [Assigned] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED

2021-12-07 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9876:
--

Assignee: Mario Kevo

> SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > 
> testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
> -
>
>                 Key: GEODE-9876
>                 URL: https://issues.apache.org/jira/browse/GEODE-9876
>             Project: Geode
>          Issue Type: Bug
>          Components: wan
>    Affects Versions: 1.15.0
>            Reporter: Mark Hanson
>            Assignee: Mario Kevo
>            Priority: Major
>
>  
> {noformat}
> java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' 
> column header not found
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertNotNull(Assert.java:713)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208)
>   at 
> org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Iterator.forEachRemaining(Iterator.java:116)
>   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>   at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
> 

[jira] [Created] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED

2021-12-07 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9876:
--

             Summary: SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
                 Key: GEODE-9876
                 URL: https://issues.apache.org/jira/browse/GEODE-9876
             Project: Geode
          Issue Type: Bug
          Components: wan
    Affects Versions: 1.15.0
            Reporter: Mark Hanson


 
{noformat}
java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' column 
header not found
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.assertTrue(Assert.java:42)
at org.junit.Assert.assertNotNull(Assert.java:713)
at 
org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236)
at 
org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208)
at 
org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
at 

[jira] [Commented] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2021-12-01 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452113#comment-17452113
 ] 

Mark Hanson commented on GEODE-9815:


I have addressed the two cases above. I am not particularly satisfied with my 
implementation of change 1. I need to get a code review from [~upthewaterspout].



A draft PR is in place.

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> --
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* In some cases we may end up with two copies of the bucket in 
> one zone.
> With a slight tweak to the scenario fixed by GEODE-9554, we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on 
> members A1 and B1.
> 2. Shut down member A1.
> 3. Rebalance. This will create bucket 0 on A2.
> 4. Shut down B1. Revoke its disk store and delete the data.
> 5. Start up A1. It will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket.
> The fix for GEODE-9554 introduces a new problem if we have more than two 
> redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, and C with one member each. Bucket 0 is 
> on A1 and B1.
> 2. Shut down A1.
> 3. Rebalance. This will create bucket 0 on C1.
> 4. Start up A1. This will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something other than preventing 
> the removal of the last copy of a bucket from a redundancy zone. Instead, I 
> think we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first.
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
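
The selection rule proposed in steps 1 and 2 above can be sketched roughly as 
follows. This is only an illustration, not the Geode implementation: the class 
ZoneRedundancySketch, its two methods, and the map of bucket id to zone names 
are hypothetical stand-ins, and the real PartitionRegionLoadModel works on 
members and load, not on plain zone strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names exist in Geode.
public class ZoneRedundancySketch {

  /**
   * Returns the ids of buckets that either have more copies than allowed or
   * have two copies hosted in the same redundancy zone.
   * Each list holds one zone name per hosted copy of the bucket.
   */
  static Set<Integer> bucketsWithExtraCopies(Map<Integer, List<String>> zonesByBucket,
      int allowedCopies) {
    Set<Integer> result = new TreeSet<>();
    for (Map.Entry<Integer, List<String>> entry : zonesByBucket.entrySet()) {
      List<String> zones = entry.getValue();
      boolean overRedundancy = zones.size() > allowedCopies;
      // Fewer distinct zones than copies means some zone hosts two copies.
      boolean sameZoneDuplicate = new HashSet<>(zones).size() < zones.size();
      if (overRedundancy || sameZoneDuplicate) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  /**
   * Picks the zone to remove a copy from, preferring a zone that already
   * hosts another copy. The list must not be empty. The fallback (last zone
   * in the list) is arbitrary and only keeps the sketch small.
   */
  static String zoneToRemoveFrom(List<String> zones) {
    Set<String> seen = new HashSet<>();
    for (String zone : zones) {
      if (!seen.add(zone)) {
        return zone; // second copy found in a zone we have already seen
      }
    }
    return zones.get(zones.size() - 1);
  }

  public static void main(String[] args) {
    // Problem 1: bucket 0 ends up on A1 and A2, both in zone A.
    List<String> problem1 = Arrays.asList("A", "A");
    System.out.println(bucketsWithExtraCopies(Collections.singletonMap(0, problem1), 2));
    System.out.println(zoneToRemoveFrom(problem1));

    // Problem 2: bucket 0 ends up on A1, B1 and C1 with only two copies allowed.
    List<String> problem2 = Arrays.asList("A", "B", "C");
    System.out.println(bucketsWithExtraCopies(Collections.singletonMap(0, problem2), 2));
  }
}
```

With two allowed copies, both scenarios from the description flag bucket 0, 
while a bucket with one copy in zone A and one in zone B would not be flagged.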



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.

2021-11-29 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450699#comment-17450699
 ] 

Mark Hanson commented on GEODE-9856:


Seems related. Should be confirmed.

> SMoveNativeRedisAcceptanceTest is failing with cluster is down.
> ---
>
> Key: GEODE-9856
> URL: https://issues.apache.org/jira/browse/GEODE-9856
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED
> 12:05:00     redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
> 12:05:00         at redis.clients.jedis.Protocol.processError(Protocol.java:125)
> 12:05:00         at redis.clients.jedis.Protocol.process(Protocol.java:169)
> 12:05:00         at redis.clients.jedis.Protocol.read(Protocol.java:223)
> 12:05:00         at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
> 12:05:00         at redis.clients.jedis.Connection.getIntegerReply(Connection.java:294)
> 12:05:00         at redis.clients.jedis.Jedis.sadd(Jedis.java:1391)
> 12:05:00         at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973)
> 12:05:00         at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970)
> 12:05:00         at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
> 12:05:00         at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
> 12:05:00         at redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975)
> 12:05:00         at org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112)
> 12:05:00         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 12:05:00         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 12:05:00         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 12:05:00         at java.lang.reflect.Method.invoke(Method.java:498)
> 12:05:00         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 12:05:00         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 12:05:00         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 12:05:00         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 12:05:00         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 12:05:00         at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 12:05:00         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 12:05:00         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 12:05:00         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 12:05:00         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 12:05:00         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 12:05:00         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> 12:05:00         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> 12:05:00         at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118)
> 12:05:00         at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 12:05:00         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> 12:05:00         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> 12:05:00         at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> 12:05:00         at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> 12:05:00         at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> 12:05:00         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> 12:05:00         at 

[jira] [Comment Edited] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.

2021-11-29 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450699#comment-17450699
 ] 

Mark Hanson edited comment on GEODE-9856 at 11/29/21, 7:45 PM:
---

GEODE-9428 seems related. Should be confirmed.


was (Author: mhansonp):
Seems related. Should be confirmed.

> SMoveNativeRedisAcceptanceTest is failing with cluster is down.
> ---
>
> Key: GEODE-9856
> URL: https://issues.apache.org/jira/browse/GEODE-9856
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED
> 12:05:00     redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
> 12:05:00         at redis.clients.jedis.Protocol.processError(Protocol.java:125)
> 12:05:00         at redis.clients.jedis.Protocol.process(Protocol.java:169)
> 12:05:00         at redis.clients.jedis.Protocol.read(Protocol.java:223)
> 12:05:00         at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
> 12:05:00         at redis.clients.jedis.Connection.getIntegerReply(Connection.java:294)
> 12:05:00         at redis.clients.jedis.Jedis.sadd(Jedis.java:1391)
> 12:05:00         at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973)
> 12:05:00         at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970)
> 12:05:00         at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
> 12:05:00         at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
> 12:05:00         at redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975)
> 12:05:00         at org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112)
> 12:05:00         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 12:05:00         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 12:05:00         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 12:05:00         at java.lang.reflect.Method.invoke(Method.java:498)
> 12:05:00         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 12:05:00         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 12:05:00         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 12:05:00         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 12:05:00         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 12:05:00         at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 12:05:00         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 12:05:00         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 12:05:00         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 12:05:00         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 12:05:00         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 12:05:00         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 12:05:00         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> 12:05:00         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> 12:05:00         at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118)
> 12:05:00         at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 12:05:00         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> 12:05:00         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> 12:05:00         at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> 12:05:00         at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> 12:05:00         at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> 12:05:00         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> 12:05:00         at 
> 

[jira] [Created] (GEODE-9861) Windows: ResultModelTest.serializeFileToDownload Failed

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9861:
--

 Summary: Windows: ResultModelTest.serializeFileToDownload Failed
 Key: GEODE-9861
 URL: https://issues.apache.org/jira/browse/GEODE-9861
 Project: Geode
  Issue Type: Bug
  Components: tests
Affects Versions: 1.12.5
Reporter: Mark Hanson


{noformat}
org.apache.geode.management.internal.cli.result.model.ResultModelTest > serializeFileToDownload FAILED
    java.io.IOException: Access is denied
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1035)
        at org.junit.rules.TemporaryFolder.newFile(TemporaryFolder.java:67)
        at org.apache.geode.management.internal.cli.result.model.ResultModelTest.serializeFileToDownload(ResultModelTest.java:176)
 {noformat}





[jira] [Updated] (GEODE-9859) Mass-Test-Run: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED

2021-11-29 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9859:
---
Summary: Mass-Test-Run: 
WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) 
[0] FAILED  (was: 
WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) 
[0] FAILED)

> Mass-Test-Run: 
> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
> 
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Alberto Gomez
>Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to 
> GEODE-9369 as the most likely origination.
>  
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
>     java.lang.AssertionError: 
>     Expecting elements:
>       ["Execution failed. Error: 
> org.apache.geode.cache.EntryDestroyedException: 937"]
>     to have exactly 1 times execution error
>         at 
> org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
>  {noformat}





[jira] [Created] (GEODE-9860) NativeRedisRenameRedirectionsDUnitTest. initializationError

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9860:
--

 Summary: NativeRedisRenameRedirectionsDUnitTest. 
initializationError
 Key: GEODE-9860
 URL: https://issues.apache.org/jira/browse/GEODE-9860
 Project: Geode
  Issue Type: Bug
  Components: redis
Affects Versions: 1.15.0
Reporter: Mark Hanson


{noformat}
NativeRedisRenameRedirectionsDUnitTest > initializationError FAILED
    java.lang.RuntimeException: java.lang.NullPointerException
        at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:68)
        at org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60)
        at 
org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:53)
        at 
org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:285)
        at 
java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1597)
        at 
org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:265)
        at 
org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:179)
        at 
org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:84)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
        at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
        at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.Iterator.forEachRemaining(Iterator.java:116)
        at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
        at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
        at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
        at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
        at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
        at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
        at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
        at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
        at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
        at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
        at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
        at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
        at 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
        at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
        at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
        at 

[jira] [Assigned] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED

2021-11-29 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9859:
--

Assignee: Alberto Gomez

> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Alberto Gomez
>Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to 
> GEODE-9369 as the most likely origination.
>  
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
>     java.lang.AssertionError: 
>     Expecting elements:
>       ["Execution failed. Error: 
> org.apache.geode.cache.EntryDestroyedException: 937"]
>     to have exactly 1 times execution error
>         at 
> org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
>  {noformat}





[jira] [Updated] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED

2021-11-29 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9859:
---
Description: 
Looks like this might be failing from the original PR. I have linked to 
GEODE-9369 as the most likely origination.

 
{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
false) [0] FAILED
    java.lang.AssertionError: 
    Expecting elements:
      ["Execution failed. Error: 
org.apache.geode.cache.EntryDestroyedException: 937"]
    to have exactly 1 times execution error
        at 
org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
 {noformat}

  was:
{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
false) [0] FAILED
    java.lang.AssertionError: 
    Expecting elements:
      ["Execution failed. Error: 
org.apache.geode.cache.EntryDestroyedException: 937"]
    to have exactly 1 times execution error
        at 
org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
 {noformat}


> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to 
> GEODE-9369 as the most likely origination.
>  
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
>     java.lang.AssertionError: 
>     Expecting elements:
>       ["Execution failed. Error: 
> org.apache.geode.cache.EntryDestroyedException: 937"]
>     to have exactly 1 times execution error
>         at 
> org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
>  {noformat}





[jira] [Commented] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED

2021-11-29 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450670#comment-17450670
 ] 

Mark Hanson commented on GEODE-9859:


This test was failing under Windows on the original PR.

> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, 
> false) [0] FAILED
>     java.lang.AssertionError: 
>     Expecting elements:
>       ["Execution failed. Error: 
> org.apache.geode.cache.EntryDestroyedException: 937"]
>     to have exactly 1 times execution error
>         at 
> org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
>  {noformat}





[jira] [Created] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9859:
--

 Summary: 
WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) 
[0] FAILED
 Key: GEODE-9859
 URL: https://issues.apache.org/jira/browse/GEODE-9859
 Project: Geode
  Issue Type: Bug
  Components: wan
Affects Versions: 1.15.0
Reporter: Mark Hanson


{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
    java.lang.AssertionError: 
    Expecting elements:
      ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
    to have exactly 1 times execution error
        at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
 {noformat}





[jira] [Created] (GEODE-9858) Mass-Test-Run failure PingOpDistributedTest. memberShouldCorrectlyRedirectPingMessage

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9858:
--

 Summary: Mass-Test-Run failure PingOpDistributedTest. 
memberShouldCorrectlyRedirectPingMessage
 Key: GEODE-9858
 URL: https://issues.apache.org/jira/browse/GEODE-9858
 Project: Geode
  Issue Type: Bug
  Components: core
Affects Versions: 1.15.0
Reporter: Mark Hanson


{noformat}
java.lang.AssertionError: 
Expecting actual:
  1638052119621L
to be greater than:
  1638052119621L

at 
org.apache.geode.internal.cache.tier.sockets.PingOpDistributedTest.memberShouldCorrectlyRedirectPingMessage(PingOpDistributedTest.java:205)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
at 
org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
at 
org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
at 
org.apache.geode.test.junit.rules.serializable.SerializableTemporaryFolder$1.evaluate(SerializableTemporaryFolder.java:130)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
at 

[jira] [Created] (GEODE-9857) ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9857:
--

 Summary: ShowMissingDiskStoreCommandDUnitTest. 
stopAllMembersAndStart2ndLocator
 Key: GEODE-9857
 URL: https://issues.apache.org/jira/browse/GEODE-9857
 Project: Geode
  Issue Type: Bug
  Components: tests
Affects Versions: 1.15.0
Reporter: Mark Hanson


{noformat}
ShowMissingDiskStoreCommandDUnitTest > stopAllMembersAndStart2ndLocator FAILED
    org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest

    Expecting value to be true but was false within 5 minutes.
        at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
        at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
        at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
        at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.stopAllMembersAndStart2ndLocator(ShowMissingDiskStoreCommandDUnitTest.java:201)
        Caused by:
        org.opentest4j.AssertionFailedError:
        Expecting value to be true but was false
            at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at org.apache.geode.test.junit.rules.GfshCommandRule.connectAndVerify(GfshCommandRule.java:153)
            at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.lambda$stopAllMembersAndStart2ndLocator$3(ShowMissingDiskStoreCommandDUnitTest.java:201)
 {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (GEODE-9857) ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator

2021-11-29 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9857:
---
Labels: needsTriage  (was: )

> ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator
> --
>
> Key: GEODE-9857
> URL: https://issues.apache.org/jira/browse/GEODE-9857
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> ShowMissingDiskStoreCommandDUnitTest > stopAllMembersAndStart2ndLocator FAILED
>     org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest
>
>     Expecting value to be true but was false within 5 minutes.
>         at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
>         at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
>         at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
>         at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.stopAllMembersAndStart2ndLocator(ShowMissingDiskStoreCommandDUnitTest.java:201)
>         Caused by:
>         org.opentest4j.AssertionFailedError:
>         Expecting value to be true but was false
>             at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
>             at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>             at org.apache.geode.test.junit.rules.GfshCommandRule.connectAndVerify(GfshCommandRule.java:153)
>             at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.lambda$stopAllMembersAndStart2ndLocator$3(ShowMissingDiskStoreCommandDUnitTest.java:201)
>  {noformat}





[jira] [Updated] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.

2021-11-29 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9856:
---
Labels: needsTriage  (was: )

> SMoveNativeRedisAcceptanceTest is failing with cluster is down.
> ---
>
> Key: GEODE-9856
> URL: https://issues.apache.org/jira/browse/GEODE-9856
> Project: Geode
>  Issue Type: Bug
>  Components: redis
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> {noformat}
> SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED
> 12:05:00 redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
> 12:05:00     at redis.clients.jedis.Protocol.processError(Protocol.java:125)
> 12:05:00     at redis.clients.jedis.Protocol.process(Protocol.java:169)
> 12:05:00     at redis.clients.jedis.Protocol.read(Protocol.java:223)
> 12:05:00     at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
> 12:05:00     at redis.clients.jedis.Connection.getIntegerReply(Connection.java:294)
> 12:05:00     at redis.clients.jedis.Jedis.sadd(Jedis.java:1391)
> 12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973)
> 12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970)
> 12:05:00     at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
> 12:05:00     at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
> 12:05:00     at redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975)
> 12:05:00     at org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112)
> 12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 12:05:00     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 12:05:00     at java.lang.reflect.Method.invoke(Method.java:498)
> 12:05:00     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 12:05:00     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 12:05:00     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 12:05:00     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 12:05:00     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 12:05:00     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 12:05:00     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 12:05:00     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 12:05:00     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 12:05:00     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 12:05:00     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> 12:05:00     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> 12:05:00     at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118)
> 12:05:00     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 12:05:00     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> 12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> 12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> 12:05:00     at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> 12:05:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> 12:05:00     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> 12:05:00     at java.util.Iterator.forEachRemaining(Iterator.java:116)
> 12:05:00

[jira] [Created] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.

2021-11-29 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9856:
--

 Summary: SMoveNativeRedisAcceptanceTest is failing with cluster is 
down.
 Key: GEODE-9856
 URL: https://issues.apache.org/jira/browse/GEODE-9856
 Project: Geode
  Issue Type: Bug
  Components: redis
Affects Versions: 1.15.0
Reporter: Mark Hanson


{noformat}
SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED
12:05:00 redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
12:05:00     at redis.clients.jedis.Protocol.processError(Protocol.java:125)
12:05:00     at redis.clients.jedis.Protocol.process(Protocol.java:169)
12:05:00     at redis.clients.jedis.Protocol.read(Protocol.java:223)
12:05:00     at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
12:05:00     at redis.clients.jedis.Connection.getIntegerReply(Connection.java:294)
12:05:00     at redis.clients.jedis.Jedis.sadd(Jedis.java:1391)
12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973)
12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970)
12:05:00     at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
12:05:00     at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
12:05:00     at redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975)
12:05:00     at org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112)
12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
12:05:00     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
12:05:00     at java.lang.reflect.Method.invoke(Method.java:498)
12:05:00     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
12:05:00     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
12:05:00     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
12:05:00     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
12:05:00     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
12:05:00     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
12:05:00     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
12:05:00     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
12:05:00     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
12:05:00     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
12:05:00     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
12:05:00     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
12:05:00     at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118)
12:05:00     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
12:05:00     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
12:05:00     at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
12:05:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
12:05:00     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
12:05:00     at java.util.Iterator.forEachRemaining(Iterator.java:116)
12:05:00     at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
12:05:00     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
12:05:00     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
12:05:00     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
12:05:00     at

[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-17 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17445643#comment-17445643
 ] 

Mark Hanson commented on GEODE-8644:


 I have rerun this on a variety of cloud instances trying to reproduce this and 
I have not been successful. 

I think we may need to add more logging into the code so when it does fail we 
have more detail.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.
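
The change proposed in the description (wait for the queues to be empty, bounded by a long timeout, instead of sleeping for a fixed 2 seconds) can be sketched roughly as below. This is only an illustration in plain Java, not the actual test code: the class `DrainWait`, the method `awaitTrue`, and the `ConcurrentLinkedQueue` standing in for a gateway sender queue are all hypothetical names made up for this sketch. The real test would more likely use a library such as Awaitility, which already appears in the Geode test stack traces.

```java
import java.time.Duration;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not Geode code): poll a condition instead of sleeping a fixed time. */
final class DrainWait {
  private DrainWait() {}

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition held in time, false on timeout or interruption.
   */
  static boolean awaitTrue(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (!condition.getAsBoolean()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // deadline passed and the condition still does not hold
      }
      long remainingMillis = Math.max(1L, remainingNanos / 1_000_000L);
      try {
        // Never sleep past the deadline.
        Thread.sleep(Math.min(pollInterval.toMillis(), remainingMillis));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-in for a gateway sender queue; a background thread drains it.
    Queue<Integer> queue = new ConcurrentLinkedQueue<>();
    for (int i = 0; i < 1000; i++) {
      queue.add(i);
    }
    Thread drainer = new Thread(() -> {
      while (queue.poll() != null) {
        // discard the element
      }
    });
    drainer.start();

    // Returns as soon as the queue is empty; the long timeout only matters on failure.
    boolean drained = awaitTrue(queue::isEmpty, Duration.ofMinutes(5), Duration.ofMillis(50));
    System.out.println("queue drained: " + drained);
  }
}
```

With this shape a slow environment only makes the test take longer, and a queue that never drains still fails the test once the timeout expires.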





[jira] [Assigned] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone

2021-11-17 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9815:
--

Assignee: Mark Hanson

> Recovering persistent members can result in extra copies of a bucket or two 
> copies in the same redundancy zone
> ---
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.15.0
>Reporter: Dan Smith
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a 
> new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and 
> having persistent members.
> With persistence, when we start up a member with persisted buckets, we always 
> recover the persisted buckets on startup, regardless of whether redundancy is 
> already met or what zone the existing buckets are on. This is necessary to 
> ensure that we can recover all colocated buckets that might be persisted on 
> the member.
> Because recovering these persistent buckets may cause us to go over 
> redundancy, after we recover from disk, we run a "restore redundancy" task 
> that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a 
> bucket from one redundancy zone while leaving two copies in another 
> redundancy zone. It did so by disallowing the removal of a bucket if it is 
> the last copy in a redundancy zone.
> There are a couple of issues with this approach.
> *Problem 1:* We may end up with two copies of the bucket in one zone in some 
> cases
> With a slight tweak to the scenario fixed with GEODE-9554 we can end up never 
> getting out of the situation where we have two copies of a bucket in the same 
> zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on 
> members A1 and B1.
> 2. Shutdown member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shutdown B1. Revoke its disk store and delete the data.
> 5. Startup A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that 
> situation.
> *Problem 2:* We may never delete extra copies of a bucket
> The fix for GEODE-9554 introduces a new problem if we have more than 2 
> redundancy zones.
> Steps:
> 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 
> and B1
> 2. Shutdown A1
> 3. Rebalance -  this will create Bucket 0 on C1
> 4. Startup A1 - this will recreate bucket 0
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something other than preventing 
> removal of the last copy of a bucket from a redundancy zone. Instead, I think 
> we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* 
> buckets that have two copies in the same zone, as well as any buckets that 
> are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra 
> copies of a bucket in the same zone first
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from 
> a zone.
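
As a rough illustration of proposals 1 and 2 above, the check could look something like the following. This is a simplified model, not the real `PartitionRegionLoadModel`: the class `ZoneRedundancySketch`, its two methods, and the representation of a bucket as a list of zone names (one entry per hosted copy) are all assumptions made for the sketch. It also assumes that `redundantCopies` counts the extra copies beyond the first, so a value of 1 means two copies in total.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical model, not the real PartitionRegionLoadModel. */
final class ZoneRedundancySketch {
  private ZoneRedundancySketch() {}

  /** Returns the first zone that hosts more than one copy of a bucket, if any. */
  static Optional<String> duplicatedZone(List<String> zonesOfCopies) {
    Set<String> seen = new HashSet<>();
    for (String zone : zonesOfCopies) {
      if (!seen.add(zone)) {
        return Optional.of(zone); // a second copy in a zone already seen
      }
    }
    return Optional.empty();
  }

  /**
   * Returns the ids of buckets that need a copy removed: either more copies than
   * redundantCopies + 1, or two copies in the same zone. Sorted by bucket id.
   */
  static List<Integer> bucketsNeedingRemoval(
      Map<Integer, List<String>> zonesByBucket, int redundantCopies) {
    List<Integer> result = new ArrayList<>();
    for (Map.Entry<Integer, List<String>> entry : new TreeMap<>(zonesByBucket).entrySet()) {
      List<String> zones = entry.getValue();
      boolean overRedundancy = zones.size() > redundantCopies + 1;
      if (overRedundancy || duplicatedZone(zones).isPresent()) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Integer, List<String>> zonesByBucket = new TreeMap<>();
    zonesByBucket.put(0, Arrays.asList("A", "A"));      // Problem 1: copies on A1 and A2
    zonesByBucket.put(1, Arrays.asList("A", "B", "C")); // Problem 2: copies on A1, B1, C1
    zonesByBucket.put(2, Arrays.asList("A", "B"));      // healthy bucket
    System.out.println(bucketsNeedingRemoval(zonesByBucket, 1));
    System.out.println(duplicatedZone(zonesByBucket.get(0)));
  }
}
```

A removal step built on this would presumably consult `duplicatedZone` first, so that a same-zone duplicate is removed before any copy that is the only one in its zone.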





[jira] [Assigned] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly

2021-11-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-8644:
--

Assignee: Mark Hanson

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Benjamin P Ross
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test 
> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() 
> relies on a 2 second delay to allow for queues to finish draining after 
> finishing the put operation. If queues take longer than 2 seconds to drain 
> the test will fail. We should change the test to wait for the queues to be 
> empty with a long timeout in case the queues never fully drain.





[jira] [Assigned] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down

2021-11-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9425:
--

Assignee: (was: Mark Hanson)

> AutoConnectionSource thread in client can't query for available locators when 
> it is connected to a locator that was shut down
> -
>
> Key: GEODE-9425
> URL: https://issues.apache.org/jira/browse/GEODE-9425
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Lynn Gallinat
>Priority: Major
>
> The AutoConnectionSource thread runs in a client and queries the locator that 
> client is connected to so it can update the list of available locators.
>  But if the locator the client is connected to was shut down, the client 
> can't get an updated locator list.
>  In this case the locator was shut down and is not coming back, but there is 
> another available locator.
>  However we can't find out what that available locator is because we can't 
> complete the query.
> To summarize: The AutoConnectionSource thread that runs in a client to update 
> the list of available locators should be able to get a list of available 
> locators even when that client is connected to a locator that was shut down.
> The AutoConnectionSource thread starts and runs every 10 seconds. This is 
> from the client's system log.
>  [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 
>  tid=0x2d] AutoConnectionSource 
> UpdateLocatorListTask started with interval=1 ms.
> After the locator is shut down the AutoConnectionSource thread can't complete 
> its work so we get stuck threads.
> This stuck thread stack shows it is trying to run UpdateLocatorListTask.
> {noformat}
> clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT 
> clientgemfire1_host1_881  tid=0x36] Thread <286> (0x11e) that 
> was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 
> seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Executor Group 
> Monitored metric 
> Thread stack for "poolTimer-pool-24" (0x11e):
> java.lang.ThreadState: RUNNABLE (in native)
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> - locked java.net.SocksSocketImpl@3e95a505
>   at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>   at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:607)
>   at org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102)
>   at org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51)
>   at org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96)
>   at org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246)
>   at org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151)
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217)
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207)
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254)
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68)
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458)
>   at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
>   - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5
> {noformat}
> Impact on running cache operations:
>  Any operations in progress by the client connected to a locator that was 
> shut down can take 59 seconds to complete, which is the 

[jira] [Commented] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client

2021-11-11 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442142#comment-17442142
 ] 

Mark Hanson commented on GEODE-8616:


Added a couple of reproductions to the bug as attachments. These were 
reproduced using develop. 

> ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException fails with 
> ServerConnectivityException : Pool unexpected socket timed out on client
> --
>
> Key: GEODE-8616
> URL: https://issues.apache.org/jira/browse/GEODE-8616
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.12.1
>Reporter: Donal Evans
>Priority: Major
>  Labels: GeodeOperationAPI, flaky-test
> Attachments: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz, 
> hansonm-findfailures-11-10-2021-23-52-45-logs.tgz
>
>
> {noformat}
> > Task :geode-core:distributedTest
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException FAILED
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run in VM 2 running on Host c1346ab7b3e3 with 4 VMs
>     at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
>     at org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117)
> Caused by:
> org.apache.geode.cache.client.ServerConnectivityException: Pool unexpected socket timed out on client connection=Pooled Connection to c1346ab7b3e3:35437: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
>     at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774)
>     at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91)
>     at org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116)
>     at org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795)
>     at org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472)
>     at org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445)
>     at org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306)
>     at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436)
>     at org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120)
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz
> This is a flaky failure.





[jira] [Updated] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client

2021-11-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-8616:
---
Attachment: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz
hansonm-findfailures-11-10-2021-23-52-45-logs.tgz

> ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException fails with 
> ServerConnectivityException : Pool unexpected socket timed out on client
> --
>
> Key: GEODE-8616
> URL: https://issues.apache.org/jira/browse/GEODE-8616
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.12.1
>Reporter: Donal Evans
>Priority: Major
>  Labels: GeodeOperationAPI, flaky-test
> Attachments: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz, 
> hansonm-findfailures-11-10-2021-23-52-45-logs.tgz
>
>
> {noformat}
> > Task :geode-core:distributedTest
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException FAILED
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run in VM 2 running on Host c1346ab7b3e3 with 4 VMs
>     at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
>     at org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117)
> Caused by:
> org.apache.geode.cache.client.ServerConnectivityException: Pool unexpected socket timed out on client connection=Pooled Connection to c1346ab7b3e3:35437: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
>     at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774)
>     at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91)
>     at org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116)
>     at org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795)
>     at org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472)
>     at org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445)
>     at org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321)
>     at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306)
>     at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436)
>     at org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120)
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz
> This is a flaky failure.





[jira] [Assigned] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client

2021-11-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-8616:
--

Assignee: Mark Hanson

> ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException fails with 
> ServerConnectivityException : Pool unexpected socket timed out on client
> --
>
> Key: GEODE-8616
> URL: https://issues.apache.org/jira/browse/GEODE-8616
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.12.1
>Reporter: Donal Evans
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, flaky-test
>
> {noformat}
> > Task :geode-core:distributedTest
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run
>  in VM 2 running on Host c1346ab7b3e3 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117)
> Caused by:
> org.apache.geode.cache.client.ServerConnectivityException: Pool 
> unexpected socket timed out on client connection=Pooled Connection to 
> c1346ab7b3e3:35437: Connection[DESTROYED]). Server unreachable: could not 
> connect after 1 attempts
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774)
> at 
> org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91)
> at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116)
> at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795)
> at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472)
> at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445)
> at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306)
> at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436)
> at 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120)
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz
> This is a flaky failure.





[jira] [Assigned] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client

2021-11-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-8616:
--

Assignee: (was: Mark Hanson)

> ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException fails with 
> ServerConnectivityException : Pool unexpected socket timed out on client
> --
>
> Key: GEODE-8616
> URL: https://issues.apache.org/jira/browse/GEODE-8616
> Project: Geode
>  Issue Type: Bug
>Affects Versions: 1.12.1
>Reporter: Donal Evans
>Priority: Major
>  Labels: GeodeOperationAPI, flaky-test
>
> {noformat}
> > Task :geode-core:distributedTest
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > 
> largeObjectPutWithReadTimeoutThrowsException FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run
>  in VM 2 running on Host c1346ab7b3e3 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
> at 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117)
> Caused by:
> org.apache.geode.cache.client.ServerConnectivityException: Pool 
> unexpected socket timed out on client connection=Pooled Connection to 
> c1346ab7b3e3:35437: Connection[DESTROYED]). Server unreachable: could not 
> connect after 1 attempts
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153)
> at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774)
> at 
> org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91)
> at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116)
> at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795)
> at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472)
> at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445)
> at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321)
> at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306)
> at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436)
> at 
> org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120)
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz
> This is a flaky failure.





[jira] [Commented] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down

2021-11-03 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437768#comment-17437768
 ] 

Mark Hanson commented on GEODE-9425:


Are there more logs available?

> AutoConnectionSource thread in client can't query for available locators when 
> it is connected to a locator that was shut down
> -
>
> Key: GEODE-9425
> URL: https://issues.apache.org/jira/browse/GEODE-9425
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Lynn Gallinat
>Assignee: Mark Hanson
>Priority: Major
>
> The AutoConnectionSource thread runs in a client and queries the locator that 
> client is connected to so it can update the list of available locators.
> But if the locator the client is connected to was shut down, the client 
> can't get an updated locator list.
> In this case the locator was shut down and is not coming back, but there is 
> another available locator.
> However, we can't find out what that available locator is because we can't 
> complete the query.
> To summarize: The AutoConnectionSource thread that runs in a client to update 
> the list of available locators should be able to get a list of available 
> locators even when that client is connected to a locator that was shut down.
> The AutoConnectionSource thread starts and runs every 10 seconds. This is 
> from the client's system log.
>  [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 
>  tid=0x2d] AutoConnectionSource 
> UpdateLocatorListTask started with interval=1 ms.
> After the locator is shut down the AutoConnectionSource thread can't complete 
> its work so we get stuck threads.
> This stuck thread stack shows it is trying to run UpdateLocatorListTask.
> {noformat}
> clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT 
> clientgemfire1_host1_881  tid=0x36] Thread <286> (0x11e) that 
> was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 
> seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Executor Group 
> Monitored metric 
> Thread stack for "poolTimer-pool-24" (0x11e):
> java.lang.ThreadState: RUNNABLE (in native)
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> - locked java.net.SocksSocketImpl@3e95a505
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:607)
>   at 
> org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102)
>   at 
> org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51)
>   at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96)
>   at 
> org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246)
>   at 
> org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
>   - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5
> {noformat}
> Impact on running cache operations:
>  Any operations in progress by the client connected to a locator that 
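
For illustration only: the stack above shows the pool timer thread blocked in socketConnect while querying a locator that is gone. A minimal sketch of the behaviour the ticket asks for, assuming each locator is probed with a bounded connect timeout and the next one is tried on failure. LocatorProbe and firstReachable are hypothetical names, not Geode classes, and this is not the fix that was applied.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Geode.
class LocatorProbe {

  /**
   * Returns the first locator that accepts a TCP connection within
   * timeoutMillis, or an empty Optional if none does.
   * Note: a timeout of 0 means "wait indefinitely" for Socket.connect,
   * so callers should pass a positive value.
   */
  static Optional<InetSocketAddress> firstReachable(
      List<InetSocketAddress> locators, int timeoutMillis) {
    for (InetSocketAddress locator : locators) {
      try (Socket socket = new Socket()) {
        // Bounded connect: a dead locator costs at most timeoutMillis.
        socket.connect(locator, timeoutMillis);
        return Optional.of(locator);
      } catch (IOException e) {
        // Refused, timed out, or unresolved: fall through to the next locator.
      }
    }
    return Optional.empty();
  }
}
```

With a list such as [shut-down locator, live locator], the probe should give up on the first entry after the timeout and return the second, rather than leaving the thread stuck.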

[jira] [Assigned] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down

2021-10-25 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9425:
--

Assignee: Mark Hanson

> AutoConnectionSource thread in client can't query for available locators when 
> it is connected to a locator that was shut down
> -
>
> Key: GEODE-9425
> URL: https://issues.apache.org/jira/browse/GEODE-9425
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Lynn Gallinat
>Assignee: Mark Hanson
>Priority: Major
>
> The AutoConnectionSource thread runs in a client and queries the locator that 
> client is connected to so it can update the list of available locators.
> But if the locator the client is connected to was shut down, the client 
> can't get an updated locator list.
> In this case the locator was shut down and is not coming back, but there is 
> another available locator.
> However, we can't find out what that available locator is because we can't 
> complete the query.
> To summarize: The AutoConnectionSource thread that runs in a client to update 
> the list of available locators should be able to get a list of available 
> locators even when that client is connected to a locator that was shut down.
> The AutoConnectionSource thread starts and runs every 10 seconds. This is 
> from the client's system log.
>  [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 
>  tid=0x2d] AutoConnectionSource 
> UpdateLocatorListTask started with interval=1 ms.
> After the locator is shut down the AutoConnectionSource thread can't complete 
> its work so we get stuck threads.
> This stuck thread stack shows it is trying to run UpdateLocatorListTask.
> {noformat}
> clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT 
> clientgemfire1_host1_881  tid=0x36] Thread <286> (0x11e) that 
> was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 
> seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Executor Group 
> Monitored metric 
> Thread stack for "poolTimer-pool-24" (0x11e):
> java.lang.ThreadState: RUNNABLE (in native)
>   at java.net.PlainSocketImpl.socketConnect(Native Method)
>   at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> - locked java.net.SocksSocketImpl@3e95a505
>   at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>   at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>   at java.net.Socket.connect(Socket.java:607)
>   at 
> org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102)
>   at 
> org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51)
>   at 
> org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96)
>   at 
> org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246)
>   at 
> org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68)
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
>   - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5
> {noformat}
> Impact on running cache operations:
>  Any operations in progress by the client connected to a locator that was 
> shut down can take 59 seconds to 

[jira] [Resolved] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-10-18 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9645.

Fix Version/s: 1.15.0
   Resolution: Fixed

The code will not send DataSerializer registration notifications when using 
multiuser authentication.
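
For illustration only, a minimal sketch of the guard described above, assuming the background recovery task can ask the pool whether multiuser mode is on. The Pool interface, RecordingPool, and recover are hypothetical stand-ins, not Geode's PoolImpl or DataSerializerRecoveryListener.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; not Geode's actual classes.
class RecoverySketch {

  interface Pool {
    boolean isMultiuserSecureModeEnabled();

    void registerDataSerializers(List<String> serializerClassNames);
  }

  /** Simple in-memory pool that records what was sent to it. */
  static class RecordingPool implements Pool {
    final boolean multiuser;
    final List<String> sent = new ArrayList<>();

    RecordingPool(boolean multiuser) {
      this.multiuser = multiuser;
    }

    @Override
    public boolean isMultiuserSecureModeEnabled() {
      return multiuser;
    }

    @Override
    public void registerDataSerializers(List<String> serializerClassNames) {
      sent.addAll(serializerClassNames);
    }
  }

  /** Returns true if registrations were sent, false if they were skipped. */
  static boolean recover(Pool pool, List<String> serializers) {
    if (pool.isMultiuserSecureModeEnabled()) {
      // The background task has no per-user credentials, so skip the
      // notification instead of failing with UnsupportedOperationException.
      return false;
    }
    pool.registerDataSerializers(serializers);
    return true;
  }
}
```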

> MultiUserAuth: DataSerializerRecoveryListener is called without auth 
> information. Promptly fails
> 
>
> Key: GEODE-9645
> URL: https://issues.apache.org/jira/browse/GEODE-9645
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> When multiuserSecureModeEnabled is enabled, a user may register a 
> DataSerializer. When the endpoint manager detects a new endpoint, it will attempt 
> to register the data serializers with other machines. This is a problem because 
> there is no authentication information in the background process to 
> authenticate with. Hence the error seen below.
>  
> {noformat}
> [warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
> DataSerializerRecoveryTask - Error recovering dataSerializers: 
> java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
> when multiuser-secure-mode-enabled is set to true. 
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
>  
> at 
> org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size

2021-10-11 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9617:
--

Assignee: Mark Hanson

> CI Failure: PartitionedRegionSingleHopDUnitTest fails with 
> ConditionTimeoutException waiting for server to bucket map size
> --
>
> Key: GEODE-9617
> URL: https://issues.apache.org/jira/browse/GEODE-9617
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testClientMetadataForPersistentPrs FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService, 
> org.apache.geode.cache.client.internal.ClientMetadataServiceorg.apache.geode.cache.Region
>  
> Expecting actual not to be null within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971)
> Caused by:
> java.lang.AssertionError: 
> Expecting actual not to be null
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976)
> {noformat}
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testMetadataServiceCallAccuracy_FromGetOp FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e> within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394)
> Caused by:
> org.junit.ComparisonFailure: 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e>
> at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395)
> {noformat}





[jira] [Commented] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size

2021-10-11 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427402#comment-17427402
 ] 

Mark Hanson commented on GEODE-9617:


I did a little cleanup and added an assert that should help in future test 
runs. It actually moved the error to the latch assertions that I made, which 
means the failure was being missed before.
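
For illustration only, a minimal sketch of what a latch assertion of this kind can look like, assuming the test waits on a java.util.concurrent.CountDownLatch. LatchAssertions and assertLatchReleased are hypothetical names, not the code added to the test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not the actual test change.
class LatchAssertions {

  /**
   * Fails with an AssertionError if the latch does not reach zero in time.
   * Ignoring the boolean returned by await() would let the test continue
   * and fail later, at a less informative place.
   */
  static void assertLatchReleased(
      CountDownLatch latch, long timeout, TimeUnit unit, String what) {
    boolean released;
    try {
      released = latch.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new AssertionError("Interrupted while waiting for " + what, e);
    }
    if (!released) {
      throw new AssertionError(
          "Timed out waiting for " + what + "; remaining count=" + latch.getCount());
    }
  }
}
```

Checking the result of await() at the wait site is what moves the failure from a later timeout to the latch itself.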

> CI Failure: PartitionedRegionSingleHopDUnitTest fails with 
> ConditionTimeoutException waiting for server to bucket map size
> --
>
> Key: GEODE-9617
> URL: https://issues.apache.org/jira/browse/GEODE-9617
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.15.0
>Reporter: Kirk Lund
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testClientMetadataForPersistentPrs FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService, 
> org.apache.geode.cache.client.internal.ClientMetadataServiceorg.apache.geode.cache.Region
>  
> Expecting actual not to be null within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971)
> Caused by:
> java.lang.AssertionError: 
> Expecting actual not to be null
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976)
> {noformat}
> {noformat}
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > 
> testMetadataServiceCallAccuracy_FromGetOp FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses 
> org.apache.geode.cache.client.internal.ClientMetadataService 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e> within 5 minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394)
> Caused by:
> org.junit.ComparisonFailure: 
> Expecting value to be false but was true expected:<[fals]e> but 
> was:<[tru]e>
> at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395)
> {noformat}





[jira] [Updated] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-09-30 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9645:
---
Labels: pull-request-available  (was: needsTriage pull-request-available)

> MultiUserAuth: DataSerializerRecoveryListener is called without auth 
> information. Promptly fails
> 
>
> Key: GEODE-9645
> URL: https://issues.apache.org/jira/browse/GEODE-9645
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> When multiuserSecureModeEnabled is enabled, a user may register a 
> DataSerializer. When the endpoint manager detects a new endpoint, it will attempt 
> to register the data serializers with other machines. This is a problem because 
> there is no authentication information in the background process to 
> authenticate with. Hence the error seen below.
>  
> {noformat}
> [warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
> DataSerializerRecoveryTask - Error recovering dataSerializers: 
> java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
> when multiuser-secure-mode-enabled is set to true. 
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
>  
> at 
> org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){noformat}
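
The failure mode described above can be sketched with a toy model. This is only an illustration under stated assumptions: BackgroundRecoverySketch, executeOnPool and recoverOnNewEndpoint are hypothetical names, not Geode classes or methods, and the check merely imitates the behavior reported in the log (a call that carries no user identity is refused when multiuser secure mode is on).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Toy model only; none of these names are Geode APIs. */
public class BackgroundRecoverySketch {

  /** Hypothetical pool check: multiuser mode refuses calls without a user identity. */
  static void executeOnPool(boolean multiuserSecureMode, String user) {
    if (multiuserSecureMode && user == null) {
      throw new UnsupportedOperationException(
          "Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true.");
    }
  }

  /** Runs the recovery step on a background thread and reports what happened. */
  static String recoverOnNewEndpoint(boolean multiuserSecureMode) {
    ExecutorService background = Executors.newSingleThreadExecutor();
    try {
      Future<String> result = background.submit(() -> {
        try {
          // The background task acts for no particular application user, so it passes null.
          executeOnPool(multiuserSecureMode, null);
          return "recovered";
        } catch (UnsupportedOperationException e) {
          return "warn: Error recovering dataSerializers: " + e.getMessage();
        }
      });
      return result.get(5, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      background.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(recoverOnNewEndpoint(false));
    System.out.println(recoverOnNewEndpoint(true));
  }
}
```

In this model the task succeeds when multiuser secure mode is off and returns a warning when it is on, because nothing on the background thread can supply a user identity.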





[jira] [Updated] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-30 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9647:
---
Labels:   (was: needsTriage)

> MultiUserAuth: DataSerializer.Register throws when attempting to register a 
> new DataSerializer.
> ---
>
> Key: GEODE-9647
> URL: https://issues.apache.org/jira/browse/GEODE-9647
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>
> When multiuserSecureModeEnabled is set, a user may attempt to register a 
> DataSerializer, but will get the following error. The reason is that the 
> PoolImpl needs credentials to authenticate against, which it does not have.
>  
> {noformat}
> [warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
> instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs 
> for doing operations when multiuser-secure-mode-enabled is set to true.
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
>  
> at 
> org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
>  at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) 
> at 
> org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  
> at 
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>  
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
>  
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
>  
> at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  
> at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  
> at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
>  
> at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) 
> {noformat}





[jira] [Resolved] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-30 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9647.

Resolution: Duplicate

A single fix can address both of these related issues, so I am closing this 
issue as a duplicate.

> MultiUserAuth: DataSerializer.Register throws when attempting to register a 
> new DataSerializer.
> ---
>
> Key: GEODE-9647
> URL: https://issues.apache.org/jira/browse/GEODE-9647
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> When multiuserSecureModeEnabled is set, a user may attempt to register a 
> DataSerializer, but will get the following error. The reason is that the 
> PoolImpl needs credentials to authenticate against, which it does not have.
>  
> {noformat}
> [warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
> instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs 
> for doing operations when multiuser-secure-mode-enabled is set to true.
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
>  
> at 
> org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
>  at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) 
> at 
> org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  
> at 
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>  
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
>  
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
>  
> at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  
> at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  
> at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
>  
> at 

[jira] [Assigned] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9647:
--

Assignee: Mark Hanson

> MultiUserAuth: DataSerializer.Register throws when attempting to register a 
> new DataSerializer.
> ---
>
> Key: GEODE-9647
> URL: https://issues.apache.org/jira/browse/GEODE-9647
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> When multiuserSecureModeEnabled is set, a user may attempt to register a 
> DataSerializer, but will get the following error. The reason is that the 
> PoolImpl needs credentials to authenticate against, which it does not have.
>  
> {noformat}
> [warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
> instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs 
> for doing operations when multiuser-secure-mode-enabled is set to true.
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
>  
> at 
> org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
>  at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) 
> at 
> org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  
> at 
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>  
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
>  
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
>  
> at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  
> at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  
> at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
>  
> at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) 
> {noformat}





[jira] [Assigned] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-09-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9645:
--

Assignee: Mark Hanson

> MultiUserAuth: DataSerializerRecoveryListener is called without auth 
> information. Promptly fails
> 
>
> Key: GEODE-9645
> URL: https://issues.apache.org/jira/browse/GEODE-9645
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> When multiuserSecureModeEnabled is set, a user may register a 
> DataSerializer. When the endpoint manager detects a new endpoint, it will 
> attempt to register the data serializers with other machines. This is a 
> problem because the background process has no authentication information 
> to authenticate with. Hence the error seen below.
>  
> {noformat}
> [warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
> DataSerializerRecoveryTask - Error recovering dataSerializers: 
> java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
> when multiuser-secure-mode-enabled is set to true. 
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
>  
> at 
> org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){noformat}





[jira] [Commented] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-28 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421565#comment-17421565
 ] 

Mark Hanson commented on GEODE-9647:


The solution appears to be to plumb the regionService down into the call to 
register on the DataSerializer class. Doing so appears to alleviate the issue.
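
A minimal sketch of that idea, under loudly stated assumptions: ToyPool, ToyUserService and both register overloads are hypothetical stand-ins invented for illustration. They are not Geode classes and do not reflect the actual change; they only show how passing a user-scoped service down to the registration call lets the pool see credentials.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types are Geode classes. */
public class PlumbedRegistrationSketch {

  /** Hypothetical stand-in for a pool that may run in multiuser secure mode. */
  static class ToyPool {
    private final boolean multiuserSecureMode;
    final List<String> registered = new ArrayList<>();

    ToyPool(boolean multiuserSecureMode) {
      this.multiuserSecureMode = multiuserSecureMode;
    }

    /** Refuses calls that carry no user when multiuser secure mode is on. */
    void execute(String serializerName, String user) {
      if (multiuserSecureMode && user == null) {
        throw new UnsupportedOperationException(
            "Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true.");
      }
      registered.add(serializerName);
    }
  }

  /** Hypothetical per-user service that remembers which user it acts for. */
  static class ToyUserService {
    final ToyPool pool;
    final String user;

    ToyUserService(ToyPool pool, String user) {
      this.pool = pool;
      this.user = user;
    }
  }

  /** Unplumbed shape: no caller identity reaches the pool. */
  static void register(ToyPool pool, String serializerName) {
    pool.execute(serializerName, null);
  }

  /** Plumbed shape: the user-scoped service is passed down to the pool call. */
  static void register(ToyUserService service, String serializerName) {
    service.pool.execute(serializerName, service.user);
  }

  public static void main(String[] args) {
    ToyPool pool = new ToyPool(true);
    try {
      register(pool, "MySerializer");
    } catch (UnsupportedOperationException e) {
      System.out.println("unplumbed call rejected: " + e.getMessage());
    }
    register(new ToyUserService(pool, "user1"), "MySerializer");
    System.out.println("registered via user service: " + pool.registered);
  }
}
```

The design point is only that the identity travels with the call instead of being looked up from global state, which a pool in multiuser mode cannot do.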

> MultiUserAuth: DataSerializer.Register throws when attempting to register a 
> new DataSerializer.
> ---
>
> Key: GEODE-9647
> URL: https://issues.apache.org/jira/browse/GEODE-9647
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> When multiuserSecureModeEnabled is set, a user may attempt to register a 
> DataSerializer, but will get the following error. The reason is that the 
> PoolImpl needs credentials to authenticate against, which it does not have.
>  
> {noformat}
> [warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
> instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs 
> for doing operations when multiuser-secure-mode-enabled is set to true.
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
>  
> at 
> org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
>  
> at 
> org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
>  at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) 
> at 
> org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
>  
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  
> at 
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>  
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  
> at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
> at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> at 
> org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
>  
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
>  
> at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  
> at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  
> at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
>  
> at 

[jira] [Updated] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-28 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9647:
---
Description: 
When multiuserSecureModeEnabled is set, a user may attempt to register a 
DataSerializer, but will get the following error. The reason is that the 
PoolImpl needs credentials to authenticate against, which it does not have.

 
{noformat}
[warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs for 
doing operations when multiuser-secure-mode-enabled is set to true.
at 
org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
 
at 
org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
at 
org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
 
at 
org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
 
at 
org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
 
at 
org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
 
at 
org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
 at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) 
at 
org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:498) 
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 
at 
org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
 
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
 
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
 
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
 
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at 
org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
 
at 
org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
 
at org.junit.rules.RunRules.evaluate(RunRules.java:20) 
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 
at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
 
at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 
at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
 
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) {noformat}

  was:
When multiuserSecureModeEnabled is set, a user may attempt to register a 
DataSerializer, but will get the following error. The reason is that the 
PoolImpl needs credentials to authenticate against, which it does not have.

 
{noformat}
[warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering 
instantiator on pool:[warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error 
registering instantiator on pool:java.lang.UnsupportedOperationException: Use 
Pool APIs for doing operations when multiuser-secure-mode-enabled is set to 
true. at 
org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
 at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) 
at 

[jira] [Created] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.

2021-09-28 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9647:
--

 Summary: MultiUserAuth: DataSerializer.Register throws when 
attempting to register a new DataSerializer.
 Key: GEODE-9647
 URL: https://issues.apache.org/jira/browse/GEODE-9647
 Project: Geode
  Issue Type: Bug
  Components: core
Affects Versions: 1.15.0
Reporter: Mark Hanson


When multiuserSecureModeEnabled is set, a user may attempt to register a 
DataSerializer, but will get the following error. The reason is that the 
PoolImpl needs credentials to authenticate against, which it does not have.

 
{noformat}
[warn 2021/09/28 10:32:42.470 PDT   tid=0x1] Error registering instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true.
    at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
    at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800)
    at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34)
    at org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264)
    at org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197)
    at org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093)
    at org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966)
    at org.apache.geode.DataSerializer.register(DataSerializer.java:2900)
    at org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40)
    at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
{noformat}





[jira] [Updated] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-09-27 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9645:
---
Description: 
When multiuserSecureModeEnabled is set, a user may register a 
DataSerializer. When the endpoint manager detects a new endpoint, it will 
attempt to register the data serializers with other machines. This is a 
problem because the background process has no authentication information 
to authenticate with. Hence the error seen below.

 
{noformat}
[warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
DataSerializerRecoveryTask - Error recovering dataSerializers: 
java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
when multiuser-secure-mode-enabled is set to true. 
at 
org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
 
at 
org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
at 
org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
 
at 
org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
 
at 
org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748){noformat}

  was:
When multiuserSecureModeEnabled is enabled, a user may register a 
DataSerializer. When the endpoint manager detects a new endpoint, it will attempt 
to register the data serializers with other machines. This is a problem because 
there is no authentication information in the background process to 
authenticate with. Hence the error seen below.

 
{noformat}
[warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
DataSerializerRecoveryTask - Error recovering dataSerializers: [warn 2021/09/27 
18:03:02.804 PDT   tid=0x62] DataSerializerRecoveryTask - 
Error recovering dataSerializers: java.lang.UnsupportedOperationException: Use 
Pool APIs for doing operations when multiuser-secure-mode-enabled is set to 
true. at 
org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
 at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
at 
org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
 at 
org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
 at 
org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748){noformat}


> MultiUserAuth: DataSerializerRecoveryListener is called without auth 
> information. Promptly fails
> 
>
> Key: GEODE-9645
> URL: https://issues.apache.org/jira/browse/GEODE-9645
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Reporter: Mark Hanson
>Priority: Major
>  Labels: needsTriage
>
> When multiuserSecureModeEnabled is enabled, a user may register a 
> DataSerializer. When the endpoint manager detects a new endpoint, it will attempt 
> to register the data serializers with other machines. This is a problem because 
> there is no authentication information in the background process to 
> authenticate with. Hence the error seen below.
>  
> {noformat}
> [warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
> DataSerializerRecoveryTask - Error recovering dataSerializers: 
> java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
> when multiuser-secure-mode-enabled is set to true. 
> at 
> org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) 
> at 
> org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
>  
> at 
> org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
>  
> at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> at java.lang.Thread.run(Thread.java:748){noformat}





[jira] [Created] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails

2021-09-27 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9645:
--

 Summary: MultiUserAuth: DataSerializerRecoveryListener is called 
without auth information. Promptly fails
 Key: GEODE-9645
 URL: https://issues.apache.org/jira/browse/GEODE-9645
 Project: Geode
  Issue Type: Bug
  Components: core
Reporter: Mark Hanson


When multiuserSecureModeEnabled is enabled, a user may register a 
DataSerializer. When the endpoint manager detects a new endpoint, it will attempt 
to register the data serializers with other machines. This is a problem because 
there is no authentication information in the background process to 
authenticate with. Hence the error seen below.

 
{noformat}
[warn 2021/09/27 18:03:02.804 PDT   tid=0x62] 
DataSerializerRecoveryTask - Error recovering dataSerializers: 
java.lang.UnsupportedOperationException: Use Pool APIs for doing operations 
when multiuser-secure-mode-enabled is set to true. 
at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540)
at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816)
at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40)
at org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116)
at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){noformat}





[jira] [Resolved] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds

2021-09-20 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9365.

Fix Version/s: 1.15.0
   Resolution: Fixed

This fix reduces the number of semaphores used in the HARegionQueue while 
waiting for permission to add more items to the queue.

It reduces two semaphores to one and switches several fields to Atomics.

> HARegionQueue over throttles when multiple threads attempt concurrent adds
> --
>
> Key: GEODE-9365
> URL: https://issues.apache.org/jira/browse/GEODE-9365
> Project: Geode
>  Issue Type: Bug
>  Components: client queues
>Reporter: Darrel Schneider
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI, pull-request-available
> Fix For: 1.15.0
>
>
> HARegionQueue.checkQueueSizeConstraint has some code that implements a 
> "throttle" on adds to a queue that is full. It is supposed to wait 
> "eventEnqueueWaitTime" before doing an add. But because this code does two 
> syncs (putGuard and permitMon) and only waits on one of them, it holds the 
> other sync for the duration of this thread's throttle. Any other concurrent 
> thread trying to add to the queue gets stuck on the putGuard sync that is 
> held by the first thread that is doing the timed wait. So it ends up waiting 
> "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting 
> again "eventEnqueueWaitTime" when it does its own timed wait. If you have 10 
> concurrent threads trying to add, one of them will end up waiting 10 * 
> "eventEnqueueWaitTime".
> A couple of ideas for how to fix this. Get rid of the putGuard and just use 
> permitMon. Then as soon as the first thread goes into its timed wait another 
> thread is allowed to sync on permitMon. But if this is done then we need to 
> think carefully about the code inside this sync block since it can not be 
> executed while one or more other threads are waiting in permitMon.
> The other solution would be to compute the elapsed time it took to get into 
> the first sync and subtract that from the time we wait on permitMon. This 
> seems like a simple solution but does introduce at least one call of get time 
> (the second call is only needed if the queue is full).
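
The "other solution" described above (subtracting the time spent acquiring 
the first sync from the timed wait) can be sketched roughly as follows. This 
is only an illustration of that idea, not the change that was merged: the 
class name, the method names and the queueIsFull parameter are hypothetical, 
and only putGuard, permitMon and eventEnqueueWaitTime come from the ticket.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: class and method names are illustrative, not Geode's.
final class ThrottleBudgetSketch {
  private final Object putGuard = new Object();  // outer sync named in the ticket
  private final Object permitMon = new Object(); // monitor used for the timed wait
  private final long eventEnqueueWaitTimeMillis;

  ThrottleBudgetSketch(long eventEnqueueWaitTimeMillis) {
    this.eventEnqueueWaitTimeMillis = eventEnqueueWaitTimeMillis;
  }

  // Wait budget left after subtracting the time already spent; never negative.
  static long remainingWaitMillis(long budgetMillis, long startNanos, long nowNanos) {
    long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos);
    return Math.max(0L, budgetMillis - elapsedMillis);
  }

  // Returns the number of milliseconds this call asked to wait on permitMon.
  long throttle(boolean queueIsFull) throws InterruptedException {
    long start = System.nanoTime(); // first time call, always made
    synchronized (putGuard) {
      if (!queueIsFull) {
        return 0L;
      }
      // Second time call, only made when the queue is full.
      long remaining =
          remainingWaitMillis(eventEnqueueWaitTimeMillis, start, System.nanoTime());
      if (remaining > 0L) { // Object.wait(0) would block indefinitely
        synchronized (permitMon) {
          permitMon.wait(remaining);
        }
      }
      return remaining;
    }
  }
}
```

With this shape, a thread that already spent most of its budget blocked on 
putGuard waits only for the remainder, so the total delay per add stays close 
to one eventEnqueueWaitTime instead of growing with the number of threads.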





[jira] [Created] (GEODE-9603) BlockingHARegionJUnitTest is in need of a refactor. It is poorly written by current standards.

2021-09-14 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9603:
--

 Summary: BlockingHARegionJUnitTest is in need of a refactor. It is 
poorly written by current standards.
 Key: GEODE-9603
 URL: https://issues.apache.org/jira/browse/GEODE-9603
 Project: Geode
  Issue Type: Improvement
  Components: tests
Affects Versions: 1.15.0
Reporter: Mark Hanson


Both exception and thread handling could use some modernization...





[jira] [Resolved] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail

2021-09-10 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9554.

Fix Version/s: 1.15.0
   1.14.1
   1.13.5
   1.12.5
   Resolution: Fixed

The fix for this issue was to ensure that we do not delete the last copy of 
a bucket in a redundancy zone. 
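
As a rough illustration of that rule (not the actual Geode code), the check 
can be thought of as counting the copies of a bucket per zone before removing 
one. The class, the method and the member-to-zone map below are hypothetical 
and exist only for this sketch.

```java
import java.util.Map;

// Hypothetical sketch: the map layout and names are illustrative, not Geode's API.
final class ZoneAwareRemovalSketch {
  // hostZones maps each member hosting a copy of the bucket to its redundancy zone.
  static boolean canRemoveCopy(Map<String, String> hostZones, String candidateMember) {
    String zone = hostZones.get(candidateMember);
    if (zone == null) {
      return false; // candidate does not host a copy (or has no zone recorded)
    }
    long copiesInZone = hostZones.values().stream().filter(zone::equals).count();
    return copiesInZone > 1; // refuse to delete the last copy in that zone
  }
}
```

For a bucket hosted on two members in one zone and one member in another, 
either of the first two copies may be removed, but the single copy in the 
second zone may not.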

> Rebalancing a region with multiple redundancy zones can fail
> 
>
> Key: GEODE-9554
> URL: https://issues.apache.org/jira/browse/GEODE-9554
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.5, 1.13.5, 1.14.1, 1.15.0
>
>
> When attempting to rebalance a region with multiple redundancy zones, the 
> code does not distinguish between zones when deleting redundant bucket 
> copies. This can mean that a bucket from a different zone gets deleted 
> leaving the servers in a state of reduced redundancy.





[jira] [Commented] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail

2021-09-08 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412119#comment-17412119
 ] 

Mark Hanson commented on GEODE-9554:


I have a new fix that should address this issue better, plus some additional 
testing. The core change is to make sure that we don't delete the last copy in 
a redundancy zone.

> Rebalancing a region with multiple redundancy zones can fail
> 
>
> Key: GEODE-9554
> URL: https://issues.apache.org/jira/browse/GEODE-9554
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> When attempting to rebalance a region with multiple redundancy zones, the 
> code does not distinguish between zones when deleting redundant bucket 
> copies. This can mean that a bucket from a different zone gets deleted 
> leaving the servers in a state of reduced redundancy.





[jira] [Created] (GEODE-9584) PartitionedRegionLoadModel.createRedundantBucket

2021-09-07 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9584:
--

 Summary: PartitionedRegionLoadModel.createRedundantBucket
 Key: GEODE-9584
 URL: https://issues.apache.org/jira/browse/GEODE-9584
 Project: Geode
  Issue Type: Improvement
  Components: tests
Affects Versions: 1.15.0
Reporter: Mark Hanson


PartitionedRegionLoadModel.createRedundantBucket needs unit testing. It does 
not currently have any JUnit tests.





[jira] [Commented] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail

2021-09-02 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17408994#comment-17408994
 ] 

Mark Hanson commented on GEODE-9554:


I have a new fix for this that will address all of the known cases, but the 
problem right now is getting the test to pass consecutively. I am going to get 
some help on this.

> Rebalancing a region with multiple redundancy zones can fail
> 
>
> Key: GEODE-9554
> URL: https://issues.apache.org/jira/browse/GEODE-9554
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> When attempting to rebalance a region with multiple redundancy zones, the 
> code does not distinguish between zones when deleting redundant bucket 
> copies. This can mean that a bucket from a different zone gets deleted 
> leaving the servers in a state of reduced redundancy.





[jira] [Assigned] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail

2021-08-30 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9554:
--

Assignee: Mark Hanson

> Rebalancing a region with multiple redundancy zones can fail
> 
>
> Key: GEODE-9554
> URL: https://issues.apache.org/jira/browse/GEODE-9554
> Project: Geode
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
>
> When attempting to rebalance a region with multiple redundancy zones, the 
> code does not distinguish between zones when deleting redundant bucket 
> copies. This can mean that a bucket from a different zone gets deleted 
> leaving the servers in a state of reduced redundancy.





[jira] [Created] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail

2021-08-26 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9554:
--

 Summary: Rebalancing a region with multiple redundancy zones can 
fail
 Key: GEODE-9554
 URL: https://issues.apache.org/jira/browse/GEODE-9554
 Project: Geode
  Issue Type: Bug
  Components: core
Affects Versions: 1.13.4, 1.12.4, 1.14.0, 1.15.0
Reporter: Mark Hanson


When attempting to rebalance a region with multiple redundancy zones, the code 
does not distinguish between zones when deleting redundant bucket copies. This 
can mean that a bucket from a different zone gets deleted leaving the servers 
in a state of reduced redundancy.





[jira] [Updated] (GEODE-9520) create index gfsh command behavior seems inconsistent with other commands

2021-08-18 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson updated GEODE-9520:
---
Description: 
The way the command currently works, it uses the group of the region 
to determine the target members to perform the operation on. When you create a 
region, it doesn't need the group or anything; it just works. It seems like create 
index should be able to find the region on a member without an issue.  

 

  was:
When you create a cluster from an XML with no group specified, then try to 
create an index in a subregion that exists, the command will error out with 
"Region root/transRegion does not exist." The region exists, but in the 
background the command is using the group as a way to find the target members. 
Because there is no group, there are no target members and the command cannot 
complete. The problem is that the error message is not good.

 

I suggest changing the error message to say something more useful like

"Could not find the region abc in a group. Please specify a target member or 
target members."

 

Summary: create index gfsh command behavior seems inconsistent with 
other commands  (was: Error message is not useful when trying to create index 
after using a cache xml for startup)

> create index gfsh command behavior seems inconsistent with other commands
> -
>
> Key: GEODE-9520
> URL: https://issues.apache.org/jira/browse/GEODE-9520
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0
>Reporter: Mark Hanson
>Priority: Major
>
> The way the command currently works, it uses the group of the region 
> to determine the target members to perform the operation on. When you create 
> a region, it doesn't need the group or anything; it just works. It seems like 
> create index should be able to find the region on a member without an issue.  
>  





[jira] [Created] (GEODE-9520) Error message is not useful when trying to create index after using a cache xml for startup

2021-08-18 Thread Mark Hanson (Jira)
Mark Hanson created GEODE-9520:
--

 Summary: Error message is not useful when trying to create index 
after using a cache xml for startup
 Key: GEODE-9520
 URL: https://issues.apache.org/jira/browse/GEODE-9520
 Project: Geode
  Issue Type: Bug
  Components: gfsh
Affects Versions: 1.13.4, 1.12.4, 1.14.0, 1.15.0
Reporter: Mark Hanson


When you create a cluster from an XML with no group specified, then try to 
create an index in a subregion that exists, the command will error out with 
"Region root/transRegion does not exist." The region exists, but in the 
background the command is using the group as a way to find the target members. 
Because there is no group, there are no target members and the command cannot 
complete. The problem is that the error message is not good.

 

I suggest changing the error message to say something more useful like

"Could not find the region abc in a group. Please specify a target member or 
target members."

 





[jira] [Commented] (GEODE-9490) CI failure: NativeRedisSessionAcceptanceTest > executionError

2021-08-11 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397481#comment-17397481
 ] 

Mark Hanson commented on GEODE-9490:


Ignore ^ build 108 failure note.

> CI failure: NativeRedisSessionAcceptanceTest > executionError
> -
>
> Key: GEODE-9490
> URL: https://issues.apache.org/jira/browse/GEODE-9490
> Project: Geode
>  Issue Type: Test
>  Components: redis, tests
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>  Labels: pull-request-available
>
> {noformat}
> NativeRedisSessionAcceptanceTest > executionError FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-local.log' at line 1611
> [error 2021/08/05 23:35:01.484 UTC  tid=78] Failed to 
> return response on inboundChannel
> io.netty.channel.StacklessClosedChannelException
>   at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, 
> ChannelPromise)(Unknown Source)
> at org.junit.Assert.fail(Assert.java:89)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409)
> at 
> org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70)
> at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.Iterator.forEachRemaining(Iterator.java:116)
> at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
> at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
> at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
> at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
> at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
> at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
> at 
> org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
> at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
>  

[jira] [Resolved] (GEODE-9194) Move PR clear related statistics to the appropriate classes

2021-07-13 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson resolved GEODE-9194.

Fix Version/s: 1.15.0
 Assignee: Mark Hanson
   Resolution: Fixed

This has been merged to feature/GEODE-7665

> Move PR clear related statistics to the appropriate classes
> ---
>
> Key: GEODE-9194
> URL: https://issues.apache.org/jira/browse/GEODE-9194
> Project: Geode
>  Issue Type: New Feature
>  Components: statistics
>Reporter: Mark Hanson
>Assignee: Mark Hanson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Currently there are PR clear statistics that are not a part of the 
> Partitioned Region Stats. This feature work is to track the movement of those 
> stats.





[jira] [Commented] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds

2021-07-09 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378311#comment-17378311
 ] 

Mark Hanson commented on GEODE-9365:


I am testing the "other solution". That seems like a simple patch. The larger 
questions I think are interesting and I will need to investigate further.

> HARegionQueue over throttles when multiple threads attempt concurrent adds
> --
>
> Key: GEODE-9365
> URL: https://issues.apache.org/jira/browse/GEODE-9365
> Project: Geode
>  Issue Type: Bug
>  Components: client queues
>Reporter: Darrel Schneider
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI
>
> HARegionQueue.checkQueueSizeConstraint has some code that implements a 
> "throttle" on adds to a queue that is full. It is supposed to wait 
> "eventEnqueueWaitTime" before doing an add. But because this code does two 
> syncs (putGuard and permitMon) and only waits on one of them, it holds the 
> other sync for the duration of this thread's throttle. Any other concurrent 
> thread trying to add to the queue gets stuck on the putGuard sync that is 
> held by the first thread that is doing the timed wait. So it ends up waiting 
> "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting 
> again "eventEnqueueWaitTime" when it does its own timed wait. If you have 10 
> concurrent threads trying to add, one of them will end up waiting 10 * 
> "eventEnqueueWaitTime".
> A couple of ideas for how to fix this. Get rid of the putGuard and just use 
> permitMon. Then as soon as the first thread goes into its timed wait another 
> thread is allowed to sync on permitMon. But if this is done then we need to 
> think carefully about the code inside this sync block since it can not be 
> executed while one or more other threads are waiting in permitMon.
> The other solution would be to compute the elapsed time it took to get into 
> the first sync and subtract that from the time we wait on permitMon. This 
> seems like a simple solution but does introduce at least one call of get time 
> (the second call is only needed if the queue is full).





[jira] [Commented] (GEODE-8064) DeploymentSemanticVersionJarDUnitTest.java (GEODE-7421) is failing.

2021-07-09 Thread Mark Hanson (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378295#comment-17378295
 ] 

Mark Hanson commented on GEODE-8064:


Another issue with this test 
{noformat}
org.apache.geode.management.internal.rest.DeploymentSemanticVersionJarDUnitTest 
> deploySameJarNameWithDifferentContent FAILED
    java.lang.AssertionError: Suspicious strings were written to the log during 
this run.
    Fix the strings or use IgnoredException.addIgnoredException to ignore.
    ---
    Found suspect string in 'dunit_suspect-vm0.log' at line 763
    [unprintable binary data from the uploaded jar, including the entries 
    ddunit/function/Def.class and timestamp]
    --KpEt0WRhuP7_uJjerp4keHy2JOGeQ6
    Content-Disposition: form-data; name="config"
    Content-Type: application/json
        at org.junit.Assert.fail(Assert.java:89)
        at 
org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409)
        at 
org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425)
        at 
org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186)
        at 
org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70)
        at 
org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
        at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
        at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
        at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
        at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:566)
        at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
        at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
        at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
        at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
        at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
        at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119)
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:566)
        at 

[jira] [Assigned] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds

2021-07-08 Thread Mark Hanson (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hanson reassigned GEODE-9365:
--

Assignee: Mark Hanson

> HARegionQueue over throttles when multiple threads attempt concurrent adds
> --
>
> Key: GEODE-9365
> URL: https://issues.apache.org/jira/browse/GEODE-9365
> Project: Geode
>  Issue Type: Bug
>  Components: client queues
>Reporter: Darrel Schneider
>Assignee: Mark Hanson
>Priority: Major
>  Labels: GeodeOperationAPI
>
> HARegionQueue.checkQueueSizeConstraint has some code that implements a 
> "throttle" on adds to a queue that is full. It is supposed to wait 
> "eventEnqueueWaitTime" before doing an add. But because this code does two 
> syncs (putGuard and permitMon) and only waits on one of them, it holds the 
> other sync for the duration of this thread's throttle. Any other concurrent 
> thread trying to add to the queue gets stuck on the putGuard sync that is 
> held by the first thread that is doing the timed wait. So it ends up waiting 
> "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting 
> again "eventEnqueueWaitTime" when it does its own timed wait. If you have 10 
> concurrent threads trying to add, one of them will end up waiting 10 * 
> "eventEnqueueWaitTime".
> A couple of ideas for how to fix this. Get rid of the putGuard and just use 
> permitMon. Then as soon as the first thread goes into its timed wait another 
> thread is allowed to sync on permitMon. But if this is done then we need to 
> think carefully about the code inside this sync block since it can not be 
> executed while one or more other threads are waiting in permitMon.
> The other solution would be to compute the elapsed time it took to get into 
> the first sync and subtract that from the time we wait on permitMon. This 
> seems like a simple solution but does introduce at least one call of get time 
> (the second call is only needed if the queue is full).




