[jira] [Updated] (GEODE-9204) A not serializable exception can cause a ServerConnection thread to get stuck waiting for a reply from another member

2021-04-28 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9204:
--
Summary: A not serializable exception can cause a ServerConnection thread 
to get stuck waiting for a reply from another member  (was: A not serializable 
object can cause a ServerConnection thread to get stuck waiting for a reply 
from another member)

> A not serializable exception can cause a ServerConnection thread to get stuck 
> waiting for a reply from another member
> -
>
> Key: GEODE-9204
> URL: https://issues.apache.org/jira/browse/GEODE-9204
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> A test case that reproduces it is:
> - a client get request is received in one server and sent to another server
> - the other server uses a CacheLoader to load the value
> - the CacheLoader throws an exception containing a non-serializable object
> - the reply attempts to serialize that exception but fails with 
> NotSerializableException
> - the original server's ServerConnection thread gets stuck waiting for a 
> reply that will never come
> Here is a stack trace showing the NotSerializableException:
> {noformat}
> [severe 2018/03/20 14:30:27.793 PDT   elgreco(85544):30177 unshared ordered uid=14 dom #1 port=53923> 
> tid=0x5c] Uncaught exception processing  partitioned.GetMessage(prid=2 (name 
> = "/data") processorId=0; posDup=false; key=0; callback arg=null; 
> context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
> org.apache.geode.InternalGemFireException: java.io.NotSerializableException: 
> java.lang.Object
>   at 
> org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
>   at 
> org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
>   at 
> org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
>   at 
> org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
>   at 
> org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
>   at 
> org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
>   at 
> org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
>   at 
> org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
>   at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.NotSerializableException: java.lang.Object
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>   at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
>   at java.lang.Throwable.writeObject(Throwable.java:985)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Resolved] (GEODE-9139) SSLException in starting up a Locator

2021-04-28 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-9139.
---
Resolution: Fixed

> SSLException in starting up a Locator
> -
>
> Key: GEODE-9139
> URL: https://issues.apache.org/jira/browse/GEODE-9139
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> If you start up a locator using its host name, without a domain name, as a 
> bind address, you may get an SSLException of the form
> {noformat}
> javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
> No subject alternative DNS name matching hostname.domainname found
> {noformat}
> The LocatorLauncher and InternalLocator throw away the bind address string 
> and later do a reverse lookup to find the fully qualified hostname to use in 
> endpoint identification matching. If the locator's own TLS certificate 
> doesn't have the fully qualified name in it as a Subject Alternative Name, 
> the connection that the Locator makes to its own location service will fail.
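
The mismatch described above can be illustrated with a minimal sketch. It is not Geode or JSSE code: the class `SanMatchDemo`, the method `matchesAnySan`, and the host names `locator1` and `locator1.example.com` are hypothetical, and the comparison is a simplified exact, case-insensitive match (real endpoint identification also handles wildcards and IP addresses).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class SanMatchDemo {

  /**
   * Simplified stand-in for endpoint identification: returns true only if the
   * expected host name equals one of the certificate's DNS subject alternative
   * names, ignoring case. Wildcards and IP SANs are deliberately left out.
   */
  static boolean matchesAnySan(String expectedHost, List<String> sanDnsNames) {
    String expected = expectedHost.toLowerCase(Locale.ROOT);
    for (String san : sanDnsNames) {
      if (san.toLowerCase(Locale.ROOT).equals(expected)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical certificate that lists only the simple host name.
    List<String> sans = Arrays.asList("locator1");

    // The bind address string the user supplied matches the certificate.
    System.out.println(matchesAnySan("locator1", sans));

    // The fully qualified name found by a reverse lookup does not.
    System.out.println(matchesAnySan("locator1.example.com", sans));
  }
}
```

Under these assumptions, keeping the original bind address string for matching would succeed, while the name produced by the reverse lookup would be rejected.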



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member

2021-04-28 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9204:
--
Issue Type: Bug  (was: Test)

> A not serializable object can cause a ServerConnection thread to get stuck 
> waiting for a reply from another member
> --
>
> Key: GEODE-9204
> URL: https://issues.apache.org/jira/browse/GEODE-9204
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> A test case that reproduces it is:
> - a client get request is received in one server and sent to another server
> - the other server uses a CacheLoader to load the value
> - the CacheLoader throws an exception containing a non-serializable object
> - the reply attempts to serialize that exception but fails with 
> NotSerializableException
> - the original server's ServerConnection thread gets stuck waiting for a 
> reply that will never come
> Here is a stack trace showing the NotSerializableException:
> {noformat}
> [severe 2018/03/20 14:30:27.793 PDT   elgreco(85544):30177 unshared ordered uid=14 dom #1 port=53923> 
> tid=0x5c] Uncaught exception processing  partitioned.GetMessage(prid=2 (name 
> = "/data") processorId=0; posDup=false; key=0; callback arg=null; 
> context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
> org.apache.geode.InternalGemFireException: java.io.NotSerializableException: 
> java.lang.Object
>   at 
> org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
>   at 
> org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
>   at 
> org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
>   at 
> org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
>   at 
> org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
>   at 
> org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
>   at 
> org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
>   at 
> org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
>   at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.NotSerializableException: java.lang.Object
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>   at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
>   at java.lang.Throwable.writeObject(Throwable.java:985)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
>   at 
> 

[jira] [Assigned] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member

2021-04-28 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9204:
-

Assignee: Bruce J Schuchardt

> A not serializable object can cause a ServerConnection thread to get stuck 
> waiting for a reply from another member
> --
>
> Key: GEODE-9204
> URL: https://issues.apache.org/jira/browse/GEODE-9204
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> A test case that reproduces it is:
> - a client get request is received in one server and sent to another server
> - the other server uses a CacheLoader to load the value
> - the CacheLoader throws an exception containing a non-serializable object
> - the reply attempts to serialize that exception but fails with 
> NotSerializableException
> - the original server's ServerConnection thread gets stuck waiting for a 
> reply that will never come
> Here is a stack trace showing the NotSerializableException:
> {noformat}
> [severe 2018/03/20 14:30:27.793 PDT   elgreco(85544):30177 unshared ordered uid=14 dom #1 port=53923> 
> tid=0x5c] Uncaught exception processing  partitioned.GetMessage(prid=2 (name 
> = "/data") processorId=0; posDup=false; key=0; callback arg=null; 
> context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
> org.apache.geode.InternalGemFireException: java.io.NotSerializableException: 
> java.lang.Object
>   at 
> org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
>   at 
> org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
>   at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
>   at 
> org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
>   at 
> org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
>   at 
> org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
>   at 
> org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
>   at 
> org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
>   at 
> org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
>   at 
> org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
>   at 
> org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
>   at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.NotSerializableException: java.lang.Object
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>   at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>   at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at 
> java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
>   at java.lang.Throwable.writeObject(Throwable.java:985)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
>   

[jira] [Created] (GEODE-9204) A not serializable object can cause a ServerConnection thread to get stuck waiting for a reply from another member

2021-04-28 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9204:
-

 Summary: A not serializable object can cause a ServerConnection 
thread to get stuck waiting for a reply from another member
 Key: GEODE-9204
 URL: https://issues.apache.org/jira/browse/GEODE-9204
 Project: Geode
  Issue Type: Test
  Components: membership, messaging
Reporter: Bruce J Schuchardt


A test case that reproduces it is:

- a client get request is received in one server and sent to another server
- the other server uses a CacheLoader to load the value
- the CacheLoader throws an exception containing a non-serializable object
- the reply attempts to serialize that exception but fails with 
NotSerializableException
- the original server's ServerConnection thread gets stuck waiting for a reply 
that will never come
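
The serialization failure in the steps above can be reproduced outside Geode with plain Java serialization. The following is a minimal sketch under stated assumptions: `NonSerializableReplyDemo`, `LoaderFailure`, `serializationProblem` and `replyable` are hypothetical names, and `replyable` shows only one conceivable mitigation (sending a serializable stand-in so that a reply still goes out), not the fix adopted for this issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class NonSerializableReplyDemo {

  /** Hypothetical loader failure: Throwable is Serializable, but this field's value is not. */
  static final class LoaderFailure extends RuntimeException {
    private static final long serialVersionUID = 1L;
    private final Object payload = new Object(); // java.lang.Object is not Serializable

    LoaderFailure(String message) {
      super(message);
    }
  }

  /** Returns the name of the offending class, or null if the throwable serializes cleanly. */
  static String serializationProblem(Throwable t) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(t);
      return null;
    } catch (NotSerializableException e) {
      return e.getMessage(); // by default the message is the non-serializable class name
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Assumed mitigation: replace an unserializable throwable with a serializable stand-in. */
  static Throwable replyable(Throwable t) {
    if (serializationProblem(t) == null) {
      return t;
    }
    RuntimeException standIn =
        new RuntimeException(t.getClass().getName() + ": " + t.getMessage());
    standIn.setStackTrace(t.getStackTrace());
    return standIn;
  }

  public static void main(String[] args) {
    Throwable failure = new LoaderFailure("load failed");
    System.out.println(serializationProblem(failure));
    System.out.println(serializationProblem(replyable(failure)));
  }
}
```

The first call reports `java.lang.Object` as the class that could not be written, matching the `Caused by` line in the trace; the stand-in serializes without a problem.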

Here is a stack trace showing the NotSerializableException:
{noformat}
[severe 2018/03/20 14:30:27.793 PDT  :30177 unshared ordered uid=14 dom #1 port=53923> tid=0x5c] 
Uncaught exception processing  partitioned.GetMessage(prid=2 (name = "/data") 
processorId=0; posDup=false; key=0; callback arg=null; 
context=identity(elgreco(client:85552:loner):53907:fce35145:client,connection=2)
org.apache.geode.InternalGemFireException: java.io.NotSerializableException: 
java.lang.Object
at 
org.apache.geode.internal.tcp.DirectReplySender.putOutgoing(DirectReplySender.java:76)
at 
org.apache.geode.distributed.internal.ReplyMessage.send(ReplyMessage.java:109)
at 
org.apache.geode.internal.cache.partitioned.PartitionMessage.sendReply(PartitionMessage.java:392)
at 
org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:376)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
at 
org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:449)
at 
org.apache.geode.distributed.internal.DistributionManager.scheduleIncomingMessage(DistributionManager.java:3872)
at 
org.apache.geode.distributed.internal.DistributionManager.handleIncomingDMsg(DistributionManager.java:3496)
at 
org.apache.geode.distributed.internal.DistributionManager$MyListener.messageReceived(DistributionManager.java:4693)
at 
org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.processMessage(JGroupMembershipManager.java:2128)
at 
org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager.handleOrDeferMessage(JGroupMembershipManager.java:2037)
at 
org.apache.geode.distributed.internal.membership.jgroup.JGroupMembershipManager$MyDCReceiver.messageReceived(JGroupMembershipManager.java:647)
at 
org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:804)
at 
org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:835)
at 
org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3932)
at 
org.apache.geode.internal.tcp.Connection.processNIOBuffer(Connection.java:3515)
at 
org.apache.geode.internal.tcp.Connection.runNioReader(Connection.java:1827)
at org.apache.geode.internal.tcp.Connection.run(Connection.java:1702)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.NotSerializableException: java.lang.Object
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at 
java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
at java.lang.Throwable.writeObject(Throwable.java:985)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
at 
java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at 
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at 
java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:441)
at 

[jira] [Updated] (GEODE-9139) SSLException in starting up a Locator

2021-04-27 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9139:
--
Fix Version/s: 1.15.0

> SSLException in starting up a Locator
> -
>
> Key: GEODE-9139
> URL: https://issues.apache.org/jira/browse/GEODE-9139
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> If you start up a locator using its host name, without a domain name, as a 
> bind address, you may get an SSLException of the form
> {noformat}
> javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
> No subject alternative DNS name matching hostname.domainname found
> {noformat}
> The LocatorLauncher and InternalLocator throw away the bind address string 
> and later do a reverse lookup to find the fully qualified hostname to use in 
> endpoint identification matching. If the locator's own TLS certificate 
> doesn't have the fully qualified name in it as a Subject Alternative Name, 
> the connection that the Locator makes to its own location service will fail.





[jira] [Updated] (GEODE-6413) CI Failure: Bind Exception during ClusterCommunicationsDUnitTest.performARollingUpgrade

2021-04-27 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-6413:
--
Component/s: membership

> CI Failure: Bind Exception during 
> ClusterCommunicationsDUnitTest.performARollingUpgrade
> ---
>
> Key: GEODE-6413
> URL: https://issues.apache.org/jira/browse/GEODE-6413
> Project: Geode
>  Issue Type: Bug
>  Components: membership, tests
>Reporter: Benjamin P Ross
>Priority: Major
> Fix For: 1.9.0
>
>
> Stack Trace: 
> {noformat}
> org.apache.geode.ClusterCommunicationsDUnitTest > 
> performARollingUpgrade[SHARED_CONNECTIONS] FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.NamedRunnable.run in VM 0 running on Host 
> c4dd6cb2c206 with 3 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:579)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:393)
> at 
> org.apache.geode.ClusterCommunicationsDUnitTest.performARollingUpgrade(ClusterCommunicationsDUnitTest.java:214)
> Caused by:
> java.net.BindException: Failed to create server socket on 
> c4dd6cb2c206/172.17.0.16[43969]
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:756)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:714)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:680)
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.initializeServerSocket(TcpServer.java:225)
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.startServerThread(TcpServer.java:215)
> at 
> org.apache.geode.distributed.internal.tcpserver.TcpServer.start(TcpServer.java:210)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startTcpServer(InternalLocator.java:501)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startPeerLocation(InternalLocator.java:557)
> at 
> org.apache.geode.distributed.internal.InternalLocator.startLocator(InternalLocator.java:340)
> at 
> org.apache.geode.distributed.Locator.startLocator(Locator.java:252)
> at 
> org.apache.geode.distributed.Locator.startLocatorAndDS(Locator.java:139)
> at 
> org.apache.geode.ClusterCommunicationsDUnitTest.lambda$null$1(ClusterCommunicationsDUnitTest.java:220)
> Caused by:
> java.net.BindException: Address already in use (Bind failed)
> at java.net.PlainSocketImpl.socketBind(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> at java.net.ServerSocket.bind(ServerSocket.java:375)
> at 
> org.apache.geode.internal.net.SocketCreator.createServerSocket(SocketCreator.java:753)
> ... 11 more
> {noformat}
> This test may be fixed with a longer await() timeout.
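
The innermost `BindException` in the trace above is what the JDK raises when a server socket is bound to a port that another socket is still listening on. The sketch below only demonstrates that condition. It does not reproduce the test failure, and `PortInUseDemo` and `secondBindFails` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

final class PortInUseDemo {

  /** Binds an ephemeral loopback port, then tries to bind the same port a second time. */
  static boolean secondBindFails() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
      int port = first.getLocalPort(); // the port the OS actually assigned
      try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
        return false; // unexpected: the port was free after all
      } catch (BindException e) {
        return true; // the port is still held by the first socket
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(secondBindFails());
  }
}
```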





[jira] [Assigned] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9141:
-

Assignee: Bill Burcham  (was: Bruce J Schuchardt)

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.13.2, 1.14.0, 1.15.0
>Reporter: Bruce J Schuchardt
>Assignee: Bill Burcham
>Priority: Major
>  Labels: blocks-1.14.0, blocks-1.15.0, pull-request-available
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  The failure always occurs during CacheServer teardown, 
> while destroying an HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked 

[jira] [Assigned] (GEODE-7607) Create a concurrent-startup membership test outside of geode-core

2021-04-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-7607:
-

Assignee: (was: Bruce J Schuchardt)

> Create a concurrent-startup membership test outside of geode-core
> -
>
> Key: GEODE-7607
> URL: https://issues.apache.org/jira/browse/GEODE-7607
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> There is currently a test in MembershipJUnitTest that spins up a Locator and 
> two Membership services in the same JVM.  Use this to springboard a new test 
> that concurrently starts up two Membership services, each hosting a TcpServer 
> peer-to-peer location service.





[jira] [Assigned] (GEODE-6222) CI Failure: GemFireDeadlockDetectorDUnitTest

2021-04-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-6222:
-

Assignee: (was: Bruce J Schuchardt)

> CI Failure: GemFireDeadlockDetectorDUnitTest
> 
>
> Key: GEODE-6222
> URL: https://issues.apache.org/jira/browse/GEODE-6222
> Project: Geode
>  Issue Type: Bug
>  Components: distributed lock service
>Affects Versions: 1.9.0
>Reporter: Ken Howe
>Priority: Major
>  Labels: flaky
>
> Flaky test failure in 
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/247]
> {code:java}
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest
>  > testDistributedDeadlockWithDLock FAILED
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.geode.distributed.internal.deadlock.GemFireDeadlockDetectorDUnitTest.testDistributedDeadlockWithDLock(GemFireDeadlockDetectorDUnitTest.java:199)
> {code}





[jira] [Assigned] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections

2021-04-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9135:
-

Assignee: (was: Bruce J Schuchardt)

> Remove reverse DNS lookup in Connection.java for accepted connections
> -
>
> Key: GEODE-9135
> URL: https://issues.apache.org/jira/browse/GEODE-9135
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> Prior to the introduction of SSLEngine use in the 
> org.apache.geode.internal.tcp package we used SSLSockets.  During a handshake 
> we would set the SNIHostName on the client side of the connection and have it 
> validate the hostname returned by the server side of the handshake.
> When we introduced SSLEngine we changed this to set the SNIHostName on both 
> sides.  We should revert this so that the SNIHostName is set only on the client side.
> The server side of the connection does not have a hostname for the client 
> side of the connection in this case and it is currently doing a reverse DNS 
> lookup to get the name.  That's a potentially expensive operation, and even 
> then we don't know whether to use the fully qualified domain name (FQDN) or a 
> simple host name.  This matters because endpoint verification requires that 
> the name we choose be presented in the certificate of the other server.  If 
> we choose the FQDN and the cert only has a simple host name the handshake 
> will fail.
> SSLEngine requires a host name when it's constructed but most algorithms 
> don't use it.  Documentation mentions Kerberos possibly needing it, so we'd 
> have to have a way for the reverse lookup to be enabled or find some other 
> way to get the host name, like SocketCreator.getHostName()'s reverse-lookup 
> cache.
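
A sketch of the client-only SNI setup described above, using only standard `javax.net.ssl` calls. `SniSetup`, `withSni`, and `configure` are illustrative names, not the code in Connection.java, and how Geode obtains `serverHost` is not shown here.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

final class SniSetup {
  /**
   * Sets the SNI host name only when acting as the TLS client. The accepting
   * side leaves the parameters untouched, so it never needs a peer host name
   * (and therefore no reverse DNS lookup) for this purpose.
   */
  static SSLParameters withSni(SSLParameters params, boolean clientMode, String serverHost) {
    if (clientMode && serverHost != null) {
      params.setServerNames(
          Collections.<SNIServerName>singletonList(new SNIHostName(serverHost)));
    }
    return params;
  }

  /** Applies the rule above to an engine; intended to run before the handshake starts. */
  static void configure(SSLEngine engine, boolean clientMode, String serverHost) {
    engine.setUseClientMode(clientMode);
    engine.setSSLParameters(withSni(engine.getSSLParameters(), clientMode, serverHost));
  }
}
```

Keeping the decision in one small method makes it easy to unit-test the "client only" rule without opening a socket.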





[jira] [Updated] (GEODE-9128) Remove host name look-up from JGAddress

2021-04-20 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9128:
--
Fix Version/s: 1.14.0

> Remove host name look-up from JGAddress
> ---
>
> Key: GEODE-9128
> URL: https://issues.apache.org/jira/browse/GEODE-9128
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.3, 1.14.0, 1.15.0
>
>
> The method JGAddress.toString() contains a host name lookup that should be 
> removed.  It should just log the toString of its ip_addr field, not 
> ip_addr.getHostName().  That method can cause a reverse-DNS lookup, which is 
> needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}
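
For illustration, a small sketch of formatting an address without a name-service lookup. `AddressFormat` and `describe` are hypothetical names, not part of JGAddress. It uses `InetAddress.getHostAddress()`, which returns the textual IP address; `InetAddress.toString()`, which the description suggests, likewise performs no reverse lookup for an unresolved address.

```java
import java.net.InetAddress;

final class AddressFormat {
  /** Formats "ip:port" using only data already held by the InetAddress. */
  static String describe(InetAddress addr, int port) {
    // getHostAddress() never consults DNS, unlike getHostName().
    String host = (addr == null) ? "<null>" : addr.getHostAddress();
    return host + ":" + port;
  }
}
```

The trade-off is that log lines show raw IP addresses instead of host names, which is usually acceptable for a `toString()` used in logging.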





[jira] [Closed] (GEODE-9145) update CODEOWNERS

2021-04-15 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt closed GEODE-9145.
-

> update CODEOWNERS
> -
>
> Key: GEODE-9145
> URL: https://issues.apache.org/jira/browse/GEODE-9145
> Project: Geode
>  Issue Type: Task
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> remove bschuchardt from CODEOWNERS





[jira] [Resolved] (GEODE-9145) update CODEOWNERS

2021-04-15 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-9145.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> update CODEOWNERS
> -
>
> Key: GEODE-9145
> URL: https://issues.apache.org/jira/browse/GEODE-9145
> Project: Geode
>  Issue Type: Task
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> remove bschuchardt from CODEOWNERS





[jira] [Updated] (GEODE-9128) Remove host name look-up from JGAddress

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9128:
--
Fix Version/s: 1.13.3

> Remove host name look-up from JGAddress
> ---
>
> Key: GEODE-9128
> URL: https://issues.apache.org/jira/browse/GEODE-9128
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.3, 1.15.0
>
>
> The method JGAddress.toString() contains a host name lookup that should be 
> removed.  It should just log the toString of its ip_addr field, not 
> ip_addr.getHostName().  That method can cause a reverse-DNS lookup, which is 
> needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}





[jira] [Resolved] (GEODE-9128) Remove host name look-up from JGAddress

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-9128.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

PR for backport to 1.14 is up so I'm closing this ticket.

> Remove host name look-up from JGAddress
> ---
>
> Key: GEODE-9128
> URL: https://issues.apache.org/jira/browse/GEODE-9128
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The method JGAddress.toString() contains a host name lookup that should be 
> removed.  It should just log the toString of its ip_addr field, not 
> ip_addr.getHostName().  That method can cause a reverse-DNS lookup, which is 
> needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}





[jira] [Assigned] (GEODE-9128) Remove host name look-up from JGAddress

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9128:
-

Assignee: Bruce J Schuchardt

> Remove host name look-up from JGAddress
> ---
>
> Key: GEODE-9128
> URL: https://issues.apache.org/jira/browse/GEODE-9128
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
>
> The method JGAddress.toString() contains a host name lookup that should be 
> removed.  It should just log the toString of its ip_addr field, not 
> ip_addr.getHostName().  That method can cause a reverse-DNS lookup, which is 
> needlessly expensive for a toString() operation.
> {code:java}
>   public String toString() {
>     StringBuilder sb = new StringBuilder();
>     if (ip_addr == null)
>       sb.append("<null>");
>     else {
>       sb.append(ip_addr.getHostName());
>     }
>     if (vmViewId >= 0) {
>       sb.append("<v").append(vmViewId).append('>');
>     }
>     if (SHOW_UUIDS) {
>       sb.append("(").append(toStringLong()).append(")");
>     } else if (mostSigBits == 0 && leastSigBits == 0) {
>       sb.append("(no uuid set)");
>     }
>     sb.append(":").append(port);
>     return sb.toString();
>   }
> {code}





[jira] [Updated] (GEODE-9145) update CODEOWNERS

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9145:
--
Component/s: membership

> update CODEOWNERS
> -
>
> Key: GEODE-9145
> URL: https://issues.apache.org/jira/browse/GEODE-9145
> Project: Geode
>  Issue Type: Task
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> remove bschuchardt from CODEOWNERS





[jira] [Updated] (GEODE-9145) update CODEOWNERS

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9145:
--
Issue Type: Task  (was: Test)

> update CODEOWNERS
> -
>
> Key: GEODE-9145
> URL: https://issues.apache.org/jira/browse/GEODE-9145
> Project: Geode
>  Issue Type: Task
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> remove bschuchardt from CODEOWNERS





[jira] [Created] (GEODE-9145) update CODEOWNERS

2021-04-13 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9145:
-

 Summary: update CODEOWNERS
 Key: GEODE-9145
 URL: https://issues.apache.org/jira/browse/GEODE-9145
 Project: Geode
  Issue Type: Test
Reporter: Bruce J Schuchardt


remove bschuchardt from CODEOWNERS





[jira] [Assigned] (GEODE-9145) update CODEOWNERS

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9145:
-

Assignee: Bruce J Schuchardt

> update CODEOWNERS
> -
>
> Key: GEODE-9145
> URL: https://issues.apache.org/jira/browse/GEODE-9145
> Project: Geode
>  Issue Type: Test
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> remove bschuchardt from CODEOWNERS





[jira] [Reopened] (GEODE-5940) ServerLauncherRemoteIntegrationTest times out waiting for server to start

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reopened GEODE-5940:
---

This problem has returned in this run:
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/WindowsCoreIntegrationTestOpenJDK8/builds/141

> ServerLauncherRemoteIntegrationTest times out waiting for server to start
> -
>
> Key: GEODE-5940
> URL: https://issues.apache.org/jira/browse/GEODE-5940
> Project: Geode
>  Issue Type: Test
>Reporter: Dale Emery
>Assignee: Kirk Lund
>Priority: Major
>  Labels: swat
> Fix For: 1.11.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.8.0-build.50/test-results/integrationTest/1540492907/classes/org.apache.geode.distributed.ServerLauncherRemoteIntegrationTest.html#startOverwritesStalePidFile
> {noformat}
> org.awaitility.core.ConditionTimeoutException: Assertion condition defined as 
> a lambda expression in 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase that 
> uses org.apache.geode.distributed.ServerLauncher expected:<[online]> but 
> was:<[not responding]> within 300 seconds.
>   at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
>   at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
>   at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
>   at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:890)
>   at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:711)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:200)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:178)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.awaitStart(ServerLauncherRemoteIntegrationTestCase.java:189)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.startServer(ServerLauncherRemoteIntegrationTestCase.java:128)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTestCase.startServer(ServerLauncherRemoteIntegrationTestCase.java:124)
>   at 
> org.apache.geode.distributed.ServerLauncherRemoteIntegrationTest.startOverwritesStalePidFile(ServerLauncherRemoteIntegrationTest.java:91)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.Verifier$1.evaluate(Verifier.java:35)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> 

[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9141:
--
Labels: blocks-1.14.0 blocks-1.15.0  (was: blocks-1.15.0)

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.13.2, 1.14.0, 1.15.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: blocks-1.14.0, blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  It is always during CacheServer teardown when 
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked <0xf5a21a08> (a 

[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9141:
--
Labels: blocks-1.15.0  (was: )

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  It is always during CacheServer teardown when 
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> 

[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9141:
--
Affects Version/s: 1.15.0
   1.14.0
   1.13.2

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.13.2, 1.14.0, 1.15.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: blocks-1.15.0
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  It is always during CacheServer teardown when 
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked 

[jira] [Updated] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-13 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9141:
--
Issue Type: Bug  (was: Test)

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  It is always during CacheServer teardown when 
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> 

[jira] [Assigned] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-12 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9141:
-

Assignee: Bruce J Schuchardt

> Hang while shutting down a cache server due to corrupted message
> 
>
> Key: GEODE-9141
> URL: https://issues.apache.org/jira/browse/GEODE-9141
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> We have a test that fails once in 5000 runs with a corrupted 
> DestroyRegionMessage.  It is always during CacheServer teardown when 
> destroying a HARegionQueue Region.
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
>   - locked <0xf5f7b888> (a java.lang.Object)
>   at 
> org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
>   - locked <0xf7ef2980> (a 
> org.apache.geode.internal.cache.CacheServerImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
>   - locked <0xf5a21a08> (a java.lang.Class for 
> org.apache.geode.internal.cache.GemFireCacheImpl)
>   at 
> 

[jira] [Created] (GEODE-9141) Hang while shutting down a cache server due to corrupted message

2021-04-12 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9141:
-

 Summary: Hang while shutting down a cache server due to corrupted 
message
 Key: GEODE-9141
 URL: https://issues.apache.org/jira/browse/GEODE-9141
 Project: Geode
  Issue Type: Test
  Components: membership, messaging
Reporter: Bruce J Schuchardt


We have a test that fails once in 5000 runs with a corrupted 
DestroyRegionMessage.  It is always during CacheServer teardown when destroying 
a HARegionQueue Region.

{noformat}
"vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xf4f654f8> (a 
java.util.concurrent.CountDownLatch$Sync)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at 
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
at 
org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
at 
org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
at 
org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
at 
org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
at 
org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
at 
org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
- locked <0xf8022800> (a 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
- locked <0xf5f7b888> (a java.lang.Object)
at 
org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
- locked <0xf7ef2980> (a 
org.apache.geode.internal.cache.CacheServerImpl)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
- locked <0xf5a21a08> (a java.lang.Class for 
org.apache.geode.internal.cache.GemFireCacheImpl)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2151)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
- locked <0xf5a21a08> (a java.lang.Class for 
org.apache.geode.internal.cache.GemFireCacheImpl)
at 
org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1257)
at hydra.RemoteTestModule$2.run(RemoteTestModule.java:388)
{noformat}
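
For readers unfamiliar with this kind of trace: the thread is parked in a timed CountDownLatch.await() that is retried until every expected reply has arrived. If a receiver cannot process the request (presumably the case with a corrupted message), no reply is sent and the sender keeps waiting. The following is only a rough sketch of that pattern under those assumptions. ReplyWaitSketch, waitForReplies, sliceMillis and maxSlices are hypothetical names, this is not the ReplyProcessor21 code, and the sketch gives up after a fixed number of attempts so that it terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a timed reply wait; not Geode's ReplyProcessor21. */
class ReplyWaitSketch {

  /**
   * Waits for the latch to reach zero, waking up every sliceMillis to report progress.
   * Returns true if all replies arrived, false if maxSlices timed waits elapsed first.
   */
  static boolean waitForReplies(CountDownLatch pendingReplies, long sliceMillis, int maxSlices)
      throws InterruptedException {
    for (int slice = 1; slice <= maxSlices; slice++) {
      // A timed await parks the thread; a thread dump taken here shows
      // TIMED_WAITING (parking) on CountDownLatch$Sync.
      if (pendingReplies.await(sliceMillis, TimeUnit.MILLISECONDS)) {
        return true;
      }
      System.out.println((slice * sliceMillis) + " ms have elapsed while waiting for "
          + pendingReplies.getCount() + " replies");
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch lost = new CountDownLatch(1); // the reply never arrives
    System.out.println("lost reply -> " + waitForReplies(lost, 20, 3));

    CountDownLatch answered = new CountDownLatch(1);
    answered.countDown(); // the reply has arrived
    System.out.println("answered -> " + waitForReplies(answered, 20, 3));
  }
}
```

A waiter without the maxSlices bound would loop like this indefinitely, which matches a shutdown that never completes.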

Another server logs this corrupted message.  It is almost always the same 
corruption.  When it is not, we see a corrupted message header rather than a 
bad DSFID.

{noformat}
[fatal 2021/03/06 09:45:02.796 PST 

[jira] [Updated] (GEODE-9139) SSLException in starting up a Locator

2021-04-12 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9139:
--
Issue Type: Bug  (was: Test)

> SSLException in starting up a Locator
> -
>
> Key: GEODE-9139
> URL: https://issues.apache.org/jira/browse/GEODE-9139
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> If you start up a locator using its host name, without a domain name, as a 
> bind address you may get an SSLException in the form
> {noformat}
> javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
> No subject alternative DNS name matching hostname.domainname found
> {noformat}
> The LocatorLauncher and InternalLocator throw away the bind address string 
> and later do a reverse lookup to find the fully qualified hostname to use in 
> endpoint identification matching.  If the locator's own TLS certificate 
> doesn't have the fully qualified name in it as a Subject Alternative Name, the 
> connection that the Locator makes to its own location service will fail.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-9139) SSLException in starting up a Locator

2021-04-12 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9139:
-

Assignee: Bruce J Schuchardt

> SSLException in starting up a Locator
> -
>
> Key: GEODE-9139
> URL: https://issues.apache.org/jira/browse/GEODE-9139
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> If you start up a locator using its host name, without a domain name, as a 
> bind address you may get an SSLException in the form
> {noformat}
> javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
> No subject alternative DNS name matching hostname.domainname found
> {noformat}
> The LocatorLauncher and InternalLocator throw away the bind address string 
> and later do a reverse lookup to find the fully qualified hostname to use in 
> endpoint identification matching.  If the locator's own TLS certificate 
> doesn't have the fully qualified name in it as a Subject Alternative Name, the 
> connection that the Locator makes to its own location service will fail.





[jira] [Created] (GEODE-9139) SSLException in starting up a Locator

2021-04-12 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9139:
-

 Summary: SSLException in starting up a Locator
 Key: GEODE-9139
 URL: https://issues.apache.org/jira/browse/GEODE-9139
 Project: Geode
  Issue Type: Test
  Components: membership, messaging
Reporter: Bruce J Schuchardt


If you start up a locator using its host name, without a domain name, as a bind 
address you may get an SSLException in the form

{noformat}
javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
No subject alternative DNS name matching hostname.domainname found
{noformat}

The LocatorLauncher and InternalLocator throw away the bind address string and 
later do a reverse lookup to find the fully qualified hostname to use in 
endpoint identification matching.  If the locator's own TLS certificate 
doesn't have the fully qualified name in it as a Subject Alternative Name, the 
connection that the Locator makes to its own location service will fail.
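
To illustrate the mismatch, here is a deliberately simplified sketch of DNS name matching. It is an assumption-laden illustration, not the JDK's endpoint identification code: it does exact, case-insensitive comparison only and ignores wildcards. SanMatchSketch and matchesAnySan are hypothetical names, and the host names are the placeholders from the exception text above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of exact-match DNS name checking; real JSSE also handles wildcards. */
class SanMatchSketch {

  /** Returns true if the name used for endpoint identification appears among the DNS SANs. */
  static boolean matchesAnySan(String identificationName, List<String> dnsSans) {
    String wanted = identificationName.toLowerCase(Locale.ROOT);
    for (String san : dnsSans) {
      if (san.toLowerCase(Locale.ROOT).equals(wanted)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Certificate issued for the short host name only.
    List<String> dnsSans = Arrays.asList("hostname");

    // Name the operator supplied as the bind address.
    System.out.println(matchesAnySan("hostname", dnsSans)); // true
    // Name obtained later through a reverse lookup.
    System.out.println(matchesAnySan("hostname.domainname", dnsSans)); // false
  }
}
```

Keeping the bind address string that was originally supplied, instead of the reverse-lookup result, would make the first comparison the one that is performed.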





[jira] [Updated] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections

2021-04-09 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9135:
--
Issue Type: Bug  (was: Test)

> Remove reverse DNS lookup in Connection.java for accepted connections
> -
>
> Key: GEODE-9135
> URL: https://issues.apache.org/jira/browse/GEODE-9135
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> Prior to the introduction of SSLEngine use in the 
> org.apache.geode.internal.tcp package we used SSLSockets.  During a handshake 
> we would set the SNIHostName on the client side of the connection and have it 
> validate the hostname returned by the server side of the handshake.
> When we introduced SSLEngine we changed this to set the SNIHostName on both 
> sides.  We should revert this so that it only does it on the client side.
> The server side of the connection does not have a hostname for the client 
> side of the connection in this case and it is currently doing a reverse DNS 
> lookup to get the name.  That's a potentially expensive operation, and even 
> then we don't know whether to use the fully qualified domain name (FQDN) or a 
> simple host name.  This matters because endpoint verification requires that 
> the name we choose be presented in the certificate of the other server.  If 
> we choose the FQDN and the cert only has a simple host name the handshake 
> will fail.
> SSLEngine requires a host name when it's constructed but most algorithms 
> don't use it.  Documentation mentions Kerberos possibly needing it, so we'd 
> have to have a way for the reverse lookup to be enabled or find some other 
> way to get the host name, like SocketCreator.getHostName()'s reverse-lookup 
> cache.





[jira] [Assigned] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections

2021-04-09 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-9135:
-

Assignee: Bruce J Schuchardt

> Remove reverse DNS lookup in Connection.java for accepted connections
> -
>
> Key: GEODE-9135
> URL: https://issues.apache.org/jira/browse/GEODE-9135
> Project: Geode
>  Issue Type: Test
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> Prior to the introduction of SSLEngine use in the 
> org.apache.geode.internal.tcp package we used SSLSockets.  During a handshake 
> we would set the SNIHostName on the client side of the connection and have it 
> validate the hostname returned by the server side of the handshake.
> When we introduced SSLEngine we changed this to set the SNIHostName on both 
> sides.  We should revert this so that it only does it on the client side.
> The server side of the connection does not have a hostname for the client 
> side of the connection in this case and it is currently doing a reverse DNS 
> lookup to get the name.  That's a potentially expensive operation, and even 
> then we don't know whether to use the fully qualified domain name (FQDN) or a 
> simple host name.  This matters because endpoint verification requires that 
> the name we choose be presented in the certificate of the other server.  If 
> we choose the FQDN and the cert only has a simple host name the handshake 
> will fail.
> SSLEngine requires a host name when it's constructed but most algorithms 
> don't use it.  Documentation mentions Kerberos possibly needing it, so we'd 
> have to have a way for the reverse lookup to be enabled or find some other 
> way to get the host name, like SocketCreator.getHostName()'s reverse-lookup 
> cache.





[jira] [Created] (GEODE-9135) Remove reverse DNS lookup in Connection.java for accepted connections

2021-04-08 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9135:
-

 Summary: Remove reverse DNS lookup in Connection.java for accepted 
connections
 Key: GEODE-9135
 URL: https://issues.apache.org/jira/browse/GEODE-9135
 Project: Geode
  Issue Type: Test
  Components: membership
Reporter: Bruce J Schuchardt


Prior to the introduction of SSLEngine use in the org.apache.geode.internal.tcp 
package we used SSLSockets.  During a handshake we would set the SNIHostName on 
the client side of the connection and have it validate the hostname returned by 
the server side of the handshake.

When we introduced SSLEngine we changed this to set the SNIHostName on both 
sides.  We should revert this so that it only does it on the client side.

The server side of the connection does not have a hostname for the client side 
of the connection in this case and it is currently doing a reverse DNS lookup 
to get the name.  That's a potentially expensive operation, and even then we 
don't know whether to use the fully qualified domain name (FQDN) or a simple 
host name.  This matters because endpoint verification requires that the name 
we choose be presented in the certificate of the other server.  If we choose 
the FQDN and the cert only has a simple host name the handshake will fail.

SSLEngine requires a host name when it's constructed but most algorithms don't 
use it.  Documentation mentions Kerberos possibly needing it, so we'd have to 
have a way for the reverse lookup to be enabled or find some other way to get 
the host name, like SocketCreator.getHostName()'s reverse-lookup cache.
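
A minimal sketch of the proposed arrangement, using only standard javax.net.ssl calls. It is not the org.apache.geode.internal.tcp code; SniClientOnlySketch, clientEngine, serverEngine and the host name and port are hypothetical. The client side knows the name it connected to, so it sets the SNIHostName and enables endpoint identification. The accepting side uses the no-argument createSSLEngine(), which needs no peer host name and therefore no reverse lookup. The JDK documentation notes that some cipher suites, such as Kerberos, need the remote host name and should not use that factory method.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

/** Hypothetical sketch: SNI and endpoint identification on the connecting side only. */
class SniClientOnlySketch {

  /** Engine for the side that initiates the connection and knows the server's name. */
  static SSLEngine clientEngine(SSLContext context, String serverHost, int serverPort) {
    SSLEngine engine = context.createSSLEngine(serverHost, serverPort);
    engine.setUseClientMode(true);
    SSLParameters params = engine.getSSLParameters();
    params.setServerNames(
        Collections.<SNIServerName>singletonList(new SNIHostName(serverHost)));
    // Ask JSSE to check the server certificate against the name we connected to.
    params.setEndpointIdentificationAlgorithm("HTTPS");
    engine.setSSLParameters(params);
    return engine;
  }

  /** Engine for the accepting side: no peer host name, so no reverse DNS lookup. */
  static SSLEngine serverEngine(SSLContext context) {
    SSLEngine engine = context.createSSLEngine();
    engine.setUseClientMode(false);
    return engine;
  }

  public static void main(String[] args) throws NoSuchAlgorithmException {
    SSLContext context = SSLContext.getDefault();
    SSLEngine client = clientEngine(context, "server1.example.com", 41000);
    SSLEngine server = serverEngine(context);
    System.out.println("client SNI names: " + client.getSSLParameters().getServerNames());
    System.out.println("server in client mode: " + server.getUseClientMode());
  }
}
```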





[jira] [Created] (GEODE-9128) Remove host name look-up from JGAddress

2021-04-07 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9128:
-

 Summary: Remove host name look-up from JGAddress
 Key: GEODE-9128
 URL: https://issues.apache.org/jira/browse/GEODE-9128
 Project: Geode
  Issue Type: Test
  Components: membership
Reporter: Bruce J Schuchardt


The method JGAddress.toString() contains a host name lookup that should be 
removed.  It should just log the toString of its ip_addr field, not 
ip_addr.getHostName().  That method can cause a reverse-DNS lookup, which is 
needlessly expensive for a toString() operation.

{code:java}
  public String toString() {
    StringBuilder sb = new StringBuilder();

    if (ip_addr == null)
      sb.append("<null>");
    else {
      // getHostName() can trigger a reverse-DNS lookup
      sb.append(ip_addr.getHostName());
    }
    if (vmViewId >= 0) {
      sb.append("<v").append(vmViewId).append('>');
    }
    if (SHOW_UUIDS) {
      sb.append("(").append(toStringLong()).append(")");
    } else if (mostSigBits == 0 && leastSigBits == 0) {
      sb.append("(no uuid set)");
    }
    sb.append(":").append(port);
    return sb.toString();
  }
{code}
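
The change described above can be sketched in isolation as follows. This is an illustration rather than the actual JGAddress patch; AddressToStringSketch and describe are hypothetical names, and the sketch leaves out the view id and UUID parts. It relies on the documented behavior of InetAddress.toString(), which prints "hostname/literal-ip" and leaves the host name part empty when the address has not been resolved, instead of performing a reverse lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical sketch: format an address without calling getHostName(). */
class AddressToStringSketch {

  static String describe(InetAddress ipAddr, int port) {
    StringBuilder sb = new StringBuilder();
    if (ipAddr == null) {
      sb.append("<null>");
    } else {
      // Appending the InetAddress uses its toString(), which does not
      // perform a reverse name service lookup.
      sb.append(ipAddr);
    }
    sb.append(":").append(port);
    return sb.toString();
  }

  public static void main(String[] args) throws UnknownHostException {
    // getByAddress(byte[]) builds the address from raw bytes without a name service call.
    InetAddress addr = InetAddress.getByAddress(new byte[] {10, 0, 0, 5});
    System.out.println(describe(addr, 41000)); // /10.0.0.5:41000
  }
}
```

If the leading slash is unwanted in log output, ipAddr.getHostAddress() returns only the textual IP address and also avoids the lookup.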






[jira] [Resolved] (GEODE-8997) remove protobuf client server code

2021-04-02 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8997.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> remove protobuf client server code
> --
>
> Key: GEODE-8997
> URL: https://issues.apache.org/jira/browse/GEODE-8997
> Project: Geode
>  Issue Type: Improvement
>  Components: client/server
>Reporter: Darrel Schneider
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> The protobuf based client/server project is essentially dead but code for it 
> is still part of geode.
> This complicates the implementation. For example I was working on an 
> improvement to have the thread monitor detect stuck server connection threads 
> and found myself trying to figure out how to make this work for 
> ProtobufServerConnection.
> I think it would be best to remove the dead protobuf code. I'm not sure what 
> all of it is but here is what I have found so far:
> ProtobufServerConnection
> package org.apache.geode.internal.cache.client.protocol
> package org.apache.geode.internal.protocol.protobuf.v1





[jira] [Assigned] (GEODE-8997) remove protobuf client server code

2021-03-22 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-8997:
-

Assignee: Bruce J Schuchardt  (was: Bill Burcham)

> remove protobuf client server code
> --
>
> Key: GEODE-8997
> URL: https://issues.apache.org/jira/browse/GEODE-8997
> Project: Geode
>  Issue Type: Improvement
>  Components: client/server
>Reporter: Darrel Schneider
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> The protobuf based client/server project is essentially dead but code for it 
> is still part of geode.
> This complicates the implementation. For example I was working on an 
> improvement to have the thread monitor detect stuck server connection threads 
> and found myself trying to figure out how to make this work for 
> ProtobufServerConnection.
> I think it would be best to remove the dead protobuf code. I'm not sure what 
> all of it is but here is what I have found so far:
> ProtobufServerConnection
> package org.apache.geode.internal.cache.client.protocol
> package org.apache.geode.internal.protocol.protobuf.v1





[jira] [Updated] (GEODE-9011) (deleted)

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9011:
--
Priority: Trivial  (was: Major)

> (deleted)
> -
>
> Key: GEODE-9011
> URL: https://issues.apache.org/jira/browse/GEODE-9011
> Project: Geode
>  Issue Type: Test
>Reporter: Bruce J Schuchardt
>Priority: Trivial
>
> submitted to the wrong JIRA - sorry





[jira] [Updated] (GEODE-9011) (deleted)

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9011:
--
Component/s: (was: messaging)
 (was: membership)

> (deleted)
> -
>
> Key: GEODE-9011
> URL: https://issues.apache.org/jira/browse/GEODE-9011
> Project: Geode
>  Issue Type: Test
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> submitted to the wrong JIRA - sorry





[jira] [Updated] (GEODE-9011) (deleted)

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9011:
--
Summary: (deleted)  (was: hctKill.conf Error deserializing message causes 
hang)

> (deleted)
> -
>
> Key: GEODE-9011
> URL: https://issues.apache.org/jira/browse/GEODE-9011
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> A test was reported hung when it tried to shut down.  One server reported 
> this:
> {noformat}
> [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 
>  tid=0x90] 15 seconds have elapsed while 
> waiting for replies:  66 waiting for 2 replies from 
> [rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]>
>  on 
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007
>  whose current membership list is: 
> [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]]
> {noformat}
> and was stuck waiting for a reply in thread dumps
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> 

[jira] [Resolved] (GEODE-9011) hctKill.conf Error deserializing message causes hang

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-9011.
---
Resolution: Invalid

> hctKill.conf Error deserializing message causes hang
> 
>
> Key: GEODE-9011
> URL: https://issues.apache.org/jira/browse/GEODE-9011
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> A test was reported hung when it tried to shut down.  One server reported 
> this:
> {noformat}
> [warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 
>  tid=0x90] 15 seconds have elapsed while 
> waiting for replies:  66 waiting for 2 replies from 
> [rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]>
>  on 
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007
>  whose current membership list is: 
> [[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007,
>  
> rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]]
> {noformat}
> and was stuck waiting for a reply in thread dumps
> {noformat}
> "vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
> tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xf4f654f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>   at 
> org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
>   at 
> org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
>   at 
> org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
>   at 
> org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
>   at 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
>   - locked <0xf8022800> (a 
> org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
>   at 
> 

[jira] [Closed] (GEODE-9011) (deleted)

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt closed GEODE-9011.
-

> (deleted)
> -
>
> Key: GEODE-9011
> URL: https://issues.apache.org/jira/browse/GEODE-9011
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> submitted to the wrong JIRA - sorry



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-9011) (deleted)

2021-03-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-9011:
--
Description: submitted to the wrong JIRA - sorry  (was: A test was reported 
hung when it tried to shut down.  One server reported this:

{noformat}
[warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 
 tid=0x90] 15 seconds have elapsed while 
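waiting for replies:  66 waiting for 2 replies from 
[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,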
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]>
 on 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007
 whose current membership list is: 
[[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004,
 
rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007,
 
rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]]
{noformat}

and was stuck waiting for a reply in thread dumps

{noformat}
"vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xf4f654f8> (a 
java.util.concurrent.CountDownLatch$Sync)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at 
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
at 
org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
at 
org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
at 
org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
at 
org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
at 
org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
at 
org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
- locked <0xf8022800> (a 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
- locked <0xf5f7b888> (a java.lang.Object)
at 
org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
- locked <0xf7ef2980> (a 
org.apache.geode.internal.cache.CacheServerImpl)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2263)
- locked <0xf5a21a08> (a java.lang.Class for 

[jira] [Created] (GEODE-9011) hctKill.conf Error deserializing message causes hang

2021-03-08 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-9011:
-

 Summary: hctKill.conf Error deserializing message causes hang
 Key: GEODE-9011
 URL: https://issues.apache.org/jira/browse/GEODE-9011
 Project: Geode
  Issue Type: Test
  Components: membership, messaging
Reporter: Bruce J Schuchardt


A test was reported hung when it tried to shut down.  One server reported this:

{noformat}
[warn 2021/03/06 09:45:18.783 PST bridgegemfire_1_1_host1_6920 
 tid=0x90] 15 seconds have elapsed while 
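waiting for replies:  66 waiting for 2 replies from 
[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,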
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005]>
 on 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007
 whose current membership list is: 
[[rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_2_host1_7658:7658):41004,
 
rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_2_host1_13486:13486:locator):41003,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_3_host1_582:582):41006,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_4_host1_31258:31258):41005,
 
rs-FullRegression58615648a0i3large-hydra-client-18(bridgegemfire_1_1_host1_6920:6920):41007,
 
rs-FullRegression58615648a0i3large-hydra-client-18(locatorgemfire_1_1_host1_13950:13950:locator):41000]]
{noformat}

and was stuck waiting for a reply in thread dumps

{noformat}
"vm_0_thr_0_bridge_1_1_host1_6920" #144 daemon prio=5 os_prio=0 
tid=0x7fec70058800 nid=0x1d28 waiting on condition [0x7fec62063000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xf4f654f8> (a 
java.util.concurrent.CountDownLatch$Sync)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at 
org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:723)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:794)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:771)
at 
org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:857)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:779)
at 
org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:676)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
at 
org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
at 
org.apache.geode.internal.cache.DistributedRegion.distributeDestroyRegion(DistributedRegion.java:1865)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1844)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6180)
at 
org.apache.geode.internal.cache.HARegion.destroyRegion(HARegion.java:331)
at 
org.apache.geode.internal.cache.AbstractRegion.destroyRegion(AbstractRegion.java:476)
at 
org.apache.geode.internal.cache.ha.HARegionQueue.destroy(HARegionQueue.java:3438)
at 
org.apache.geode.internal.cache.ha.HARegionQueue$BlockingHARegionQueue.destroy(HARegionQueue.java:2272)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.destroyRQ(CacheClientProxy.java:1031)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:939)
at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.shutdown(CacheClientNotifier.java:1306)
- locked <0xf8022800> (a 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier)
at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.close(AcceptorImpl.java:1630)
- locked <0xf5f7b888> (a java.lang.Object)
at 
org.apache.geode.internal.cache.CacheServerImpl.stop(CacheServerImpl.java:491)
- locked <0xf7ef2980> (a 
org.apache.geode.internal.cache.CacheServerImpl)
at 
org.apache.geode.internal.cache.GemFireCacheImpl.stopServers(GemFireCacheImpl.java:2672)
at 

[jira] [Resolved] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest

2021-03-04 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8979.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> CI Failure: SSLSocketHostNameVerificationIntegrationTest
> 
>
> Key: GEODE-8979
> URL: https://issues.apache.org/jira/browse/GEODE-8979
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> This test failed in a CI IntegrationTest run with this exception:
> {noformat}
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
> nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
> FAILED
> org.apache.geode.GemFireIOException: exception closing SSL session
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
> at 
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)
> Caused by:
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
> ... 1 more
> {noformat}
> It looks like the test needs to have a try/catch for IOException when closing 
> the NioSslEngine.
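
The suggestion above could look roughly like the following. This is a minimal 
sketch, not the committed fix: QuietClose and closeIgnoringIOException are 
hypothetical names, and a plain java.io.Closeable stands in for the 
NioSslEngine used by the test. Note that in the trace the IOException surfaces 
wrapped in a GemFireIOException, so the real test may need to catch that type 
instead.

```java
import java.io.Closeable;
import java.io.IOException;

final class QuietClose {
  /**
   * Closes the resource and reports whether close() completed normally.
   * Hypothetical helper for test teardown; not part of Geode.
   */
  static boolean closeIgnoringIOException(Closeable resource) {
    try {
      resource.close();
      return true;
    } catch (IOException e) {
      // The peer may already have dropped the connection (for example
      // "Connection reset by peer"). During teardown this is treated as
      // harmless rather than as a test failure.
      return false;
    }
  }
}
```

Returning a boolean keeps the outcome visible to the caller, so a test can 
still log or assert on it if the close failure ever becomes interesting.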





[jira] [Commented] (GEODE-9000) NPE During Reconnect After Network Split

2021-03-04 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295377#comment-17295377
 ] 

Bruce J Schuchardt commented on GEODE-9000:
---

The server was reconnecting and emptying out messages queued during quorum 
checks:

{noformat}
logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.595 
GMT gemfire-cluster-server-0  tid=0x8c] Delivering 22 messages 
queued by quorum checker

logsAndStats/gemfire-cluster-server-0-02-01.log: [info 2021/03/04 10:30:28.596 
GMT gemfire-cluster-server-0  tid=0x8c] received suspect 
message from 10.4.2.34(:locator):41000 for 
10.4.3.19(gemfire-cluster-locator-0:1:locator):41000: Member isn't 
responding to heartbeat requests

[fatal 2021/03/04 10:30:28.596 GMT gemfire-cluster-server-0  
tid=0x8c] Unexpected exception while booting membership services
java.lang.NullPointerException
at 
org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
at 
org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
{noformat}

The network-partition message was delivered during this time and was likely 
intended for the previous Membership service.  Adding a check for "isJoined" or 
a null currentView and ignoring the message is probably the right way to fix 
this problem.
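
A guard of that kind might be sketched as follows. Everything here is an 
assumption made for illustration: PartitionMessageGuard, installView and the 
String-based view are hypothetical stand-ins, not the actual GMSJoinLeave 
fields or message types.

```java
import java.util.List;

final class PartitionMessageGuard {
  // Hypothetical stand-ins for membership state. Both are unset while the
  // service is still booting, which is the window in which the NPE occurred.
  private volatile List<String> currentView;
  private volatile boolean isJoined;

  /** Called once the member has joined and a view has been installed. */
  void installView(List<String> view) {
    currentView = view;
    isJoined = true;
  }

  /**
   * Returns true if the message was acted upon, false if it was ignored.
   * Messages that arrive before a view is installed are dropped, since they
   * were most likely meant for the previous membership service.
   */
  boolean processNetworkPartitionMessage(String sender) {
    List<String> view = currentView; // read once to avoid a racy second read
    if (!isJoined || view == null) {
      return false;
    }
    // Illustrative rule only: act on messages from members of the view.
    return view.contains(sender);
  }
}
```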

> NPE During Reconnect After Network Split
> 
>
> Key: GEODE-9000
> URL: https://issues.apache.org/jira/browse/GEODE-9000
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.14.0
>Reporter: Juan Ramos
>Priority: Major
>
> During a full network split, when all members get shut down by a partition, one 
> of the servers continually fails to reconnect due to a 
> {{NullPointerException}}. When using persistent regions, this also prevents 
> the remaining members from starting up correctly, as they might be waiting for 
> the stuck member to recover the latest data.
> The issue was introduced by the fix for GEODE-8901: the new 
> implementation of {{GMSJoinLeave.processNetworkPartitionMessage}} doesn't 
> have a {{currentView}} installed during the reconnect phase ({{getView() == 
> null}}), and the following is shown in the logs:
> {noformat}
> [fatal 2021/03/04 03:32:02.744 GMT gemfire-cluster-server-0  
> tid=0x8a] Unexpected exception while booting membership services
> java.lang.NullPointerException
>   at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processNetworkPartitionMessage(GMSJoinLeave.java:1459)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1343)
>   at 
> org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger.started(JGroupsMessenger.java:428)
>   at 
> org.apache.geode.distributed.internal.membership.gms.Services.start(Services.java:210)
>   at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.start(GMSMembership.java:1782)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.start(DistributionImpl.java:171)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.createDistribution(DistributionImpl.java:222)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:464)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.<init>(ClusterDistributionManager.java:497)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.create(ClusterDistributionManager.java:326)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.initialize(InternalDistributedSystem.java:779)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.access$200(InternalDistributedSystem.java:135)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem$Builder.build(InternalDistributedSystem.java:3034)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.connectInternal(InternalDistributedSystem.java:290)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2605)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2424)
>   at 
> org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1275)
>   at 
> 

[jira] [Resolved] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8999.
---
Resolution: Not A Problem

After talking with Darrel I think the thread was stuck.  It was reported during 
a network partition and the client had invalidated the server and was closing 
connections to it.

> When max-threads is specified for a cache server its reader threads may be 
> reported as Stuck
> 
>
> Key: GEODE-8999
> URL: https://issues.apache.org/jira/browse/GEODE-8999
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, membership
>Affects Versions: 1.14.0
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> We noticed this report of a stuck thread in a test that enabled max-threads 
> in a cache server:
> {noformat}
> [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822  
> tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> 
> has been stuck for <46.356 seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Executor Group 
> Monitored metric 
> Thread stack:
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> sun.nio.ch.IOUtil.read(IOUtil.java:192)
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
> org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237)
> org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859)
> org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698)
> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213)
> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229)
> org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816)
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777)
> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73)
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710)
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown
>  Source)
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
> org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown
>  Source)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> The cache server should suspend thread monitoring before reading from a 
> socket and resume monitoring afterward.  An example of this can be found in 
> org.apache.geode.internal.tcp.Connection.java.
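
The suspend/resume pattern proposed in the description might look like the 
sketch below. It is an illustration under stated assumptions: MonitoredRead 
and its nested Monitor class are hypothetical and do not reproduce Geode's 
thread monitoring API or the code in Connection.java.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;

final class MonitoredRead {
  /** Hypothetical monitor handle; the real monitoring API may differ. */
  static final class Monitor {
    private final AtomicBoolean suspended = new AtomicBoolean(false);

    void suspend() {
      suspended.set(true);
    }

    void resume() {
      suspended.set(false);
    }

    boolean isSuspended() {
      return suspended.get();
    }
  }

  /**
   * Reads one byte with monitoring suspended, so that a thread idling on a
   * quiet socket is not reported as stuck. The finally block resumes
   * monitoring even if the read throws.
   */
  static int readOneByte(InputStream in, Monitor monitor) throws IOException {
    monitor.suspend();
    try {
      return in.read();
    } finally {
      monitor.resume();
    }
  }
}
```

Resuming in a finally block matters here: if the read fails, the thread goes 
back to being monitored instead of staying invisible to the monitor.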





[jira] [Closed] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt closed GEODE-8999.
-

> When max-threads is specified for a cache server its reader threads may be 
> reported as Stuck
> 
>
> Key: GEODE-8999
> URL: https://issues.apache.org/jira/browse/GEODE-8999
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, membership
>Affects Versions: 1.14.0
>Reporter: Bruce J Schuchardt
>Priority: Major
>





[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8999:
--
Component/s: membership

> When max-threads is specified for a cache server its reader threads may be 
> reported as Stuck
> 
>
> Key: GEODE-8999
> URL: https://issues.apache.org/jira/browse/GEODE-8999
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, membership
>Affects Versions: 1.14.0
>Reporter: Bruce J Schuchardt
>Priority: Major
>





[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8999:
--
Issue Type: Bug  (was: Test)

> When max-threads is specified for a cache server its reader threads may be 
> reported as Stuck
> 
>
> Key: GEODE-8999
> URL: https://issues.apache.org/jira/browse/GEODE-8999
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Reporter: Bruce J Schuchardt
>Priority: Major
>





[jira] [Updated] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8999:
--
Affects Version/s: 1.14.0

> When max-threads is specified for a cache server its reader threads may be 
> reported as Stuck
> 
>
> Key: GEODE-8999
> URL: https://issues.apache.org/jira/browse/GEODE-8999
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.14.0
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> We noticed this report of a stuck thread in a test that enabled max-threads 
> in a cache server:
> {noformat}
> [warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822  
> tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> 
> has been stuck for <46.356 seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Executor Group 
> Monitored metric 
> Thread stack:
> sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> sun.nio.ch.IOUtil.read(IOUtil.java:192)
> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
> org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237)
> org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859)
> org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698)
> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213)
> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229)
> org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816)
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777)
> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73)
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710)
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown
>  Source)
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
> org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown
>  Source)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> The cache server should suspend thread monitoring before reading from a 
> socket and resume monitoring afterward.  An example of this can be found in 
> org.apache.geode.internal.tcp.Connection.java.





[jira] [Created] (GEODE-8999) When max-threads is specified for a cache server its reader threads may be reported as Stuck

2021-03-03 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8999:
-

 Summary: When max-threads is specified for a cache server its 
reader threads may be reported as Stuck
 Key: GEODE-8999
 URL: https://issues.apache.org/jira/browse/GEODE-8999
 Project: Geode
  Issue Type: Test
  Components: client/server
Reporter: Bruce J Schuchardt


We noticed this report of a stuck thread in a test that enabled max-threads in 
a cache server:

{noformat}
[warn 2021/03/02 19:54:31.041 PST bridgep2_host2_17822  
tid=0x1b] Thread <104> (0x68) that was executed at <02 Mar 2021 19:53:44 PST> 
has been stuck for <46.356 seconds> and number of thread monitor iteration <1>
Thread Name  state 
Executor Group 
Monitored metric 
Thread stack:
sun.nio.ch.FileDispatcherImpl.read0(Native Method)
sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
sun.nio.ch.IOUtil.read(IOUtil.java:192)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
org.apache.geode.internal.cache.tier.sockets.Message.readWrappedHeaders(Message.java:1237)
org.apache.geode.internal.cache.tier.sockets.Message.fetchHeader(Message.java:859)
org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:698)
org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1213)
org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1229)
org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:816)
org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:777)
org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:73)
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1185)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:710)
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$351/1357226696.invoke(Unknown
 Source)
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$88/1800187767.run(Unknown
 Source)
java.lang.Thread.run(Thread.java:748)
{noformat}

The cache server should suspend thread monitoring before reading from a socket 
and resume monitoring afterward.  An example of this can be found in 
org.apache.geode.internal.tcp.Connection.java.
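
A minimal sketch of the suspend/resume idea, for illustration only. MonitoredTask and MonitoredReader are hypothetical stand-ins and are not the Geode thread-monitoring API; the point is only that the blocking read is bracketed by suspend and resume, with resume in a finally block so monitoring comes back even if the read fails.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a per-thread monitoring record; not Geode's actual API. */
class MonitoredTask {
  private volatile long startNanos = System.nanoTime();
  private volatile boolean suspended;

  void suspendMonitoring() {
    suspended = true;
  }

  void resumeMonitoring() {
    // Restart the clock so time spent blocked in the read is not counted.
    startNanos = System.nanoTime();
    suspended = false;
  }

  /** What a monitor thread would ask on each iteration. */
  boolean isStuck(long nowNanos, long limitNanos) {
    return !suspended && nowNanos - startNanos > limitNanos;
  }
}

class MonitoredReader {
  /** Reads from the stream with monitoring suspended for the duration of the read. */
  static int readRequest(InputStream in, byte[] buffer, MonitoredTask task) throws IOException {
    task.suspendMonitoring();
    try {
      return in.read(buffer); // may block for as long as the client stays idle
    } finally {
      task.resumeMonitoring();
    }
  }
}
```

With this shape an idle reader thread waiting for the next client request is never reported, while a thread that takes too long after the read returns still is.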





[jira] [Commented] (GEODE-8997) remove protobuf client server code

2021-03-03 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294685#comment-17294685
 ] 

Bruce J Schuchardt commented on GEODE-8997:
---

also these subprojects:
geode-protobuf
geode-protobuf-messages
geode-experimental-driver

> remove protobuf client server code
> --
>
> Key: GEODE-8997
> URL: https://issues.apache.org/jira/browse/GEODE-8997
> Project: Geode
>  Issue Type: Improvement
>  Components: client/server
>Reporter: Darrel Schneider
>Priority: Major
>
> The protobuf-based client/server project is essentially dead but code for it 
> is still part of Geode.
> This complicates the implementation. For example I was working on an 
> improvement to have the thread monitor detect stuck server connection threads 
> and found myself trying to figure out how to make this work for 
> ProtobufServerConnection.
> I think it would be best to remove the dead protobuf code. I'm not sure what 
> all of it is but here is what I have found so far:
> ProtobufServerConnection
> package org.apache.geode.internal.cache.client.protocol
> package org.apache.geode.internal.protocol.protobuf.v1





[jira] [Assigned] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest

2021-03-02 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-8979:
-

Assignee: Bruce J Schuchardt

> CI Failure: SSLSocketHostNameVerificationIntegrationTest
> 
>
> Key: GEODE-8979
> URL: https://issues.apache.org/jira/browse/GEODE-8979
> Project: Geode
>  Issue Type: Test
>  Components: membership, messaging
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> This test failed in a CI IntegrationTest run with this exception:
> {noformat}
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
> nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
> FAILED
> org.apache.geode.GemFireIOException: exception closing SSL session
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
> at 
> org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)
> Caused by:
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:51)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
> at 
> org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
> ... 1 more
> {noformat}
> It looks like the test needs to have a try/catch for IOException when closing 
> the NioSslEngine.





[jira] [Resolved] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-03-02 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8963.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> A client's version is used for deserializing data received from the client 
> and for serializing data sent to the client. It is also used to locate the 
> map of Commands used to process client requests. Every time we cut a new 
> release we bump this version in KnownVersions and create a new map of 
> Commands, even though client/server communications protocols rarely change.
>  We should have each KnownVersion hold a client/server compatibility number 
> that is used to identify clients rather than the KnownVersion's ordinal.
> For instance,
> {code:java}
>   public static final KnownVersion GEODE_1_15_0 =
>       new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_15_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_16_0 =
>       new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_16_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_17_0 =
>       new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_17_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
> {code}
> In the above KnownVersions the client/server serialization is known to have 
> not changed since v1.15.0 and so there is no need to use a newer KnownVersion 
> for clients.
> Client handshake code will need to be changed to use the client/server 
> ordinal when identifying clients and servers.
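
A rough sketch of how a command lookup keyed on the client/server ordinal might look. SketchVersion and SketchCommandRegistry are hypothetical stand-ins (not KnownVersion or the real command tables), and the ordinal values used with them are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a version carrying two ordinals; not Geode's KnownVersion. */
final class SketchVersion {
  final String name;
  final int ordinal;             // server/server version
  final int clientServerOrdinal; // client/server version

  SketchVersion(String name, int ordinal, int clientServerOrdinal) {
    this.name = name;
    this.ordinal = ordinal;
    this.clientServerOrdinal = clientServerOrdinal;
  }
}

/** Keeps one command map per client/server ordinal instead of one per release. */
final class SketchCommandRegistry {
  private final Map<Integer, Map<Integer, String>> commandsByClientOrdinal = new HashMap<>();

  void register(int clientServerOrdinal, Map<Integer, String> commands) {
    commandsByClientOrdinal.put(clientServerOrdinal, commands);
  }

  /** Looks up commands by the client/server ordinal, ignoring the release ordinal. */
  Map<Integer, String> commandsFor(SketchVersion clientVersion) {
    return commandsByClientOrdinal.get(clientVersion.clientServerOrdinal);
  }
}
```

Under this arrangement a new release only needs a new command map when the client/server protocol actually changes; otherwise it reuses the map registered for the older client/server ordinal.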





[jira] [Created] (GEODE-8979) CI Failure: SSLSocketHostNameVerificationIntegrationTest

2021-02-26 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8979:
-

 Summary: CI Failure: SSLSocketHostNameVerificationIntegrationTest
 Key: GEODE-8979
 URL: https://issues.apache.org/jira/browse/GEODE-8979
 Project: Geode
  Issue Type: Test
  Components: membership, messaging
Reporter: Bruce J Schuchardt


This test failed in a CI IntegrationTest run with this exception:

{noformat}
org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest > 
nioHandshakeValidatesHostName[hasSAN=true and doEndPointIdentification=true] 
FAILED
org.apache.geode.GemFireIOException: exception closing SSL session
at 
org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:409)
at 
org.apache.geode.internal.net.SSLSocketHostNameVerificationIntegrationTest.lambda$startServerNIO$3(SSLSocketHostNameVerificationIntegrationTest.java:216)

Caused by:
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
at 
org.apache.geode.internal.net.NioSslEngine.close(NioSslEngine.java:403)
... 1 more
{noformat}

It looks like the test needs to have a try/catch for IOException when closing 
the NioSslEngine.
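
A sketch of the kind of tolerant close the test could use. TolerantClose is a hypothetical helper, and java.io.UncheckedIOException stands in here for a runtime exception that wraps an IOException (the trace above shows GemFireIOException playing that role); the exact exception types the test should catch are an assumption.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical test helper: closes a resource and tolerates I/O failures such as a peer reset. */
final class TolerantClose {
  /** Returns the I/O failure seen during close, or null if close succeeded. */
  static IOException closeIgnoringIo(Closeable resource) {
    try {
      resource.close();
      return null;
    } catch (IOException e) {
      return e; // the peer may already have dropped the connection
    } catch (UncheckedIOException e) {
      return e.getCause(); // runtime wrapper around the same kind of failure
    }
  }
}
```

Returning the exception instead of swallowing it lets the test log it or assert on it if the reset is unexpected in a particular scenario.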





[jira] [Updated] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods

2021-02-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8978:
--
Description: 
When we introduced the geode-serialization module we created new method 
signatures for toData and fromData in the form
{code:java}
void toData(DataOutput out, SerializationContext context) throws IOException;
{code}
and
{code:java}
void fromData(DataInput in, DeserializationContext context)
  throws IOException, ClassNotFoundException;
{code}

All DataSerializableFixedID classes were modified to use these signatures but 
many continue to use InternalDataSerializer and/or DataSerializer static 
methods to perform their work.  These should be changed to use the 
SerializationContext or DeserializationContext parameter along with 
StaticSerialization  whenever possible.

  was:
When we introduced the geode-serialization module we created new method 
signatures for toData and fromData in the form
{code:java}
void toData(DataOutput out, SerializationContext context) throws IOException;
{code}
and
{code:java}
void fromData(DataInput in, DeserializationContext context)
  throws IOException, ClassNotFoundException;
{code}

All DataSerializableFixedID classes were modified to use these signatures but 
many continue to use InternalDataSerializer and/or DataSerializer static 
methods to perform their work.  These should be changed to use the 
SerializationContext parameter and StaticSerialization  whenever possible.


> convert all DataSerializableFixedID classes to stop using 
> InternalDataSerializer's static methods
> -
>
> Key: GEODE-8978
> URL: https://issues.apache.org/jira/browse/GEODE-8978
> Project: Geode
>  Issue Type: Improvement
>  Components: membership, serialization
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> When we introduced the geode-serialization module we created new method 
> signatures for toData and fromData in the form
> {code:java}
> void toData(DataOutput out, SerializationContext context) throws IOException;
> {code}
> and
> {code:java}
> void fromData(DataInput in, DeserializationContext context)
>   throws IOException, ClassNotFoundException;
> {code}
> All DataSerializableFixedID classes were modified to use these signatures but 
> many continue to use InternalDataSerializer and/or DataSerializer static 
> methods to perform their work.  These should be changed to use the 
> SerializationContext or DeserializationContext parameter along with 
> StaticSerialization  whenever possible.





[jira] [Updated] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods

2021-02-26 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8978:
--
Issue Type: Improvement  (was: Bug)

> convert all DataSerializableFixedID classes to stop using 
> InternalDataSerializer's static methods
> -
>
> Key: GEODE-8978
> URL: https://issues.apache.org/jira/browse/GEODE-8978
> Project: Geode
>  Issue Type: Improvement
>  Components: membership, serialization
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> When we introduced the geode-serialization module we created new method 
> signatures for toData and fromData in the form
> {code:java}
> void toData(DataOutput out, SerializationContext context) throws IOException;
> {code}
> and
> {code:java}
> void fromData(DataInput in, DeserializationContext context)
>   throws IOException, ClassNotFoundException;
> {code}
> All DataSerializableFixedID classes were modified to use these signatures but 
> many continue to use InternalDataSerializer and/or DataSerializer static 
> methods to perform their work.  These should be changed to use the 
> SerializationContext parameter and StaticSerialization  whenever possible.





[jira] [Created] (GEODE-8978) convert all DataSerializableFixedID classes to stop using InternalDataSerializer's static methods

2021-02-26 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8978:
-

 Summary: convert all DataSerializableFixedID classes to stop using 
InternalDataSerializer's static methods
 Key: GEODE-8978
 URL: https://issues.apache.org/jira/browse/GEODE-8978
 Project: Geode
  Issue Type: Bug
  Components: membership, serialization
Reporter: Bruce J Schuchardt


When we introduced the geode-serialization module we created new method 
signatures for toData and fromData in the form
{code:java}
void toData(DataOutput out, SerializationContext context) throws IOException;
{code}
and
{code:java}
void fromData(DataInput in, DeserializationContext context)
  throws IOException, ClassNotFoundException;
{code}

All DataSerializableFixedID classes were modified to use these signatures but 
many continue to use InternalDataSerializer and/or DataSerializer static 
methods to perform their work.  These should be changed to use the 
SerializationContext parameter and StaticSerialization  whenever possible.
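
To illustrate the direction, here is a self-contained sketch. The Sketch* interfaces are hypothetical stand-ins for the context parameters and the real geode-serialization interfaces may differ; the intent is only to show object fields being written and read through the context passed in, with no call to a static serializer.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/** Hypothetical stand-ins for the context parameters; the real interfaces may differ. */
interface SketchObjectWriter {
  void writeObject(Object value, DataOutput out) throws IOException;
}

interface SketchObjectReader {
  Object readObject(DataInput in) throws IOException;
}

interface SketchSerializationContext {
  SketchObjectWriter getSerializer();
}

interface SketchDeserializationContext {
  SketchObjectReader getDeserializer();
}

/** A message whose toData/fromData depend only on their parameters. */
class SketchMessage {
  int processorId;
  Object payload;

  void toData(DataOutput out, SketchSerializationContext context) throws IOException {
    out.writeInt(processorId); // primitives go straight to the stream
    context.getSerializer().writeObject(payload, out); // objects go through the context
  }

  void fromData(DataInput in, SketchDeserializationContext context) throws IOException {
    processorId = in.readInt();
    payload = context.getDeserializer().readObject(in);
  }
}
```

A side benefit of this shape is that the class can be unit tested with a trivial context, without any cache or static serializer state.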





[jira] [Resolved] (GEODE-8948) log a locator's coordinates during launch

2021-02-25 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8948.
---
Resolution: Invalid

We are already logging the locator's coordinates in its log file, so this 
ticket isn't needed.

[info 2021/02/17 15:43:16.162 PST locatorgemfire_1_1_host1_47196 
 tid=0x17] Locator started on 
bruces-a01.fios-router.home[27985]

> log a locator's coordinates during launch
> -
>
> Key: GEODE-8948
> URL: https://issues.apache.org/jira/browse/GEODE-8948
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> Looking through a Locator's log file it is difficult, if not impossible, to 
> tell what the locator's host and port are.  This makes it difficult to know 
> which Locator log files to examine when a client (or WAN service) has 
> trouble contacting a Locator, because clients log only that locator's host 
> and port number.
> If Locators would log something like
>     Starting location services on {hostname} and port {portnumber}
> along with any other info that would be useful when grepping through 
> artifacts to find a log file of interest, it would help a lot in debugging 
> efforts.





[jira] [Closed] (GEODE-8948) log a locator's coordinates during launch

2021-02-25 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt closed GEODE-8948.
-

> log a locator's coordinates during launch
> -
>
> Key: GEODE-8948
> URL: https://issues.apache.org/jira/browse/GEODE-8948
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> Looking through a Locator's log file it is difficult, if not impossible, to 
> tell what the locator's host and port are.  This makes it difficult to know 
> which Locator log files to examine when a client (or WAN service) has 
> trouble contacting a Locator, because clients log only that locator's host 
> and port number.
> If Locators would log something like
>     Starting location services on {hostname} and port {portnumber}
> along with any other info that would be useful when grepping through 
> artifacts to find a log file of interest, it would help a lot in debugging 
> efforts.





[jira] [Updated] (GEODE-8972) remove shunnedMembers collection from GMSMembership

2021-02-24 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8972:
--
Issue Type: Improvement  (was: Bug)

> remove shunnedMembers collection from GMSMembership
> ---
>
> Key: GEODE-8972
> URL: https://issues.apache.org/jira/browse/GEODE-8972
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> GMSMembership has a _shunnedMembers_ collection that is used to track the IDs 
> of nodes that are no longer part of the cluster.  This collection is no 
> longer needed since we can tell if a node is old by comparing the view ID in 
> its identifier to that of the current view (called _latestView_ in that 
> class).  Checks like this are already in place in some parts of the code.
> All uses of _shunnedMembers_ should be replaced with this check.
> MembershipView view = latestView;
> boolean shunned = memberId.getVmViewId() <= view.getViewId() && 
> !view.contains(memberId);





[jira] [Created] (GEODE-8972) remove shunnedMembers collection from GMSMembership

2021-02-24 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8972:
-

 Summary: remove shunnedMembers collection from GMSMembership
 Key: GEODE-8972
 URL: https://issues.apache.org/jira/browse/GEODE-8972
 Project: Geode
  Issue Type: Bug
  Components: membership
Reporter: Bruce J Schuchardt


GMSMembership has a _shunnedMembers_ collection that is used to track the IDs 
of nodes that are no longer part of the cluster.  This collection is no longer 
needed since we can tell if a node is old by comparing the view ID in its 
identifier to that of the current view (called _latestView_ in that class).  
Checks like this are already in place in some parts of the code.

All uses of _shunnedMembers_ should be replaced with this check.

MembershipView view = latestView;
boolean shunned = memberId.getVmViewId() <= view.getViewId() && 
!view.contains(memberId);
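
The check above can be exercised in isolation with a small sketch. SketchMemberId, SketchView and ShunCheck are hypothetical stand-ins for the membership classes, kept just large enough to run the comparison; only the getVmViewId/getViewId/contains names are taken from the snippet above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a member identifier; records the view ID in which it joined. */
final class SketchMemberId {
  private final String name;
  private final int vmViewId;

  SketchMemberId(String name, int vmViewId) {
    this.name = name;
    this.vmViewId = vmViewId;
  }

  String getName() {
    return name;
  }

  int getVmViewId() {
    return vmViewId;
  }
}

/** Hypothetical stand-in for a membership view: a view ID plus the current members. */
final class SketchView {
  private final int viewId;
  private final Set<SketchMemberId> members;

  SketchView(int viewId, SketchMemberId... members) {
    this.viewId = viewId;
    this.members = new HashSet<>(Arrays.asList(members));
  }

  int getViewId() {
    return viewId;
  }

  boolean contains(SketchMemberId id) {
    return members.contains(id);
  }
}

final class ShunCheck {
  /** A member is treated as shunned if it joined at or before the current view but is no longer in it. */
  static boolean isShunned(SketchMemberId memberId, SketchView latestView) {
    SketchView view = latestView; // use one snapshot for both comparisons
    return memberId.getVmViewId() <= view.getViewId() && !view.contains(memberId);
  }
}
```

Note that a member whose view ID is newer than the current view is not treated as shunned by this check, since it may simply not have been installed in a view yet.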





[jira] [Commented] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-23 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289395#comment-17289395
 ] 

Bruce J Schuchardt commented on GEODE-8963:
---

bq. If we didn't bump the ordinal at 1.14.0 what would we do if we needed a 
protocol change in 1.13.2? Would we say "no" to that protocol change?

That is exactly the reason we do it.  I think we should be bumping it by more 
than 5.  We aren't going to run out of numbers if we bump it by 10 or 20.

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>  Labels: pull-request-available
>
> A client's version is used for deserializing data received from the client 
> and for serializing data sent to the client. It is also used to locate the 
> map of Commands used to process client requests. Every time we cut a new 
> release we bump this version in KnownVersions and create a new map of 
> Commands, even though client/server communications protocols rarely change.
>  We should have each KnownVersion hold a client/server compatibility number 
> that is used to identify clients rather than the KnownVersion's ordinal.
> For instance,
> {code:java}
>   public static final KnownVersion GEODE_1_15_0 =
>       new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_15_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_16_0 =
>       new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_16_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_17_0 =
>       new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_17_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
> {code}
> In the above KnownVersions the client/server serialization is known to have 
> not changed since v1.15.0 and so there is no need to use a newer KnownVersion 
> for clients.
> Client handshake code will need to be changed to use the client/server 
> ordinal when identifying clients and servers.





[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-23 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8963:
--
Priority: Minor  (was: Major)

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: membership, serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>
> A client's version is used for deserializing data received from the client 
> and for serializing data sent to the client. It is also used to locate the 
> map of Commands used to process client requests. Every time we cut a new 
> release we bump this version in KnownVersions and create a new map of 
> Commands, even though client/server communications protocols rarely change.
>  We should have each KnownVersion hold a client/server compatibility number 
> that is used to identify clients rather than the KnownVersion's ordinal.
> For instance,
> {code:java}
>   public static final KnownVersion GEODE_1_15_0 =
>       new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_15_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_16_0 =
>       new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_16_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_17_0 =
>       new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_17_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
> {code}
> In the above KnownVersions the client/server serialization is known to have 
> not changed since v1.15.0 and so there is no need to use a newer KnownVersion 
> for clients.
> Client handshake code will need to be changed to use the client/server 
> ordinal when identifying clients and servers.





[jira] [Assigned] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-23 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-8963:
-

Assignee: Bruce J Schuchardt

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: membership, serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> A client's version is used for deserializing data received from the client 
> and for serializing data sent to the client. It is also used to locate the 
> map of Commands used to process client requests. Every time we cut a new 
> release we bump this version in KnownVersions and create a new map of 
> Commands, even though client/server communications protocols rarely change.
>  We should have each KnownVersion hold a client/server compatibility number 
> that is used to identify clients rather than the KnownVersion's ordinal.
> For instance,
> {code:java}
>   public static final KnownVersion GEODE_1_15_0 =
>       new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_15_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_16_0 =
>       new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_16_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
>
>   public static final KnownVersion GEODE_1_17_0 =
>       new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
>           /* server/server version */ GEODE_1_17_0_ORDINAL,
>           /* client/server version */ GEODE_1_15_0_ORDINAL);
> {code}
> In the above KnownVersions the client/server serialization is known to have 
> not changed since v1.15.0 and so there is no need to use a newer KnownVersion 
> for clients.
> Client handshake code will need to be changed to use the client/server 
> ordinal when identifying clients and servers.





[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-23 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8963:
--
Component/s: (was: membership)

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Minor
>





[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-23 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8963:
--
Component/s: membership

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: membership, serialization
>Reporter: Bruce J Schuchardt
>Priority: Major
>





[jira] [Created] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-22 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8963:
-

 Summary: separate client/server compatibility from server/server 
version compatibility
 Key: GEODE-8963
 URL: https://issues.apache.org/jira/browse/GEODE-8963
 Project: Geode
  Issue Type: Bug
  Components: serialization
Reporter: Bruce J Schuchardt


A client's version is used for deserializing data received from the client and 
for serializing data sent to the client. It is also used to locate the map of 
Commands used to process client requests. Every time we cut a new release we 
bump this version in KnownVersions and create a new map of Commands, even 
though client/server communications protocols rarely change.
 We should have each KnownVersion hold a client/server compatibility number 
that is used to identify clients rather than the KnownVersion's ordinal.

For instance,
{code:java}
public static final KnownVersion GEODE_1_15_0 =
    new KnownVersion("GEODE", "1.15.0", (byte) 1, (byte) 15, (byte) 0, (byte) 0,
        /* server/server version */ GEODE_1_15_0_ORDINAL,
        /* client/server version */ GEODE_1_15_0_ORDINAL);

public static final KnownVersion GEODE_1_16_0 =
    new KnownVersion("GEODE", "1.16.0", (byte) 1, (byte) 16, (byte) 0, (byte) 0,
        /* server/server version */ GEODE_1_16_0_ORDINAL,
        /* client/server version */ GEODE_1_15_0_ORDINAL);

public static final KnownVersion GEODE_1_17_0 =
    new KnownVersion("GEODE", "1.17.0", (byte) 1, (byte) 17, (byte) 0, (byte) 0,
        /* server/server version */ GEODE_1_17_0_ORDINAL,
        /* client/server version */ GEODE_1_15_0_ORDINAL);
{code}

In the above KnownVersions the client/server serialization is known to have not 
changed since v1.15.0 and so there is no need to use a newer KnownVersion for 
clients.

Client handshake code will need to be changed to use the client/server ordinal 
when identifying clients and servers.
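
The proposal can be pictured with a minimal sketch. `VersionSketch`, `registerCommands`, and the ordinal values used with it are hypothetical stand-ins, not Geode's actual KnownVersion API; the sketch only shows that several releases can share one client/server ordinal and therefore one map of Commands.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for KnownVersion; names and ordinals are illustrative only.
final class VersionSketch {
  // One command map per client/server ordinal rather than one per release.
  private static final Map<Integer, Map<Integer, String>> COMMANDS = new HashMap<>();

  final String name;
  final int ordinal;             // server/server version
  final int clientServerOrdinal; // client/server compatibility version

  VersionSketch(String name, int ordinal, int clientServerOrdinal) {
    this.name = name;
    this.ordinal = ordinal;
    this.clientServerOrdinal = clientServerOrdinal;
  }

  static void registerCommands(int clientServerOrdinal, Map<Integer, String> commands) {
    COMMANDS.put(clientServerOrdinal, commands);
  }

  // Looks up commands by the client/server ordinal, not by the release ordinal.
  Map<Integer, String> commands() {
    return COMMANDS.get(clientServerOrdinal);
  }
}
```

With this shape, cutting a new release adds a new server/server ordinal but reuses the existing command map until the client/server protocol actually changes.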



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-8963) separate client/server compatibility from server/server version compatibility

2021-02-22 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8963:
--
Issue Type: Improvement  (was: Bug)

> separate client/server compatibility from server/server version compatibility
> -
>
> Key: GEODE-8963
> URL: https://issues.apache.org/jira/browse/GEODE-8963
> Project: Geode
>  Issue Type: Improvement
>  Components: serialization
>Reporter: Bruce J Schuchardt
>Priority: Major
>





[jira] [Created] (GEODE-8956) LocatorMembershipListenerImpl has unconstrained thread creation that can crash a machine

2021-02-19 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8956:
-

 Summary: LocatorMembershipListenerImpl has unconstrained thread 
creation that can crash a machine
 Key: GEODE-8956
 URL: https://issues.apache.org/jira/browse/GEODE-8956
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: Bruce J Schuchardt


In reviewing PR 6013 I found that a simple change meant to resolve a difficult 
problem led to unrestrained thread growth in a locator, sometimes topping out 
at over 5000 threads, which often crashed the host machine.  The thread growth 
was due to this method in LocatorMembershipListenerImpl:
{code:java}
Thread buildLocatorsDistributorThread(DistributionLocatorId localLocatorId,
    Map<Integer, Set<DistributionLocatorId>> remoteLocators,
    DistributionLocatorId joiningLocator,
    int joiningLocatorDistributedSystemId) {
  Runnable distributeLocatorsRunnable =
      new DistributeLocatorsRunnable(config.getMemberTimeout(), tcpClient, localLocatorId,
          remoteLocators, joiningLocator, joiningLocatorDistributedSystemId);
  // A new factory, and a new thread, is created on every call.
  ThreadFactory threadFactory =
      new LoggingThreadFactory(LOCATORS_DISTRIBUTOR_THREAD_NAME, true);

  return threadFactory.newThread(distributeLocatorsRunnable);
}
{code}

This should probably be performed in an Executor with a reasonable max-threads 
limit based on the number of local and remote-locators in the 
DistributionConfig.
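
As a rough illustration of the suggestion, here is a minimal sketch of a bounded executor built only from `java.util.concurrent`. `BoundedDistributor` and `newBoundedExecutor` are hypothetical names, and how `maxThreads` would be derived from the DistributionConfig is left open.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of LocatorMembershipListenerImpl.
final class BoundedDistributor {
  // Creates a pool that never runs more than maxThreads tasks at once;
  // additional tasks wait in the queue instead of spawning new threads.
  static ExecutorService newBoundedExecutor(int maxThreads) {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        maxThreads, maxThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    executor.allowCoreThreadTimeOut(true); // let idle threads exit
    return executor;
  }
}
```

Each distribution task would then be submitted to the shared executor rather than started on a thread of its own, so a burst of joining locators queues work instead of creating thousands of threads.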







[jira] [Updated] (GEODE-8955) WAN location service uses DistributedLocatorId.toString() to represent a locator

2021-02-19 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8955:
--
Priority: Minor  (was: Major)

> WAN location service uses DistributedLocatorId.toString() to represent a 
> locator
> 
>
> Key: GEODE-8955
> URL: https://issues.apache.org/jira/browse/GEODE-8955
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Reporter: Bruce J Schuchardt
>Priority: Minor
>
> This code in LocatorHelper, and probably code in other parts of the WAN 
> location service, uses DistributionLocatorId.toString() to track whether 
> other locators have the WAN location service available.  It should use the 
> DistributionLocatorId.marshal() method instead.  We should never use the 
> toString() representation of an object in this way as it may change over time.
>  
> {code:java}
> private static void addServerLocator(Integer distributedSystemId,
>     LocatorMembershipListener locatorListener, DistributionLocatorId locator) {
>   ConcurrentHashMap<Integer, Set<String>> allServerLocatorsInfo =
>       (ConcurrentHashMap<Integer, Set<String>>) locatorListener.getAllServerLocatorsInfo();
>
>   Set<String> locatorsSet = new CopyOnWriteHashSet<>();
>   locatorsSet.add(locator.toString());
>   Set<String> existingValue =
>       allServerLocatorsInfo.putIfAbsent(distributedSystemId, locatorsSet);
>   if (existingValue != null) {
>     if (!existingValue.contains(locator.toString())) {
>       existingValue.add(locator.toString());
>     }
>   }
> }
> {code}





[jira] [Created] (GEODE-8955) WAN location service uses DistributedLocatorId.toString() to represent a locator

2021-02-19 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8955:
-

 Summary: WAN location service uses DistributedLocatorId.toString() 
to represent a locator
 Key: GEODE-8955
 URL: https://issues.apache.org/jira/browse/GEODE-8955
 Project: Geode
  Issue Type: Improvement
  Components: wan
Reporter: Bruce J Schuchardt


This code in LocatorHelper, and probably code in other parts of the WAN 
location service, uses DistributionLocatorId.toString() to track whether other 
locators have the WAN location service available.  It should use the 
DistributionLocatorId.marshal() method instead.  We should never use the 
toString() representation of an object in this way as it may change over time.

 
{code:java}
private static void addServerLocator(Integer distributedSystemId,
    LocatorMembershipListener locatorListener, DistributionLocatorId locator) {
  ConcurrentHashMap<Integer, Set<String>> allServerLocatorsInfo =
      (ConcurrentHashMap<Integer, Set<String>>) locatorListener.getAllServerLocatorsInfo();

  Set<String> locatorsSet = new CopyOnWriteHashSet<>();
  locatorsSet.add(locator.toString());
  Set<String> existingValue =
      allServerLocatorsInfo.putIfAbsent(distributedSystemId, locatorsSet);
  if (existingValue != null) {
    if (!existingValue.contains(locator.toString())) {
      existingValue.add(locator.toString());
    }
  }
}
{code}
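
A minimal sketch of the difference, using a hypothetical `LocatorKey` class in place of DistributionLocatorId. The `host[port]` string returned by `marshal()` here is an assumed format for illustration, not necessarily what Geode produces.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DistributionLocatorId; the marshal() format is assumed.
final class LocatorKey {
  private final String host;
  private final int port;

  LocatorKey(String host, int port) {
    this.host = host;
    this.port = port;
  }

  // Stable form intended for storage and comparison.
  String marshal() {
    return host + "[" + port + "]";
  }

  // Debug form; free to change between releases, so never used as a key.
  @Override
  public String toString() {
    return "LocatorKey(host=" + host + ", port=" + port + ")";
  }

  // Records the locator under its distributed system id; returns true if it was new.
  static boolean addServerLocator(ConcurrentHashMap<Integer, Set<String>> info,
      int distributedSystemId, LocatorKey locator) {
    return info
        .computeIfAbsent(distributedSystemId, id -> ConcurrentHashMap.<String>newKeySet())
        .add(locator.marshal());
  }
}
```

Because the stored string comes from `marshal()`, a later change to the debug output of `toString()` cannot silently break lookups in the map.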





[jira] [Resolved] (GEODE-8922) Remove ProductUseLog

2021-02-18 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8922.
---
Fix Version/s: 1.15.0
   Resolution: Fixed

> Remove ProductUseLog
> 
>
> Key: GEODE-8922
> URL: https://issues.apache.org/jira/browse/GEODE-8922
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> A Locator logs the number of servers present in the cluster to a file that's 
> of little use to anyone.  The log was added long ago in a weird attempt to 
> monitor whether users were adhering to their license contract.  We should 
> remove ProductUseLog and its tests.





[jira] [Updated] (GEODE-8951) Unnecessary messaging in WAN locator discovery

2021-02-17 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8951:
--
Priority: Minor  (was: Major)

> Unnecessary messaging in WAN locator discovery
> --
>
> Key: GEODE-8951
> URL: https://issues.apache.org/jira/browse/GEODE-8951
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Affects Versions: 1.15.0
>Reporter: Bruce J Schuchardt
>Priority: Minor
>
> While debugging another issue I noticed that a locator was trying to send a 
> notice to another locator in its cluster telling it that the recipient had 
> joined.
>  
> [warn 2021/02/16 15:16:56.195 PST locatorgemfire_4_3_host2_9736 
>  tid=0x153] Locator Membership listener 
> permanently failed to exchange locator information 
> *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-1:27878* with 
> *rs-GEM-3188-VJ1459-1a0i3large-hydra-client-2:28778* after 3 retry attempts
>  
> This messaging is unnecessary.  The locator that this message was being sent 
> to already knows about itself.   This is being done in 
> _DistributeLocatorsRunnable.run()._ 
>  
> {code:java}
> for (DistributionLocatorId remoteLocator : entry.getValue()) {
>   // Notify known remote locator about the advertised locator.
>   LocatorJoinMessage advertiseNewLocatorMessage = new LocatorJoinMessage(
>       joiningLocatorDistributedSystemId, joiningLocator, localLocatorId, "");
>   sendMessage(remoteLocator, advertiseNewLocatorMessage, failedMessages);
>
>   // Notify the advertised locator about remote known locator.
>   LocatorJoinMessage advertiseKnownLocatorMessage =
>       new LocatorJoinMessage(entry.getKey(), remoteLocator, localLocatorId, "");
>   sendMessage(joiningLocator, advertiseKnownLocatorMessage, failedMessages);
> }
> {code}
> It should check to see if the joiningLocator ID is equal to the remoteLocator 
> ID and, if so, not create messages in that iteration.





[jira] [Created] (GEODE-8951) Unnecessary messaging in WAN locator discovery

2021-02-17 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8951:
-

 Summary: Unnecessary messaging in WAN locator discovery
 Key: GEODE-8951
 URL: https://issues.apache.org/jira/browse/GEODE-8951
 Project: Geode
  Issue Type: Improvement
  Components: wan
Affects Versions: 1.15.0
Reporter: Bruce J Schuchardt


While debugging another issue I noticed that a locator was trying to send a 
notice to another locator in its cluster telling it that the recipient had 
joined.

 

[warn 2021/02/16 15:16:56.195 PST locatorgemfire_4_3_host2_9736 
 tid=0x153] Locator Membership listener permanently 
failed to exchange locator information 
*rs-GEM-3188-VJ1459-1a0i3large-hydra-client-1:27878* with 
*rs-GEM-3188-VJ1459-1a0i3large-hydra-client-2:28778* after 3 retry attempts

 

This messaging is unnecessary.  The locator that this message was being sent to 
already knows about itself.   This is being done in 
_DistributeLocatorsRunnable.run()._ 

 
{code:java}
for (DistributionLocatorId remoteLocator : entry.getValue()) {
  // Notify known remote locator about the advertised locator.
  LocatorJoinMessage advertiseNewLocatorMessage = new LocatorJoinMessage(
      joiningLocatorDistributedSystemId, joiningLocator, localLocatorId, "");
  sendMessage(remoteLocator, advertiseNewLocatorMessage, failedMessages);

  // Notify the advertised locator about remote known locator.
  LocatorJoinMessage advertiseKnownLocatorMessage =
      new LocatorJoinMessage(entry.getKey(), remoteLocator, localLocatorId, "");
  sendMessage(joiningLocator, advertiseKnownLocatorMessage, failedMessages);
}
{code}
It should check to see if the joiningLocator ID is equal to the remoteLocator 
ID and, if so, not create messages in that iteration.
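
The suggested check could look roughly like the sketch below. It is self-contained and uses plain strings in place of DistributionLocatorId and LocatorJoinMessage; `JoinNotifier` and `planMessages` are hypothetical names, and the returned strings only describe which messages would be sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of the loop in DistributeLocatorsRunnable.run().
final class JoinNotifier {
  // Returns "recipient<-subject" pairs describing the messages that would be sent.
  static List<String> planMessages(Map<Integer, Set<String>> remoteLocators,
      String joiningLocator) {
    List<String> planned = new ArrayList<>();
    for (Map.Entry<Integer, Set<String>> entry : remoteLocators.entrySet()) {
      for (String remoteLocator : entry.getValue()) {
        if (remoteLocator.equals(joiningLocator)) {
          continue; // the joining locator already knows about itself
        }
        // Notify the known remote locator about the joining locator.
        planned.add(remoteLocator + "<-" + joiningLocator);
        // Notify the joining locator about the known remote locator.
        planned.add(joiningLocator + "<-" + remoteLocator);
      }
    }
    return planned;
  }
}
```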





[jira] [Commented] (GEODE-8030) CI Failure: HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR

2021-02-16 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285399#comment-17285399
 ] 

Bruce J Schuchardt commented on GEODE-8030:
---

Failed in the same way in this DistributedTestOpenJDK8 run:

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/24

> CI Failure: 
> HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR
> --
>
> Key: GEODE-8030
> URL: https://issues.apache.org/jira/browse/GEODE-8030
> Project: Geode
>  Issue Type: Bug
>  Components: client queues, tests
>Reporter: Kirk Lund
>Priority: Major
>  Labels: flaky
>
> Link:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.13.0-SNAPSHOT.0220/test-results/distributedTest/1587763613/classes/org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.html#testHAEventWrapperDoesNotHoldCUMOnceInsideCMR
> Partial stack:
> {noformat}
> Caused by: org.junit.ComparisonFailure: expected: but 
> was:  bytes;threadID=3;sequenceID=14];shouldConflate=false;versionTag={v1; rv6; 
> time=1587758285078; remote};hasCqs=false]>
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at 
> org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.verifyNullCUMReference(HARQueueNewImplDUnitTest.java:855)
>   at 
> org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.lambda$testHAEventWrapperDoesNotHoldCUMOnceInsideCMR$bb17a952$4(HARQueueNewImplDUnitTest.java:683)
> {noformat}
> Full stack:
> {noformat}
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest$$Lambda$260/1730948285.run
>  in VM 1 running on Host c8674217ee1c with 4 VMs
>   at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610)
>   at org.apache.geode.test.dunit.VM.invoke(VM.java:437)
>   at 
> org.apache.geode.internal.cache.ha.HARQueueNewImplDUnitTest.testHAEventWrapperDoesNotHoldCUMOnceInsideCMR(HARQueueNewImplDUnitTest.java:683)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at 
> 

[jira] [Created] (GEODE-8948) log a locator's coordinates during launch

2021-02-16 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8948:
-

 Summary: log a locator's coordinates during launch
 Key: GEODE-8948
 URL: https://issues.apache.org/jira/browse/GEODE-8948
 Project: Geode
  Issue Type: Improvement
  Components: membership
Reporter: Bruce J Schuchardt


Looking through a Locator's log file it is difficult, if not impossible, to 
tell what the locator's host and port are.  This makes it difficult to know 
which Locator log files to examine when a client (or WAN service) has trouble 
contacting a Locator, because the client logs only that locator's host and 
port number.

If Locators logged something like

    Starting location services on \{hostname} and port \{portnumber}

along with any other info that would be useful when grepping through artifacts 
to find a log file of interest, it would help a lot in debugging efforts.
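
A minimal sketch of producing such a line, kept separate from any logging framework. `LocatorStartupMessage` is a hypothetical helper, and the wording simply mirrors the example above.

```java
// Hypothetical helper; builds the startup line so it is easy to grep for.
final class LocatorStartupMessage {
  static String format(String hostname, int port) {
    return String.format("Starting location services on %s and port %d", hostname, port);
  }
}
```

Keeping the host and port in one fixed-format line means a search for the host and port reported by a failing client lands directly on the matching Locator log.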





[jira] [Assigned] (GEODE-8922) Remove ProductUseLog

2021-02-10 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reassigned GEODE-8922:
-

Assignee: Bruce J Schuchardt

> Remove ProductUseLog
> 
>
> Key: GEODE-8922
> URL: https://issues.apache.org/jira/browse/GEODE-8922
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>
> A Locator logs the number of servers present in the cluster to a file that's 
> of little use to anyone.  The log was added long ago in a weird attempt to 
> monitor whether users were adhering to their license contract.  We should 
> remove ProductUseLog and its tests.





[jira] [Resolved] (GEODE-8817) server hangs in cache close with ssl enabled due to active client connection; client side (CacheClientUpdater.close()) is hung in SSLSocketImpl$AppInputStream.deplete()

2021-02-09 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8817.
---
Fix Version/s: 1.14.0
   Resolution: Fixed

Client-side closing of sockets has not been altered but with server-side 
changes we are no longer seeing hangs.  Open a new ticket if this kind of hang 
is seen again.

> server hangs in cache close with ssl enabled due to active client connection; 
> client side (CacheClientUpdater.close()) is hung in 
> SSLSocketImpl$AppInputStream.deplete()
> 
>
> Key: GEODE-8817
> URL: https://issues.apache.org/jira/browse/GEODE-8817
> Project: Geode
>  Issue Type: Bug
>  Components: client/server, security
>Affects Versions: 1.14.0
>Reporter: Bill Burcham
>Assignee: Bill Burcham
>Priority: Major
>  Labels: blocks-1.14.0​, pull-request-available
> Fix For: 1.14.0
>
>
> A proprietary TLS/SSL-enabled application encountered a network partition. A 
> server hangs in cache close due to active client connection; client side 
> ({{CacheClientUpdater.close()}}) is hung in 
> {{SSLSocketImpl$AppInputStream.deplete()}}
> The configuration is:
> {noformat}
> ==
> losingSide  |survivingSide
> ==
> 0   |10627
> 5   |10632
> --
> 11139   |10655
> |10662
> --
> {noformat}
> The threads were stuck in Sun's SSL code. Geode's client/server framework 
> uses old (blocking) I/O, which was also part of where they were stuck. If the 
> clients had closed their connections to the server, it would not have been 
> stuck here. But server shutdown shouldn't hang because of a client that 
> refuses to disconnect.
> The Geode client-side of the connection is hung here:
> {code:java}
> \[warn 2020/11/06 14:56:56.577 PST  tid=0x18] Thread <50> 
> (0x32) that was executed at <06 Nov 2020 14:55:43 PST> has been stuck for 
> <72.81 seconds> and number of thread monitor iteration <1>
> Thread Name  state 
> Waiting on 
> Owned By  10.32.108.224(bridgep2_host2_10627:10627):41003 port 27636> with ID <43>
> Executor Group 
> Monitored metric 
> Thread stack:
> sun.security.ssl.SSLSocketImpl$AppInputStream.deplete(SSLSocketImpl.java:1016)
> sun.security.ssl.SSLSocketImpl$AppInputStream.access$100(SSLSocketImpl.java:816)
> sun.security.ssl.SSLSocketImpl.bruteForceCloseInput(SSLSocketImpl.java:702)
> sun.security.ssl.SSLSocketImpl.duplexCloseOutput(SSLSocketImpl.java:553)
> sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:485)
> org.apache.geode.internal.cache.tier.sockets.CacheClientUpdater.close(CacheClientUpdater.java:546)
> org.apache.geode.cache.client.internal.QueueConnectionImpl.internalDestroy(QueueConnectionImpl.java:112)
> org.apache.geode.cache.client.internal.QueueManagerImpl.endpointCrashed(QueueManagerImpl.java:379)
> org.apache.geode.cache.client.internal.QueueManagerImpl.connectionCrashed(QueueManagerImpl.java:357)
> org.apache.geode.cache.client.internal.QueueConnectionImpl.destroy(QueueConnectionImpl.java:88)
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:645)
> org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:504)
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:334)
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:303)
> org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:839)
> org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:38)
> org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:90)
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1329)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:279)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> Lock owner thread stack
> java.net.SocketInputStream.socketRead0(Native Method)
> java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> java.net.SocketInputStream.read(SocketInputStream.java:171)
> 

[jira] [Resolved] (GEODE-8195) ConcurrentModificationException from LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run

2021-02-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8195.
---

I accidentally reopened this ticket

> ConcurrentModificationException from 
> LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run
> -
>
> Key: GEODE-8195
> URL: https://issues.apache.org/jira/browse/GEODE-8195
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Reporter: Bill Burcham
>Assignee: Bruce J Schuchardt
>Priority: Major
> Fix For: 1.12.1, 1.14.0, 1.13.0
>
>
> This WAN code in 
> {{LocatorMembershipListenerImpl$DistributeLocatorsRunnable.run}}:
> {code}
> Set<LocatorJoinMessage> joinMessages = entry.getValue();
> for (LocatorJoinMessage locatorJoinMessage : joinMessages) {
>   if (retryMessage(targetLocator, locatorJoinMessage, attempt)) {
>     joinMessages.remove(locatorJoinMessage);
>   } else {
> {code}
> modifies the {{joinMessages}} set as it is iterating over the set, resulting 
> in a {{ConcurrentModificationException}}.
> This bug will cause (inter-site) notification of locators (of the presence of 
> a new locator) to fail early if retry is necessary. If we have to retry 
> notifying any locator, and we succeed, we’ll throw the 
> {{ConcurrentModificationException}} and stop trying to notify any of the 
> other locators. See the _Discovery For Multi-Site Systems_ section of the 
> [Overview of Multi-Site 
> Caching|https://geode.apache.org/docs/guide/14/topologies_and_comm/topology_concepts/multisite_overview.html]
>  documentation for an overview of the locator's role in WAN.
> Here is a scratch file that illustrates the issue, throwing 
> {{ConcurrentModificationException}}:
> {code}
> import java.util.HashSet;
> import java.util.Set;
> class Scratch {
>   public static void main(String[] args) {
>     final Set<String> joinMessages = new HashSet<>();
>     joinMessages.add("one");
>     joinMessages.add("two");
>     // Removing from the set inside the for-each loop invalidates its iterator.
>     for (final String entry : joinMessages) {
>       if (entry.equals("one")) {
>         joinMessages.remove(entry);
>       }
>     }
>   }
> }
> {code}
> From looking at the Geode code, {{joinMessages}} is not used outside the loop 
> so there is no need to modify it at all—I think we can simply remove this 
> line:
> {code}
> joinMessages.remove(locatorJoinMessage);
> {code}
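
If entries ever did need to be dropped while the loop is running, the removal would have to go through the iterator rather than the set. The following is a minimal sketch of that alternative, assuming a plain HashSet of strings in place of the real set of LocatorJoinMessage objects. The class and method names (SafeRemovalSketch, removeViaIterator, removeViaRemoveIf) are illustrative only and are not taken from the Geode source.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SafeRemovalSketch {
  // Removes the matching element through the iterator itself, so the
  // iterator stays consistent with the set and no
  // ConcurrentModificationException is raised.
  static void removeViaIterator(Set<String> messages, String delivered) {
    for (Iterator<String> it = messages.iterator(); it.hasNext();) {
      if (it.next().equals(delivered)) {
        it.remove();
      }
    }
  }

  // Same effect using Collection.removeIf (Java 8 and later).
  static void removeViaRemoveIf(Set<String> messages, String delivered) {
    messages.removeIf(message -> message.equals(delivered));
  }

  public static void main(String[] args) {
    Set<String> joinMessages = new HashSet<>();
    joinMessages.add("one");
    joinMessages.add("two");
    removeViaIterator(joinMessages, "one");
    removeViaRemoveIf(joinMessages, "two");
    System.out.println(joinMessages.isEmpty());
  }
}
```

Since the ticket notes that {{joinMessages}} is not used after the loop, deleting the remove call remains the simpler change; the sketch only shows what a safe removal would look like if one were needed.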



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy

2021-02-05 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279901#comment-17279901
 ] 

Bruce J Schuchardt commented on GEODE-8825:
---

Test failed in this PR run: https://concourse.apachegeode-ci.info/builds/724

> CI failure: GatewayReceiverMBeanDUnitTest > 
> testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
> 
>
> Key: GEODE-8825
> URL: https://issues.apache.org/jira/browse/GEODE-8825
> Project: Geode
>  Issue Type: Bug
>  Components: tests, wan
>Reporter: Jianxia Chen
>Priority: Major
>  Labels: flaky
>
> {code:java}
> org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest > 
> testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest$$Lambda$202/0x0001008f0c40.run
>  in VM 0 running on Host c3e48bdac460 with 4 VMs
> at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:623)
> at org.apache.geode.test.dunit.VM.invoke(VM.java:447)
> at 
> org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy(GatewayReceiverMBeanDUnitTest.java:76)
> Caused by:
> java.lang.AssertionError: expected null, but was: GemFire:service=GatewayReceiver,type=Member,member=172.17.0.18(183)-41002>
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotNull(Assert.java:756)
> at org.junit.Assert.assertNull(Assert.java:738)
> at org.junit.Assert.assertNull(Assert.java:748)
> at 
> org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.verifyMBeanProxiesDoesNotExist(GatewayReceiverMBeanDUnitTest.java:106)
> at 
> org.apache.geode.internal.cache.wan.GatewayReceiverMBeanDUnitTest.lambda$testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy$bb17a952$3(GatewayReceiverMBeanDUnitTest.java:76)
>  {code}
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/704
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-results/distributedTest/1610390301/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0601/test-artifacts/1610390301/distributedtestfiles-OpenJDK11-1.14.0-build.0601.tgz





[jira] [Updated] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy

2021-02-05 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8825:
--
Labels: flaky  (was: )

> CI failure: GatewayReceiverMBeanDUnitTest > 
> testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
> 
>
> Key: GEODE-8825
> URL: https://issues.apache.org/jira/browse/GEODE-8825
> Project: Geode
>  Issue Type: Bug
>  Components: tests, wan
>Reporter: Jianxia Chen
>Priority: Major
>  Labels: flaky
>





[jira] [Updated] (GEODE-8825) CI failure: GatewayReceiverMBeanDUnitTest > testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy

2021-02-05 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8825:
--
Component/s: wan
 tests

> CI failure: GatewayReceiverMBeanDUnitTest > 
> testMBeanAndProxiesForGatewayReceiverAreRemovedOnDestroy
> 
>
> Key: GEODE-8825
> URL: https://issues.apache.org/jira/browse/GEODE-8825
> Project: Geode
>  Issue Type: Bug
>  Components: tests, wan
>Reporter: Jianxia Chen
>Priority: Major
>





[jira] [Created] (GEODE-8922) Remove ProductUseLog

2021-02-04 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8922:
-

 Summary: Remove ProductUseLog
 Key: GEODE-8922
 URL: https://issues.apache.org/jira/browse/GEODE-8922
 Project: Geode
  Issue Type: Improvement
  Components: membership
Reporter: Bruce J Schuchardt


A Locator logs the number of servers present in the cluster to a file that's of 
little use to anyone.  The log was added long ago in a weird attempt to monitor 
whether users were adhering to their license contract.  We should remove 
ProductUseLog and its tests.





[jira] [Commented] (GEODE-8920) Modify debug logging to make it easier to trace a message

2021-02-04 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17279106#comment-17279106
 ] 

Bruce J Schuchardt commented on GEODE-8920:
---

This could do the trick in DirectChannel.java:
{code:java}
if (logger.isDebugEnabled()) {
  StringBuilder sb = new StringBuilder();
  if (retry) {
    sb.append("Retrying send");
  } else {
    sb.append("Sending ").append(msg).append(" to ").append(p_destinations.length)
        .append(" nodes");
  }
  sb.append(" via these tcp/ip connections: ");
  for (Connection connection : cons) {
    sb.append("[").append(connection.getRemoteAddress()).append(", uid=")
        .append(connection.getUniqueId()).append("] ");
  }
  logger.debug(sb.toString());
}
{code}

> Modify debug logging to make it easier to trace a message
> -
>
> Key: GEODE-8920
> URL: https://issues.apache.org/jira/browse/GEODE-8920
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> Debug logging in DirectChannel lets us know the IDs of receivers of a message 
> and the toString of the message but it's very difficult to figure out what 
> thread on the receiving end is supposed to process that message.
> Here's an example of what we currently have:
> [debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
>  tid=0x4f0] Sending 
> (DLockRequestProcessor.DLockResponseMessage responding GRANT; 
> serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
> keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
> lockId=509) to 1 peers 
> ([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005])
>  via tcp/ip
> This does not tell you anything about the receiver except its ID.  On the 
> receiving side the thread that, in this run, would handle that message is 
> this:
> persistgemfire9_host1_8517  rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586):41006
>  unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> 
> tid=0x51
> I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
> Connection.  If you looked through the logs or stack traces of the receiver 
> and knew the uniqueId of the sending Connection you could easily locate the 
> thread that should receive this DLockResponseMessage.  Currently this is much 
> harder than it needs to be because the DirectChannel _Sending_ log message 
> doesn't include the _uniqueId_ of the Connections it is using to send the 
> message.
> Let's change that log message to include the _uniqueId_ of each outgoing 
> Connection.  Maybe something like this:
> Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip
> and on the receiving side we could be clearer about what the *uid* in the 
> thread's name means:
> persistgemfire9_host1_8517  rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire8_host1_8586:8586):41006
>  unshared ordered *sender uid=1036* dom #1 local port=47207 remote 
> port=42068> tid=0x51
> or something like that.
> Now we can look at the _Sending_ message and know that the receiving thread 
> will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
> write a program/script to trace a message and its consequences from one node 
> to another.
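
The tracing program mentioned above could start from something like the following sketch. It assumes the log format proposed in this ticket, where the sender's line and the receiver's thread name both contain a token of the form uid=<digits>; that format is a proposal, not something the logs are known to contain today. The class and method names (UidTraceSketch, uidsIn, receiverLinesFor) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UidTraceSketch {
  // Matches "uid=<digits>" as a whole token (assumed format, see lead-in).
  private static final Pattern UID = Pattern.compile("\\buid=(\\d+)\\b");

  // Collects every connection uid mentioned in a sender-side log line.
  static List<String> uidsIn(String sendingLine) {
    List<String> uids = new ArrayList<>();
    Matcher m = UID.matcher(sendingLine);
    while (m.find()) {
      uids.add(m.group(1));
    }
    return uids;
  }

  // Returns the receiver-side lines whose thread name carries exactly this uid.
  static List<String> receiverLinesFor(String uid, List<String> receiverLog) {
    Pattern exact = Pattern.compile("\\buid=" + Pattern.quote(uid) + "\\b");
    List<String> hits = new ArrayList<>();
    for (String line : receiverLog) {
      if (exact.matcher(line).find()) {
        hits.add(line);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Both lines below are made-up samples in the proposed format.
    String sending = "Sending (message) to 1 peers (peer ID), uid=1036 via tcp/ip";
    List<String> receiverLog = Arrays.asList(
        "unshared ordered sender uid=1036 dom #1 tid=0x51",
        "unshared ordered sender uid=10360 dom #1 tid=0x52");
    for (String uid : uidsIn(sending)) {
      System.out.println(uid + " -> " + receiverLinesFor(uid, receiverLog));
    }
  }
}
```

The word-boundary match keeps uid=1036 from also matching uid=10360, which matters once connection ids grow past a few digits.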





[jira] [Updated] (GEODE-8920) Modify debug logging to make it easier to trace a message

2021-02-04 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8920:
--
Description: 
Debug logging in DirectChannel lets us know the IDs of receivers of a message 
and the toString of the message but it's very difficult to figure out what 
thread on the receiving end is supposed to process that message.

Here's an example of what we currently have:

[debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
 tid=0x4f0] Sending 
(DLockRequestProcessor.DLockResponseMessage responding GRANT; 
serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
lockId=509) to 1 peers 
([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005])
 via tcp/ip

This does not tell you anything about the receiver except its ID.  On the 
receiving side the thread that, in this run, would handle that message is this:

persistgemfire9_host1_8517 :41006
 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51

I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
Connection.  If you looked through the logs or stack traces of the receiver and 
knew the uniqueId of the sending Connection you could easily locate the thread 
that should receive this DLockResponseMessage.  Currently this is much harder 
than it needs to be because the DirectChannel _Sending_ log message doesn't 
include the _uniqueId_ of the Connections it is using to send the message.

Let's change that log message to include the _uniqueId_ of each outgoing 
Connection.  Maybe something like this:

Sending (message.toString()) to 1 peers (peer ID)*, uid=1036* via tcp/ip

and on the receiving side we could be clearer about what the *uid* in the 
thread's name means:

persistgemfire9_host1_8517 :41006
 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> 
tid=0x51

or something like that.

Now we can look at the _Sending_ message and know that the receiving thread 
will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
write a program/script to trace a message and its consequences from one node to 
another.


  was:
Debug logging in DirectChannel lets us know the IDs of receivers of a message 
and the toString of the message but it's very difficult to figure out what 
thread on the receiving end is supposed to process that message.

Here's an example of what we currently have:

[debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
 tid=0x4f0] Sending 
(DLockRequestProcessor.DLockResponseMessage responding GRANT; 
serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
lockId=509) to 1 peers 
([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005])
 via tcp/ip

This does not tell you anything about the receiver except its ID.  On the 
receiving side the thread that, in this run, would handle that message is this:

persistgemfire9_host1_8517 :41006
 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51

I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
Connection.  If you looked through the logs or stack traces of the receiver and 
knew the uniqueId of the sending Connection you could easily locate the thread 
that should receive this DLockResponseMessage.  Currently this is much harder 
than it needs to be because the DirectChannel _Sending_ log message doesn't 
include the _uniqueId_ of the Connections it is using to send the message.

Let's change that log message to include the _uniqueId_ of each outgoing 
Connection.  Maybe something like this:

Sending (message.toString()) to 1 peers (peer ID)to 1 peers (peer ID)*, 
uid=1036* via tcp/ip

and on the receiving side we could be clearer about what the *uid* in the 
thread's name means:

persistgemfire9_host1_8517 :41006
 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> 
tid=0x51

or something like that.

Now we can look at the _Sending_ message and know that the receiving thread 
will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
write a program/script to trace a message and its consequences from one node to 
another.



> Modify debug logging to make it easier to trace a message
> -
>
> Key: GEODE-8920
> URL: https://issues.apache.org/jira/browse/GEODE-8920
> Project: Geode
>  Issue Type: Improvement
>  Components: membership
>Reporter: Bruce J Schuchardt
>Priority: Major
>
> Debug logging in DirectChannel lets us know the IDs of receivers of a message 
> and the toString of the message but it's very 

[jira] [Created] (GEODE-8920) Modify debug logging to make it easier to trace a message

2021-02-04 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8920:
-

 Summary: Modify debug logging to make it easier to trace a message
 Key: GEODE-8920
 URL: https://issues.apache.org/jira/browse/GEODE-8920
 Project: Geode
  Issue Type: Improvement
  Components: membership
Reporter: Bruce J Schuchardt


Debug logging in DirectChannel lets us know the IDs of receivers of a message 
and the toString of the message but it's very difficult to figure out what 
thread on the receiving end is supposed to process that message.

Here's an example of what we currently have:

[debug 2021/02/01 16:15:17.492 PST persistgemfire8_host1_8586 
 tid=0x4f0] Sending 
(DLockRequestProcessor.DLockResponseMessage responding GRANT; 
serviceName=__PDX(version 4); objectName=PDX_LOCK; responseCode=0; 
keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=509; 
lockId=509) to 1 peers 
([rs-GEM-3166-PL1535a2i32xlarge-hydra-client-36(persistgemfire9_host1_8517:8517):41005])
 via tcp/ip

This does not tell you anything about the receiver except its ID.  On the 
receiving side the thread that, in this run, would handle that message is this:

persistgemfire9_host1_8517 :41006
 unshared ordered *uid=1036* dom #1 local port=47207 remote port=42068> tid=0x51

I've highlighted the *uid* here because that is the _uniqueId_ of the sending 
Connection.  If you looked through the logs or stack traces of the receiver and 
knew the uniqueId of the sending Connection you could easily locate the thread 
that should receive this DLockResponseMessage.  Currently this is much harder 
than it needs to be because the DirectChannel _Sending_ log message doesn't 
include the _uniqueId_ of the Connections it is using to send the message.

Let's change that log message to include the _uniqueId_ of each outgoing 
Connection.  Maybe something like this:

Sending (message.toString()) to 1 peers (peer ID)to 1 peers (peer ID)*, 
uid=1036* via tcp/ip

and on the receiving side we could be clearer about what the *uid* in the 
thread's name means:

persistgemfire9_host1_8517 :41006
 unshared ordered *sender uid=1036* dom #1 local port=47207 remote port=42068> 
tid=0x51

or something like that.

Now we can look at the _Sending_ message and know that the receiving thread 
will have _uid=1036_ in its name.  Knowing this it ought to be possible to 
write a program/script to trace a message and its consequences from one node to 
another.






[jira] [Created] (GEODE-8919) revert renaming of GMS processMessage methods

2021-02-04 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8919:
-

 Summary: revert renaming of GMS processMessage methods
 Key: GEODE-8919
 URL: https://issues.apache.org/jira/browse/GEODE-8919
 Project: Geode
  Issue Type: Improvement
  Components: membership
Reporter: Bruce J Schuchardt


[~upthewaterspout] modified methods in the membership module that process 
membership messages so that they are now all named *processMessage*, but this 
makes it more difficult to read stack traces and know what type of message a 
thread is processing.  Let's make life easier for us and revert that change.  
Let's name each method after the type of message it processes so that we don't 
have to look at source code to figure it out.

This method, for instance, could be named *processInstallViewMessage* and we 
would know, without looking at source code, which type of message is being 
processed.

{noformat}
  at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:1053)
  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1330)
  at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1269)
  at org.jgroups.JChannel.invokeCallback(JChannel.java:816)
  at org.jgroups.JChannel.up(JChannel.java:741)
  at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030)
  at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
  at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
  at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077)
  at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792)
  at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433)
  at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:73)
  at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:72)
  at org.jgroups.protocols.TP.passMessageUp(TP.java:1658)
  at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876)
  at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10)
  at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789)
  at org.jgroups.protocols.TP.receive(TP.java:1714)
  at org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:152)
  at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701)
  at java.lang.Thread.run(Thread.java:748)
{noformat}
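
To illustrate why per-type names help, here is a small sketch showing that the handler's method name is exactly what ends up in a stack frame. Only the name processInstallViewMessage comes from the suggestion above; the message classes, the second handler, and the helper are hypothetical stand-ins and not the actual GMS code.

```java
class HandlerNamingSketch {
  // Hypothetical stand-ins for membership message types.
  static class InstallViewMessage {}
  static class JoinRequestMessage {}

  // Each handler is named after the message type it processes, so the type
  // is visible in any stack trace or thread dump that passes through it.
  static String processInstallViewMessage(InstallViewMessage message) {
    return callerMethodName();
  }

  static String processJoinRequestMessage(JoinRequestMessage message) {
    return callerMethodName();
  }

  // Returns the name of the calling method as it would appear in a trace.
  private static String callerMethodName() {
    // Frame 0 is this helper; frame 1 is the handler that called it.
    return new Throwable().getStackTrace()[1].getMethodName();
  }

  public static void main(String[] args) {
    System.out.println(processInstallViewMessage(new InstallViewMessage()));
    System.out.println(processJoinRequestMessage(new JoinRequestMessage()));
  }
}
```

With overloads that are all named processMessage, both calls would report the same method name, and only the line number would distinguish them.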





[jira] [Resolved] (GEODE-8767) NullPointerException in TCPConduit.getBufferPool due to conTable being null on Windows

2021-01-29 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-8767.
---
Fix Version/s: 1.14.0
   Resolution: Fixed

> NullPointerException in TCPConduit.getBufferPool due to conTable being null 
> on Windows
> --
>
> Key: GEODE-8767
> URL: https://issues.apache.org/jira/browse/GEODE-8767
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.14.0
>Reporter: Donal Evans
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> This failure was seen in the WindowsGfshDistributedTestOpenJDK11 CI pipeline 
> job:
> {noformat}
> org.apache.geode.management.MemberMXBeanDistributedTest > classMethod FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm1.log' at line 8350
> [fatal 2020/12/03 20:00:40.000 GMT  tid=630] While pushing 
> message  regionName=#testCreateRegion2 ,distTx=false)> to recipients: 
> <10.0.0.75(server-2:2444):41002, 10.0.0.75(server-3:4716):41003, 
> 10.0.0.75(server-4:6184):41004>
> java.lang.NullPointerException
>   at 
> org.apache.geode.internal.tcp.TCPConduit.getBufferPool(TCPConduit.java:949)
>   at 
> org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:298)
>   at 
> org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:513)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:346)
>   at 
> org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:291)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2053)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1981)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2018)
>   at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1083)
>   at 
> org.apache.geode.internal.cache.partitioned.PRSanityCheckMessage$1.run2(PRSanityCheckMessage.java:133)
>   at 
> org.apache.geode.internal.SystemTimer$SystemTimerTask.run(SystemTimer.java:334)
>   at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
>   at java.base/java.util.TimerThread.run(Timer.java:506)
> 3 tests completed, 1 failed
> {noformat}
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>  
> [http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0532/test-results/distributedTest/1607032539/]
>  
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> [http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0532/test-artifacts/1607032539/windows-gfshdistributedtest-OpenJDK11-1.14.0-build.0532.tgz]





[jira] [Commented] (GEODE-5526) CI Failure: ParallelWANStatsDUnitTest.testParallelPropagationHA fails with AssertionError for Queue Size

2021-01-07 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260862#comment-17260862
 ] 

Bruce J Schuchardt commented on GEODE-5526:
---

Same issue with a different test method in the same class:

{noformat}
org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > 
testParallelPropagationHAWithGroupTransactionEvents FAILED
java.lang.AssertionError: expected:<0> but was:<-20>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:633)
at 
org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest.testParallelPropagationHAWithGroupTransactionEvents(ParallelWANStatsDUnitTest.java:823)
{noformat}

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/726

> CI Failure: ParallelWANStatsDUnitTest.testParallelPropagationHA fails with 
> AssertionError for Queue Size
> 
>
> Key: GEODE-5526
> URL: https://issues.apache.org/jira/browse/GEODE-5526
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Priority: Major
>  Labels: swat
>
> Failed in Geode DistributedTests on August 3rd, 2018 with:
> {{org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > 
> testParallelPropagationHA FAILED}}
> {{java.lang.AssertionError: expected:<0> but was:<-3>}}
> {{at org.junit.Assert.fail(Assert.java:88)}}
> {{at org.junit.Assert.failNotEquals(Assert.java:834)}}
> {{at org.junit.Assert.assertEquals(Assert.java:645)}}
> {{at org.junit.Assert.assertEquals(Assert.java:631)}}
> {{at 
> org.apache.geode.internal.cache.wan.parallel.ParallelWANStatsDUnitTest.testParallelPropagationHA(ParallelWANStatsDUnitTest.java:429)}}
> On the assertion:
> {{assertEquals(0, v5List.get(0) + v6List.get(0) + v7List.get(0)); // queue 
> size}}





[jira] [Commented] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3

2021-01-07 Thread Bruce J Schuchardt (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260621#comment-17260621
 ] 

Bruce J Schuchardt commented on GEODE-8816:
---

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/723

> CI failure: SerialWanPropagationDUnitTest. 
> testReplicatedSerialPropagationWithRemoteRegionDestroy3
> --
>
> Key: GEODE-8816
> URL: https://issues.apache.org/jira/browse/GEODE-8816
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Bruce J Schuchardt
>Priority: Minor
>
> This test failed with a suspect string showing a functional problem with the 
> sender event processor.
> {noformat}
> org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > 
> testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm5.log' at line 737
> [error 2021/01/07 01:06:31.894 GMT  172.17.0.18(663):41005 unshared ordered uid=191 dom #1 local port=49013 
> remote port=40362> tid=1289] Exception occurred in CacheListener
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b
>  rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
>   at 
> org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
>   at 
> org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
>   at 
> org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
>   at 
> org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
>   at 
> org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
>   at 
> org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
>   at 
> org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
>   at 
> org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
>   at 
> org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
>   at 
> org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
>   at 
> org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
>   at 
> 

[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3

2021-01-07 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8816:
--
Priority: Minor  (was: Major)

> CI failure: SerialWanPropagationDUnitTest. 
> testReplicatedSerialPropagationWithRemoteRegionDestroy3
> --
>
> Key: GEODE-8816
> URL: https://issues.apache.org/jira/browse/GEODE-8816
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Bruce J Schuchardt
>Priority: Minor
>
> This test failed with a suspect string showing a functional problem with the 
> sender event processor.
> {noformat}
> org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > 
> testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in 'dunit_suspect-vm5.log' at line 737
> [error 2021/01/07 01:06:31.894 GMT  172.17.0.18(663):41005 unshared ordered uid=191 dom #1 local port=49013 
> remote port=40362> tid=1289] Exception occurred in CacheListener
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b
>  rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
>   at 
> org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
>   at 
> org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
>   at 
> org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
>   at 
> org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
>   at 
> org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
>   at 
> org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
>   at 
> org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
>   at 
> org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
>   at 
> org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
>   at 
> org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
>   at 
> org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
>   at 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
>   at 
> org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
>   at 
> org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
>   at 
> org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
>   at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
>   at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
>   at 
> 

[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3

2021-01-07 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8816:
--
Description: 
This test failed with a suspect string showing a functional problem with the 
sender event processor.

{noformat}
org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > 
testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
java.lang.AssertionError: Suspicious strings were written to the log during 
this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm5.log' at line 737

[error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 
remote port=40362> tid=1289] Exception occurred in CacheListener
java.util.concurrent.RejectedExecutionException: Task 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b
 rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
at 
org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
at 
org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
at 
org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
at 
org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
at 
org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
at 
org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
at 
org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
at 
org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
at 
org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
at 
org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
at 
org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.dispatchMessage(GMSMembership.java:930)
at 

[jira] [Updated] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3

2021-01-07 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8816:
--
Description: 
This test failed with a suspect string showing a functional problem with the 
sender event processor.

{noformat}
org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > 
testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
java.lang.AssertionError: Suspicious strings were written to the log during 
this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm5.log' at line 737

[error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 
remote port=40362> tid=1289] Exception occurred in CacheListener
java.util.concurrent.RejectedExecutionException: Task 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b
 rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
at 
org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
at 
org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
at 
org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
at 
org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
at 
org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
at 
org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
at 
org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
at 
org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
at 
org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
at 
org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
at 
org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
at 
org.apache.geode.distributed.internal.membership.gms.GMSMembership.dispatchMessage(GMSMembership.java:930)
at 

[jira] [Created] (GEODE-8816) CI failure: SerialWanPropagationDUnitTest. testReplicatedSerialPropagationWithRemoteRegionDestroy3

2021-01-07 Thread Bruce J Schuchardt (Jira)
Bruce J Schuchardt created GEODE-8816:
-

 Summary: CI failure: SerialWanPropagationDUnitTest. 
testReplicatedSerialPropagationWithRemoteRegionDestroy3
 Key: GEODE-8816
 URL: https://issues.apache.org/jira/browse/GEODE-8816
 Project: Geode
  Issue Type: Bug
  Components: wan
Reporter: Bruce J Schuchardt


This test failed with a suspect string showing a functional problem with the 
sender queues.
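
For readers unfamiliar with the exception in the trace that follows: a java.util.concurrent.ThreadPoolExecutor that has been shut down rejects new tasks, and with the default AbortPolicy it does so by throwing RejectedExecutionException. The trace shows the executor in the Terminated state when handlePrimaryDestroy tried to submit work. The sketch below only reproduces that generic JDK behavior; it is not Geode code, and the class and method names (RejectedAfterShutdownSketch, trySubmit) are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not taken from the Geode code base.
class RejectedAfterShutdownSketch {

  /** Returns true if the executor accepted the task, false if it rejected it. */
  static boolean trySubmit(ExecutorService executor, Runnable task) {
    try {
      executor.execute(task);
      return true;
    } catch (RejectedExecutionException e) {
      // The default AbortPolicy throws once the executor has been shut down.
      return false;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newFixedThreadPool(1);
    System.out.println("before shutdown: accepted=" + trySubmit(executor, () -> {}));

    executor.shutdown();
    executor.awaitTermination(5, TimeUnit.SECONDS);

    // The executor is now terminated, so this submission is rejected.
    System.out.println("after shutdown: accepted=" + trySubmit(executor, () -> {}));
  }
}
```

Whether the caller should swallow the rejection, log it, or check the sender state first is a design decision this sketch does not try to answer.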

{noformat}
org.apache.geode.internal.cache.wan.serial.SerialWANPropagationDUnitTest > 
testReplicatedSerialPropagationWithRemoteRegionDestroy3 FAILED
java.lang.AssertionError: Suspicious strings were written to the log during 
this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm5.log' at line 737

[error 2021/01/07 01:06:31.894 GMT :41005 unshared ordered uid=191 dom #1 local port=49013 
remote port=40362> tid=1289] Exception occurred in CacheListener
java.util.concurrent.RejectedExecutionException: Task 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor$2@3f0ab37b
 rejected from java.util.concurrent.ThreadPoolExecutor@1a4d5355[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2071]
at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderEventProcessor.handlePrimaryDestroy(SerialGatewaySenderEventProcessor.java:607)
at 
org.apache.geode.internal.cache.wan.serial.SerialSecondaryGatewayListener.afterDestroy(SerialSecondaryGatewayListener.java:91)
at 
org.apache.geode.internal.cache.EnumListenerEvent$AFTER_DESTROY.dispatchEvent(EnumListenerEvent.java:178)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchEvent(LocalRegion.java:8265)
at 
org.apache.geode.internal.cache.LocalRegion.dispatchListenerEvent(LocalRegion.java:6974)
at 
org.apache.geode.internal.cache.LocalRegion.invokeDestroyCallbacks(LocalRegion.java:6775)
at 
org.apache.geode.internal.cache.EntryEventImpl.invokeCallbacks(EntryEventImpl.java:2446)
at 
org.apache.geode.internal.cache.entries.AbstractRegionEntry.dispatchListenerEvents(AbstractRegionEntry.java:164)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroyPart2(LocalRegion.java:6716)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:414)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:244)
at 
org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:152)
at 
org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:968)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6505)
at 
org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6479)
at 
org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:59)
at 
org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
at 
org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1730)
at 
org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderQueue$SerialGatewaySenderQueueMetaRegion.basicDestroy(SerialGatewaySenderQueue.java:1387)
at 
org.apache.geode.internal.cache.LocalRegion.localDestroy(LocalRegion.java:2230)
at 
org.apache.geode.internal.cache.DistributedRegion.localDestroy(DistributedRegion.java:967)
at 
org.apache.geode.internal.cache.wan.serial.BatchDestroyOperation$DestroyMessage.operateOnRegion(BatchDestroyOperation.java:121)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1208)
at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1110)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
at 
org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
at 
org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
at 

[jira] [Resolved] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented

2021-01-06 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt resolved GEODE-5922.
---
Resolution: Won't Fix

The fix for this problem has been reverted on develop and all support branches. 
 The change to use a fair lock instead of Java synchronization caused queuing 
to be about 3x slower under heavy load.

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: blocks-1.14.0​, pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a 
> time.  Synchronization isn't a fair locking mechanism so threads can be 
> blocked trying to add events to the queue while other more recent events get 
> the lock and insert their events.  This causes inconsistent latency which 
> I've observed being as long as 30 seconds, causing client connections to be 
> shut down by the ClientHealthMonitor.
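
As a rough illustration of the alternative discussed in this issue: Java's synchronized keyword makes no ordering promise among waiting threads, whereas a ReentrantLock constructed with fairness set to true favors the longest-waiting thread. The sketch below guards a plain queue with such a fair lock. It is not the SerialGatewaySenderQueue implementation or the reverted change; FairQueueSketch and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of a fair-lock-guarded queue, not Geode code.
class FairQueueSketch<E> {
  private final Queue<E> events = new ArrayDeque<>();
  // Passing true requests the fair ordering policy (longest waiter goes first).
  private final ReentrantLock lock = new ReentrantLock(true);

  void put(E event) {
    lock.lock();
    try {
      events.add(event);
    } finally {
      lock.unlock();
    }
  }

  int size() {
    lock.lock();
    try {
      return events.size();
    } finally {
      lock.unlock();
    }
  }

  boolean isFair() {
    return lock.isFair();
  }

  public static void main(String[] args) throws InterruptedException {
    FairQueueSketch<Integer> queue = new FairQueueSketch<>();
    Thread[] producers = new Thread[4];
    for (int t = 0; t < producers.length; t++) {
      producers[t] = new Thread(() -> {
        for (int i = 0; i < 1000; i++) {
          queue.put(i);
        }
      });
      producers[t].start();
    }
    for (Thread producer : producers) {
      producer.join();
    }
    System.out.println("fair=" + queue.isFair() + " size=" + queue.size());
  }
}
```

Fairness bounds how long any one thread waits, but the JDK documentation notes that fair locks generally have lower overall throughput than the default setting, which is consistent with the slowdown reported when this issue was resolved.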



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented

2021-01-06 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-5922:
--
Labels: blocks-1.14.0​ pull-request-available  (was: pull-request-available)

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: blocks-1.14.0​, pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a 
> time.  Synchronization isn't a fair locking mechanism so threads can be 
> blocked trying to add events to the queue while other more recent events get 
> the lock and insert their events.  This causes inconsistent latency which 
> I've observed being as long as 30 seconds, causing client connections to be 
> shut down by the ClientHealthMonitor.





[jira] [Reopened] (GEODE-5922) SerialGatewaySenderQueue concurrency is poorly implemented

2021-01-06 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt reopened GEODE-5922:
---

The fix for this issue caused a 3x performance degradation in adding new events 
to the async queue.  The fix needs to be reverted and reevaluated.  A 
performance test of the AEQ with heavy load should be created to vet any new 
fix.
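
A minimal sketch of the kind of contention measurement suggested above, assuming a shared counter stands in for the queue: several threads increment it either under synchronized or under a fair ReentrantLock, and the elapsed time is reported for each. This is not a Geode performance test and does not model the AEQ; LockContentionSketch and measure are hypothetical names, and the timings vary by JVM and hardware.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical micro-measurement, not a Geode test and not a rigorous benchmark.
class LockContentionSketch {

  /** Returns {elapsedNanos, finalCount} for the chosen locking strategy. */
  static long[] measure(int threads, int perThread, boolean useFairLock) {
    final long[] count = {0};
    final Object monitor = new Object();
    final ReentrantLock fairLock = new ReentrantLock(true);
    final Runnable increment = useFairLock
        ? () -> {
          fairLock.lock();
          try {
            count[0]++;
          } finally {
            fairLock.unlock();
          }
        }
        : () -> {
          synchronized (monitor) {
            count[0]++;
          }
        };

    // The latch releases all workers at once so they contend for the lock.
    final CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < perThread; i++) {
          increment.run();
        }
      });
      workers[t].start();
    }

    long begin = System.nanoTime();
    start.countDown();
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return new long[] {System.nanoTime() - begin, count[0]};
  }

  public static void main(String[] args) {
    long[] sync = measure(4, 20_000, false);
    long[] fair = measure(4, 20_000, true);
    System.out.println("synchronized: " + sync[0] / 1_000_000 + " ms, count=" + sync[1]);
    System.out.println("fair lock:    " + fair[0] / 1_000_000 + " ms, count=" + fair[1]);
  }
}
```

A real test for vetting a new fix would need warm-up iterations, repeated runs, and the actual queue under load; a harness such as JMH is the usual choice for that.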

> SerialGatewaySenderQueue concurrency is poorly implemented
> --
>
> Key: GEODE-5922
> URL: https://issues.apache.org/jira/browse/GEODE-5922
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This class uses synchronization on the queue to limit access to one put at a 
> time.  Synchronization isn't a fair locking mechanism so threads can be 
> blocked trying to add events to the queue while other more recent events get 
> the lock and insert their events.  This causes inconsistent latency which 
> I've observed being as long as 30 seconds, causing client connections to be 
> shut down by the ClientHealthMonitor.





[jira] [Updated] (GEODE-8567) CI Failure: ConcurrentSerialGatewaySenderOperationsDistributedTest > testRestartSerialGatewaySendersWhilePutting

2020-12-08 Thread Bruce J Schuchardt (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce J Schuchardt updated GEODE-8567:
--
Labels: no-release-note  (was: )

> CI Failure: ConcurrentSerialGatewaySenderOperationsDistributedTest > 
> testRestartSerialGatewaySendersWhilePutting
> 
>
> Key: GEODE-8567
> URL: https://issues.apache.org/jira/browse/GEODE-8567
> Project: Geode
>  Issue Type: Improvement
>  Components: wan
>Reporter: Owen Nichols
>Priority: Major
>  Labels: no-release-note
>
> ConcurrentSerialGatewaySenderOperationsDistributedTest > 
> testRestartSerialGatewaySendersWhilePutting[1: numDispatchers=3] FAILED
> seen in 
> [DistributedTestOpenJDK11|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/490]
>  #490
> > Task :geode-wan:distributedTest
> org.apache.geode.internal.cache.wan.concurrent.ConcurrentSerialGatewaySenderOperationsDistributedTest
>  > testRestartSerialGatewaySendersWhilePutting[1: numDispatchers=3] FAILED
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderOperationsDistributedTest$$Lambda$490/0x00010094b040.run
>  in VM 5 running on Host 60d64fc07216 with 8 VMs
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.internal.cache.wan.serial.SerialGatewaySenderOperationsDistributedTest
>  that uses org.apache.geode.internal.cache.wan.InternalGatewaySender, 
> org.apache.geode.internal.cache.wan.InternalGatewaySenderint [Sender 
> statistics unprocessed event map size] expected:<[0]> but was:<[2]> within 5 
> minutes.
> Caused by:
> org.junit.ComparisonFailure: [Sender statistics unprocessed event 
> map size] expected:<[0]> but was:<[2]>
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> [*http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0380/test-results/distributedTest/1601531249/*]
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> [*http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0380/test-artifacts/1601531249/distributedtestfiles-OpenJDK11-1.14.0-build.0380.tgz*]




