[jira] [Resolved] (GEODE-9407) RegionDestroyedException while executing GetMemberInformationFunction

2021-11-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-9407.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> RegionDestroyedException while executing GetMemberInformationFunction
> -
>
> Key: GEODE-9407
> URL: https://issues.apache.org/jira/browse/GEODE-9407
> Project: Geode
> Issue Type: Bug
> Components: gfsh, management
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> GetMemberInformationFunction is used by the gfsh "describe member" command, 
> by the management REST API "/members" endpoint, and internally within 
> Geode. If this function is invoked while a region is concurrently being 
> destroyed, it may throw RegionDestroyedException while trying to gather 
> information about the destroyed region's subregions.
>  
> This bug manifests as an error-level message with a full stack trace in the 
> logs of the member where the function was executing (shown below). This 
> confuses Geode users/developers/operators because it looks like a problem 
> with the system when it is actually expected behavior. 
> GetMemberInformationFunction should probably catch RegionDestroyedException 
> and remove the destroyed region from the set of region names in 
> ManagementUtils.getAllRegionNames.
>  
> {code:java}
> [error 2021/06/29 23:01:38.640 GMT system-test-gemfire-server-0 <Function Execution Processor3> tid=0x94] Unable to gather runtime information on this member.
> org.apache.geode.cache.RegionDestroyedException: Partitioned Region @79f60edb [path='/region'; dataPolicy=PARTITION; prId=37; isDestroyed=true; isClosed=false; retryTimeout=360; serialNumber=4309; partition attributes=PartitionAttributes@1299510666[redundantCopies=2;localMaxMemory=594;totalMaxMemory=2147483647;totalNumBuckets=113;partitionResolver=null;colocatedWith=null;recoveryDelay=-1;startupRecoveryDelay=0;FixedPartitionAttributes=null;partitionListeners=null]; on VM system-test-gemfire-server-0(system-test-gemfire-server-0:1):41000]
>  at org.apache.geode.internal.cache.LocalRegion.checkRegionDestroyed(LocalRegion.java:7342)
>  at org.apache.geode.internal.cache.LocalRegion.checkReadiness(LocalRegion.java:2757)
>  at org.apache.geode.internal.cache.LocalRegion.subregions(LocalRegion.java:1908)
>  at org.apache.geode.management.internal.util.ManagementUtils.getAllRegionNames(ManagementUtils.java:167)
>  at org.apache.geode.management.internal.functions.GetMemberInformationFunction.getMemberInformation(GetMemberInformationFunction.java:131)
>  at org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:52)
>  at org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:35)
>  at org.apache.geode.management.internal.functions.CacheRealizationFunction.executeGet(CacheRealizationFunction.java:136)
>  at org.apache.geode.management.internal.functions.CacheRealizationFunction.execute(CacheRealizationFunction.java:92)
>  at org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:201)
>  at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
>  at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
>  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444)
>  at org.apache.geode.distributed.internal.ClusterOperationExecutors.doFunctionExecutionThread(ClusterOperationExecutors.java:379)
>  at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
>  at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
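
The description suggests catching RegionDestroyedException and dropping the destroyed region from the collected names. A minimal sketch of that idea follows. It is not the change that was merged for this issue. FakeRegion and the nested RegionDestroyedException are hypothetical stand-ins defined here so the example runs without Geode on the classpath, and the recursion is an assumption about how the names might be gathered.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RegionNameCollector {

  /** Simplified stand-in for Geode's RegionDestroyedException (not the real class). */
  public static class RegionDestroyedException extends RuntimeException {
    public RegionDestroyedException(String path) {
      super("Region destroyed: " + path);
    }
  }

  /** Hypothetical stand-in for a region; it models only what this sketch needs. */
  public static class FakeRegion {
    private final String fullPath;
    private final List<FakeRegion> children = new ArrayList<>();
    private volatile boolean destroyed;

    public FakeRegion(String fullPath) {
      this.fullPath = fullPath;
    }

    public FakeRegion add(FakeRegion child) {
      children.add(child);
      return this;
    }

    public void destroy() {
      destroyed = true;
    }

    public String getFullPath() {
      return fullPath;
    }

    /** Fails once the region has been destroyed, like the readiness check in the trace. */
    public List<FakeRegion> subregions() {
      if (destroyed) {
        throw new RegionDestroyedException(fullPath);
      }
      return children;
    }
  }

  /** Collects the paths of all live regions, skipping any destroyed mid-scan. */
  public static Set<String> getAllRegionNames(Collection<FakeRegion> rootRegions) {
    Set<String> names = new TreeSet<>();
    for (FakeRegion root : rootRegions) {
      collect(root, names);
    }
    return names;
  }

  private static void collect(FakeRegion region, Set<String> names) {
    names.add(region.getFullPath());
    try {
      for (FakeRegion sub : region.subregions()) {
        collect(sub, names);
      }
    } catch (RegionDestroyedException e) {
      // Expected race with a concurrent destroy: drop the region, do not log an error.
      names.remove(region.getFullPath());
    }
  }
}
```

Each region is handled in its own try block, so one destroyed region does not prevent the remaining regions from being reported.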



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-13 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-9666.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
> Issue Type: Bug
> Components: membership
> Affects Versions: 1.15.0
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: needsTriage, pull-request-available
> Fix For: 1.15.0
>
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods and 3 server Pods
>  * Deploys 5 Spring Boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates that no exceptions were thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect to any locators in the list [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334, system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
>   at 
> 

[jira] [Assigned] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9666:


Assignee: Aaron Lindsey

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
> Issue Type: Bug
> Components: membership
> Affects Versions: 1.15.0
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: needsTriage
>

[jira] [Assigned] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9666:


Assignee: (was: Aaron Lindsey)

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
> Issue Type: Bug
> Components: membership
> Affects Versions: 1.15.0
> Reporter: Aaron Lindsey
> Priority: Major
> Labels: needsTriage
>

[jira] [Commented] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424702#comment-17424702
 ] 

Aaron Lindsey commented on GEODE-9666:
--

I no longer see the {{NoAvailableLocatorsException}} when running our test with 
this change: 
[https://github.com/apache/geode/compare/develop...aaronlindsey:GEODE-9666-NoAvailableLocatorsException?expand=1]
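
The contents of the linked branch are not reproduced in this thread. Purely as an illustration of the failure mode named in the issue title, the sketch below assumes the problem is a client holding on to an IP address resolved before a locator Pod restarted. LocatorEndpoint, its resolver parameter, and the ipv4 helper are hypothetical names, not Geode APIs. The only library behavior relied on is that java.net.InetSocketAddress fixes its IP address at construction time, so a fresh instance has to be built to pick up a new DNS answer.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.function.Function;

/** Hypothetical holder that keeps the locator host name rather than a resolved IP. */
public class LocatorEndpoint {
  private final String hostName;
  private final int port;
  private final Function<String, InetAddress> resolver;

  public LocatorEndpoint(String hostName, int port, Function<String, InetAddress> resolver) {
    this.hostName = hostName;
    this.port = port;
    this.resolver = resolver;
  }

  /** Resolves the host name on every call, so a restarted Pod's new IP is picked up. */
  public InetSocketAddress currentAddress() {
    InetAddress address = resolver.apply(hostName);
    if (address == null) {
      // Name not resolvable right now: return an unresolved address for a later retry.
      return InetSocketAddress.createUnresolved(hostName, port);
    }
    return new InetSocketAddress(address, port);
  }

  /** DNS-backed resolver for real use; returns null when the name is unknown. */
  public static InetAddress dnsLookup(String host) {
    try {
      return InetAddress.getByName(host);
    } catch (UnknownHostException e) {
      return null;
    }
  }

  /** Builds an IPv4 address from literal octets without performing a DNS lookup. */
  public static InetAddress ipv4(int a, int b, int c, int d) {
    try {
      return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException(e);
    }
  }
}
```

A client would pass LocatorEndpoint::dnsLookup as the resolver and call currentAddress() before each connection attempt. The resolver is injectable only so the behavior can be exercised without a real DNS server.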

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
> Issue Type: Bug
> Components: membership
> Affects Versions: 1.15.0
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: needsTriage
>

[jira] [Created] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-01 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-9666:


Summary: Client throws NoAvailableLocatorsException after locators change IP addresses
Key: GEODE-9666
URL: https://issues.apache.org/jira/browse/GEODE-9666
Project: Geode
Issue Type: Bug
Components: membership
Affects Versions: 1.15.0
Reporter: Aaron Lindsey


We have a test for Geode on Kubernetes which:
 * Deploys a Geode cluster consisting of 2 locator Pods and 3 server Pods
 * Deploys 5 Spring Boot client Pods which continually do PUTs and GETs
 * Triggers a rolling restart of the locator Pods
 ** The rolling restart operation restarts one locator at a time, waiting for 
each restarted locator to become fully online before restarting the next locator
 * Stops the client operations and validates that no exceptions were thrown in 
the clients.

Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
clients:

{code:none}
org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect to any locators in the list [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334, system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
  at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
  at org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
  at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
  at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
  at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
  at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
  at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
  at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
  at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
  at org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
  at org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
  at org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
  at org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
  at org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
  at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
  at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
  at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
  at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
  at org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
  at org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
  at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
  at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
  at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:894)
  at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
  at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
  at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1063)
  at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:963)
  at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
   

[jira] [Assigned] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9666:


Assignee: Aaron Lindsey

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
> Issue Type: Bug
> Components: membership
> Affects Versions: 1.15.0
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: needsTriage
>

[jira] [Commented] (GEODE-9463) Default serialization filter rejects SerializableRegionRedundancyStatusImpl

2021-08-03 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392540#comment-17392540
 ] 

Aaron Lindsey commented on GEODE-9463:
--

[~upthewaterspout] [~eshu] thanks for getting to the bottom of this!

> Default serialization filter rejects SerializableRegionRedundancyStatusImpl
> ---
>
> Key: GEODE-9463
> URL: https://issues.apache.org/jira/browse/GEODE-9463
> Project: Geode
> Issue Type: Bug
> Components: serialization
> Affects Versions: 1.13.0, 1.14.0
> Reporter: Aaron Lindsey
> Assignee: Eric Shu
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.14.0
> Attachments: logs-1.tgz, logs-2.tgz
>
>
> When validate-serializable-objects=true, there are exceptions in the logs 
> related to serializing the class SerializableRegionRedundancyStatusImpl. This 
> is an internal class which should be allowed by the default serializable 
> object filter.
> We saw this issue happen on Kubernetes while invoking rebalance and restore 
> redundancy operations on the cluster. I attached logs from 2 separate test 
> failures due to this issue.
> {code:java}
> [fatal 2021/07/22 00:14:31.392 GMT system-test-gemfire-locator-1 tid=0x51] Serialization filter is rejecting class org.apache.geode.internal.cache.control.SerializableRegionRedundancyStatusImpl
> java.lang.Exception:
>   at org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
>   at com.sun.proxy.$Proxy23.checkInput(Unknown Source)
>   at java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
>   at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
>   at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1460)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at org.apache.geode.internal.InternalDataSerializer.readSerializable(InternalDataSerializer.java:2689)
>   at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2633)
>   at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
>   at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:102)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2049)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2041)
>   at org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:138)
>   at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1277)
> at 
> 

[jira] [Commented] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-29 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390194#comment-17390194
 ] 

Aaron Lindsey commented on GEODE-8200:
--

[~jchen21] yes, checking the restore status is done using the GET 
management/v1/operations/restoreRedundancy/{id} endpoint.
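
As a minimal sketch of that status check, the request could be built with the JDK's java.net.http API (Java 11 or later). The class name, the baseUrl parameter and the 10-second timeout are assumptions for illustration; only the endpoint path comes from the comment above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper, not part of Geode: builds the GET request for the
// restore-redundancy status endpoint named in the comment above.
final class RestoreStatusRequestSketch {

  // baseUrl (scheme and host of the locator's management service) is an assumption.
  static HttpRequest statusRequest(String baseUrl, String operationId) {
    URI uri = URI.create(
        baseUrl + "/management/v1/operations/restoreRedundancy/" + operationId);
    return HttpRequest.newBuilder(uri)
        .timeout(Duration.ofSeconds(10)) // assumed per-request timeout
        .GET()
        .build();
  }
}
```

Sending the request with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) would return the response body; the caller would then presumably inspect a "statusCode" field like the one in the rebalance sample quoted below.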

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Aaron Lindsey
> Assignee: Jianxia Chen
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.15.0
> Fix For: 1.13.1, 1.14.0
>
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-29 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390136#comment-17390136
 ] 

Aaron Lindsey commented on GEODE-8200:
--

[~agingade] it looks like the first one—start restore, then periodically check 
the restore status until the restore completes. After the restore has 
completed, if the status says the restore failed for any reason it will retry 
the whole process. It repeats this process until it eventually gets a 
successful restore.
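
The loop described above can be sketched roughly as follows. This is not the actual hook, which is closed-source: every name here is hypothetical, the SUCCEEDED and FAILED status values are placeholders (only IN_PROGRESS appears in this ticket), and the maxPolls and maxAttempts bounds are an added assumption so that a restore stuck in IN_PROGRESS cannot block the caller forever.

```java
import java.time.Duration;

// Hypothetical sketch of the hook logic described above; none of these names
// come from Geode or from the closed-source hook.
final class RestoreRetrySketch {

  // Only IN_PROGRESS appears in the ticket; the other two values are placeholders.
  enum Status { IN_PROGRESS, SUCCEEDED, FAILED }

  // Stand-in for the two management REST calls (start and status check).
  interface RestoreClient {
    String start();            // starts a restore and returns its operation id
    Status status(String id);  // returns the current status of that operation
  }

  // Returns the number of attempts needed. maxPolls and maxAttempts are added
  // bounds (an assumption); the hook as described retries without limit.
  static int restoreUntilSuccess(
      RestoreClient client, Duration pollInterval, int maxPolls, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      String id = client.start();
      Status status = client.status(id);
      for (int polls = 1; status == Status.IN_PROGRESS && polls < maxPolls; polls++) {
        sleep(pollInterval);
        status = client.status(id);
      }
      if (status == Status.SUCCEEDED) {
        return attempt;
      }
      // FAILED, or still IN_PROGRESS after maxPolls checks: retry the whole process.
    }
    throw new IllegalStateException(
        "restore did not succeed in " + maxAttempts + " attempts");
  }

  private static void sleep(Duration d) {
    try {
      Thread.sleep(d.toMillis());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for restore", e);
    }
  }
}
```

With bounds like these, the stuck IN_PROGRESS case surfaces as an exception the caller can report instead of an endless wait.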

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Aaron Lindsey
> Assignee: Jianxia Chen
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.15.0
> Fix For: 1.13.1, 1.14.0
>
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}





[jira] [Comment Edited] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-26 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387644#comment-17387644
 ] 

Aaron Lindsey edited comment on GEODE-8200 at 7/26/21, 11:01 PM:
-

[~agingade] We first saw this issue on July 16 while testing Geode at commit 
[https://github.com/apache/geode/commit/8b7a1a242290523310080a13338c1f85a283c684].
 The issue was discovered in an automated test which had been passing 
consistently for some time. We have only seen the issue happen with the 
"restore redundancy" operation, but I cannot say for sure that the issue does 
not happen for the "rebalance" operation as well.

We have a closed-source test which reproduces the issue, but it does not 
reproduce the issue every time. The test does a rolling restart of the Geode 
cluster by restarting up to one locator and one server at a time. We have a 
Kubernetes hook which runs "restore redundancy" right before a server is 
stopped to reduce the chance of data loss. The hook is implemented such that 
the "restore redundancy" operation must succeed before the server can be 
stopped. Note that this is the exact same scenario as described in the original 
ticket description, except that we now use "restore redundancy" instead of 
"rebalance".


was (Author: aaronlindsey):
[~agingade] We first saw this issue on July 16 while testing Geode at commit 
[https://github.com/apache/geode/commit/8b7a1a242290523310080a13338c1f85a283c684.]
 The issue was discovered in an automated test which had been passing 
consistently for some time. We have only seen the issue happen with the 
"restore redundancy" operation, but I cannot say for sure that the issue does 
not happen for the "rebalance" operation as well.

We have a closed-source test which reproduces the issue, but it does not 
reproduce the issue every time. The test does a rolling restart of the Geode 
cluster by restarting up to one locator and one server at a time. We have a 
Kubernetes hook which runs "restore redundancy" right before a server is 
stopped to reduce the chance of data loss. The hook is implemented such that 
the "restore redundancy" operation must succeed before the server can be 
stopped. Note that this is the exact same scenario as described in the original 
ticket description, except that we now use "restore redundancy" instead of 
"rebalance".

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Aaron Lindsey
> Assignee: Jianxia Chen
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.15.0
> Fix For: 1.13.1, 1.14.0
>
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}




[jira] [Commented] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-26 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387644#comment-17387644
 ] 

Aaron Lindsey commented on GEODE-8200:
--

[~agingade] We first saw this issue on July 16 while testing Geode at commit 
[https://github.com/apache/geode/commit/8b7a1a242290523310080a13338c1f85a283c684.]
 The issue was discovered in an automated test which had been passing 
consistently for some time. We have only seen the issue happen with the 
"restore redundancy" operation, but I cannot say for sure that the issue does 
not happen for the "rebalance" operation as well.

We have a closed-source test which reproduces the issue, but it does not 
reproduce the issue every time. The test does a rolling restart of the Geode 
cluster by restarting up to one locator and one server at a time. We have a 
Kubernetes hook which runs "restore redundancy" right before a server is 
stopped to reduce the chance of data loss. The hook is implemented such that 
the "restore redundancy" operation must succeed before the server can be 
stopped. Note that this is the exact same scenario as described in the original 
ticket description, except that we now use "restore redundancy" instead of 
"rebalance".

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Aaron Lindsey
> Assignee: Jianxia Chen
> Priority: Major
> Labels: GeodeOperationAPI, blocks-1.15.0
> Fix For: 1.13.1, 1.14.0
>
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}





[jira] [Updated] (GEODE-9463) Default serialization filter rejects SerializableRegionRedundancyStatusImpl

2021-07-26 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-9463:
-
Description: 
When validate-serializable-objects=true, there are exceptions in the logs 
related to serializing the class SerializableRegionRedundancyStatusImpl. This 
is an internal class which should be allowed by the default serializable object 
filter.
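
To illustrate the mechanism only: the stack trace below goes through the JDK's ObjectInputStream filter check, so the behaviour can be sketched with plain java.io.ObjectInputFilter (Java 9 or later). This is not Geode's filter code. The class names and filter patterns are hypothetical, with RedundancyStatus standing in for an internal class that an allow-list omits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustration with plain JDK serialization filters (Java 9+). This is not
// Geode's filter code; all names here are hypothetical.
final class SerializationFilterSketch {

  // Stand-in for an internal class that an allow-list might omit.
  static final class RedundancyStatus implements Serializable {
    private static final long serialVersionUID = 1L;
    final int redundantCopies;

    RedundancyStatus(int redundantCopies) {
      this.redundantCopies = redundantCopies;
    }
  }

  static byte[] serialize(Serializable value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Returns true if the filter pattern lets the payload deserialize, false if
  // deserialization fails with InvalidClassException (in this sketch, the
  // only expected cause is a filter rejection).
  static boolean accepts(byte[] payload, String filterPattern) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
      in.setObjectInputFilter(ObjectInputFilter.Config.createFilter(filterPattern));
      in.readObject();
      return true;
    } catch (InvalidClassException rejected) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A pattern such as RedundancyStatus.class.getName() + ";!*" accepts the payload, while "java.util.*;!*" rejects it because the class is not on the allow-list.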

We saw this issue happen on Kubernetes while invoking rebalance and restore 
redundancy operations on the cluster. I attached logs from 2 separate test 
failures due to this issue.
{code:java}
[fatal 2021/07/22 00:14:31.392 GMT system-test-gemfire-locator-1 tid=0x51] Serialization filter is rejecting class org.apache.geode.internal.cache.control.SerializableRegionRedundancyStatusImpl
java.lang.Exception:
  at org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
  at com.sun.proxy.$Proxy23.checkInput(Unknown Source)
  at java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
  at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
  at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
  at java.base/java.util.HashMap.readObject(HashMap.java:1460)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
  at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
  at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
  at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
  at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
  at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
  at org.apache.geode.internal.InternalDataSerializer.readSerializable(InternalDataSerializer.java:2689)
  at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2633)
  at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
  at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:102)
  at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2049)
  at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2041)
  at org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:138)
  at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1277)
  at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:91)
  at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:86)
  at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.moveNext(EntriesSet.java:187)
  at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.(EntriesSet.java:119)
  at org.apache.geode.internal.cache.EntriesSet.iterator(EntriesSet.java:84)
  at org.apache.geode.management.internal.operation.RegionOperationStateStore.list(RegionOperationStateStore.java:102)
  at org.apache.geode.management.internal.operation.OperationHistoryManager.expireHistory(OperationHistoryManager.java:74)
  at org.apache.geode.management.internal.operation.OperationHistoryManager.recordStart(OperationHistoryManager.java:120)
at 

[jira] [Updated] (GEODE-9463) Default serialization filter rejects SerializableRegionRedundancyStatusImpl

2021-07-26 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-9463:
-
Attachment: logs-2.tgz

> Default serialization filter rejects SerializableRegionRedundancyStatusImpl
> ---
>
> Key: GEODE-9463
> URL: https://issues.apache.org/jira/browse/GEODE-9463
> Project: Geode
> Issue Type: Bug
> Components: serialization
> Reporter: Aaron Lindsey
> Priority: Major
> Attachments: logs-1.tgz, logs-2.tgz
>
>
> When validate-serializable-objects=true, there are exceptions in the logs 
> related to serializing the class SerializableRegionRedundancyStatusImpl. This 
> is an internal class which should be allowed by the default serializable 
> object filter.
> We saw this issue happen on Kubernetes while invoking rebalance and restore 
> redundancy operations on the cluster.
> {code:java}
> [fatal 2021/07/22 00:14:31.392 GMT system-test-gemfire-locator-1 tid=0x51] Serialization filter is rejecting class org.apache.geode.internal.cache.control.SerializableRegionRedundancyStatusImpl
> java.lang.Exception:
>   at org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
>   at com.sun.proxy.$Proxy23.checkInput(Unknown Source)
>   at java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
>   at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
>   at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1460)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at org.apache.geode.internal.InternalDataSerializer.readSerializable(InternalDataSerializer.java:2689)
>   at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2633)
>   at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
>   at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:102)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2049)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2041)
>   at org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:138)
>   at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1277)
>   at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:91)
>   at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:86)
>   at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.moveNext(EntriesSet.java:187)
> at 
> 

[jira] [Updated] (GEODE-9463) Default serialization filter rejects SerializableRegionRedundancyStatusImpl

2021-07-26 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-9463:
-
Attachment: logs-1.tgz

> Default serialization filter rejects SerializableRegionRedundancyStatusImpl
> ---
>
> Key: GEODE-9463
> URL: https://issues.apache.org/jira/browse/GEODE-9463
> Project: Geode
> Issue Type: Bug
> Components: serialization
> Reporter: Aaron Lindsey
> Priority: Major
> Attachments: logs-1.tgz, logs-2.tgz
>
>
> When validate-serializable-objects=true, there are exceptions in the logs 
> related to serializing the class SerializableRegionRedundancyStatusImpl. This 
> is an internal class which should be allowed by the default serializable 
> object filter.
> We saw this issue happen on Kubernetes while invoking rebalance and restore 
> redundancy operations on the cluster.
> {code:java}
> [fatal 2021/07/22 00:14:31.392 GMT system-test-gemfire-locator-1 tid=0x51] Serialization filter is rejecting class org.apache.geode.internal.cache.control.SerializableRegionRedundancyStatusImpl
> java.lang.Exception:
>   at org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
>   at com.sun.proxy.$Proxy23.checkInput(Unknown Source)
>   at java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
>   at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
>   at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at java.base/java.util.HashMap.readObject(HashMap.java:1460)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
>   at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
>   at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
>   at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
>   at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
>   at org.apache.geode.internal.InternalDataSerializer.readSerializable(InternalDataSerializer.java:2689)
>   at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2633)
>   at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
>   at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:102)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2049)
>   at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2041)
>   at org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:138)
>   at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1277)
> at 
> org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:91)at 
> org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:86)at 
> org.apache.geode.internal.cache.EntriesSet$EntriesIterator.moveNext(EntriesSet.java:187)
> at 
> 

[jira] [Created] (GEODE-9463) Default serialization filter rejects SerializableRegionRedundancyStatusImpl

2021-07-26 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-9463:


Summary: Default serialization filter rejects SerializableRegionRedundancyStatusImpl
Key: GEODE-9463
URL: https://issues.apache.org/jira/browse/GEODE-9463
Project: Geode
Issue Type: Bug
Components: serialization
Reporter: Aaron Lindsey


When validate-serializable-objects=true, there are exceptions in the logs
related to deserializing the class SerializableRegionRedundancyStatusImpl. This
is an internal class which should be allowed by the default serializable object
filter.
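
To illustrate the kind of rejection described above, here is a minimal sketch that uses only the JDK's ObjectInputFilter API (the same mechanism that appears as ObjectInputStream.filterCheck in the stack trace). It is not Geode's filter wiring: the class names SerializationFilterSketch and RedundancyStatus and the filter patterns are hypothetical, chosen only to show how a class that is missing from an allow list gets rejected and how adding it to the pattern lets it through.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationFilterSketch {

  /** Hypothetical stand-in for an internal status class; not a Geode type. */
  static final class RedundancyStatus implements Serializable {
    private static final long serialVersionUID = 1L;
    final int actualRedundancy;

    RedundancyStatus(int actualRedundancy) {
      this.actualRedundancy = actualRedundancy;
    }
  }

  /** Serializes a value with plain Java serialization. */
  static byte[] serialize(Serializable value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Returns true if the data deserializes under a filter built from the given JDK pattern. */
  static boolean accepts(String pattern, byte[] data) {
    ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(pattern);
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      in.setObjectInputFilter(filter);
      in.readObject();
      return true;
    } catch (InvalidClassException rejected) {
      // ObjectInputStream reports a filter rejection as InvalidClassException.
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = serialize(new RedundancyStatus(1));
    String className = RedundancyStatus.class.getName();
    // Allow list without the class: everything outside java.** is rejected.
    System.out.println(accepts("java.**;!*", data));
    // Same allow list with the class added explicitly.
    System.out.println(accepts(className + ";java.**;!*", data));
  }
}
```

The sketch suggests why the fix belongs in the default filter itself: an internal class that is legitimately sent between members has to appear in the allow list, otherwise every deserialization of it is rejected.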

We saw this issue happen on Kubernetes while invoking rebalance and restore 
redundancy operations on the cluster.
{code:java}
[fatal 2021/07/22 00:14:31.392 GMT system-test-gemfire-locator-1  tid=0x51] Serialization filter is rejecting class org.apache.geode.internal.cache.control.SerializableRegionRedundancyStatusImpl
java.lang.Exception:
    at org.apache.geode.internal.ObjectInputStreamFilterWrapper.lambda$createSerializationFilter$0(ObjectInputStreamFilterWrapper.java:234)
    at com.sun.proxy.$Proxy23.checkInput(Unknown Source)
    at java.base/java.io.ObjectInputStream.filterCheck(ObjectInputStream.java:1336)
    at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
    at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
    at java.base/java.util.HashMap.readObject(HashMap.java:1460)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
    at org.apache.geode.internal.InternalDataSerializer.readSerializable(InternalDataSerializer.java:2689)
    at org.apache.geode.internal.InternalDataSerializer.basicReadObject(InternalDataSerializer.java:2633)
    at org.apache.geode.DataSerializer.readObject(DataSerializer.java:2864)
    at org.apache.geode.internal.util.BlobHelper.deserializeBlob(BlobHelper.java:102)
    at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2049)
    at org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2041)
    at org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:138)
    at org.apache.geode.internal.cache.LocalRegion.getDeserialized(LocalRegion.java:1277)
    at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:91)
    at org.apache.geode.internal.cache.NonTXEntry.getValue(NonTXEntry.java:86)
    at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.moveNext(EntriesSet.java:187)
    at org.apache.geode.internal.cache.EntriesSet$EntriesIterator.<init>(EntriesSet.java:119)
    at org.apache.geode.internal.cache.EntriesSet.iterator(EntriesSet.java:84)
    at org.apache.geode.management.internal.operation.RegionOperationStateStore.list(RegionOperationStateStore.java:102)
    at org.apache.geode.management.internal.operation.OperationHistoryManager.expireHistory(OperationHistoryManager.java:74)
    at

[jira] [Commented] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-23 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386578#comment-17386578
 ] 

Aaron Lindsey commented on GEODE-8200:
--

The only difference is that now it's happening for restore redundancy instead 
of rebalance, but it's pretty much the same scenario where we see the issue. 
(We switched to using restore redundancy instead of rebalance after that 
operation was added to Geode.)

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
> Issue Type: Bug
> Components: management
> Reporter: Aaron Lindsey
> Assignee: Jianxia Chen
> Priority: Major
> Labels: GeodeOperationAPI
> Fix For: 1.13.1, 1.14.0
>
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}
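
For readers following the scenario quoted above, the "wait until the rebalance operation is completed" step can be sketched as a polling loop. This is a client-side illustration only and does not address the bug itself. The class OperationPoller, the statusSource supplier (standing in for a GET on the operation's "self" link), and the timeout parameter are all assumptions; the only value taken from the issue is the "IN_PROGRESS" status code, and any other status is treated as terminal.

```java
import java.time.Duration;
import java.util.function.Supplier;

final class OperationPoller {

  /**
   * Polls statusSource until it reports something other than IN_PROGRESS or the
   * timeout elapses, and returns the last status seen. statusSource is a
   * hypothetical stand-in for a call to the management REST API.
   */
  static String awaitCompletion(Supplier<String> statusSource, Duration timeout,
      Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    String status = statusSource.get();
    while ("IN_PROGRESS".equals(status) && System.nanoTime() - deadline < 0) {
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return status;
      }
      status = statusSource.get();
    }
    return status;
  }

  public static void main(String[] args) {
    // A source that never completes, like the stuck operation in the report.
    String last = awaitCompletion(() -> "IN_PROGRESS", Duration.ofMillis(200), Duration.ofMillis(50));
    System.out.println("last status: " + last);
  }
}
```

With a bound like this, a caller that still sees "IN_PROGRESS" after the timeout could raise an alert instead of blocking the rolling restart indefinitely.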



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2021-07-23 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386570#comment-17386570
 ] 

Aaron Lindsey commented on GEODE-8200:
--

Re-opened because this issue has started reproducing again on develop.






[jira] [Assigned] (GEODE-9407) RegionDestroyedException while executing GetMemberInformationFunction

2021-06-30 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9407:


Assignee: Aaron Lindsey

> RegionDestroyedException while executing GetMemberInformationFunction
> -
>
> Key: GEODE-9407
> URL: https://issues.apache.org/jira/browse/GEODE-9407
> Project: Geode
> Issue Type: Bug
> Components: gfsh, management
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Minor
>
> GetMemberInformationFunction is used by the gfsh "describe member" command
> and the management REST API "/members" endpoint, and is also used internally
> within Geode. If this function is invoked while a region is concurrently
> being destroyed, it may throw RegionDestroyedException while trying to gather
> information about the destroyed region's subregions.
>  
> This bug manifests as a nasty error message in the logs of the member where
> the function was being executed (shown below). This confuses Geode users,
> developers, and operators because it looks like a problem with the system
> when it is actually expected behavior. GetMemberInformationFunction
> should probably catch RegionDestroyedException and remove the destroyed
> region from the set of region names in ManagementUtils.getAllRegionNames.
>  
> {code:java}
> [error 2021/06/29 23:01:38.640 GMT system-test-gemfire-server-0  Execution Processor3> tid=0x94] Unable to gather runtime information on this 
> member.
> org.apache.geode.cache.RegionDestroyedException: Partitioned Region @79f60edb 
> [path='/region'; dataPolicy=PARTITION; prId=37; isDestroyed=true; 
> isClosed=false; retryTimeout=360; serialNumber=4309; partition 
> attributes=PartitionAttributes@1299510666[redundantCopies=2;localMaxMemory=594;totalMaxMemory=2147483647;totalNumBuckets=113;partitionResolver=null;colocatedWith=null;recoveryDelay=-1;startupRecoveryDelay=0;FixedPartitionAttributes=null;partitionListeners=null];
>  on VM system-test-gemfire-server-0(system-test-gemfire-server-0:1):41000]
>  at 
> org.apache.geode.internal.cache.LocalRegion.checkRegionDestroyed(LocalRegion.java:7342)
>  at 
> org.apache.geode.internal.cache.LocalRegion.checkReadiness(LocalRegion.java:2757)
>  at 
> org.apache.geode.internal.cache.LocalRegion.subregions(LocalRegion.java:1908)
>  at 
> org.apache.geode.management.internal.util.ManagementUtils.getAllRegionNames(ManagementUtils.java:167)
>  at 
> org.apache.geode.management.internal.functions.GetMemberInformationFunction.getMemberInformation(GetMemberInformationFunction.java:131)
>  at 
> org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:52)
>  at 
> org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:35)
>  at 
> org.apache.geode.management.internal.functions.CacheRealizationFunction.executeGet(CacheRealizationFunction.java:136)
>  at 
> org.apache.geode.management.internal.functions.CacheRealizationFunction.execute(CacheRealizationFunction.java:92)
>  at 
> org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:201)
>  at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
>  at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444)
>  at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.doFunctionExecutionThread(ClusterOperationExecutors.java:379)
>  at 
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
>  at java.base/java.lang.Thread.run(Thread.java:829){code}





[jira] [Updated] (GEODE-9407) RegionDestroyedException while executing GetMemberInformationFunction

2021-06-29 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-9407:
-
Priority: Minor  (was: Major)

> RegionDestroyedException while executing GetMemberInformationFunction
> -
>
> Key: GEODE-9407
> URL: https://issues.apache.org/jira/browse/GEODE-9407
> Project: Geode
> Issue Type: Bug
> Components: gfsh, management
> Reporter: Aaron Lindsey
> Priority: Minor
>





[jira] [Created] (GEODE-9407) RegionDestroyedException while executing GetMemberInformationFunction

2021-06-29 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-9407:


Summary: RegionDestroyedException while executing GetMemberInformationFunction
Key: GEODE-9407
URL: https://issues.apache.org/jira/browse/GEODE-9407
Project: Geode
Issue Type: Bug
Components: gfsh, management
Reporter: Aaron Lindsey


GetMemberInformationFunction is used by the gfsh "describe member" command and
the management REST API "/members" endpoint, and is also used internally within
Geode. If this function is invoked while a region is concurrently being
destroyed, it may throw RegionDestroyedException while trying to gather
information about the destroyed region's subregions.

This bug manifests as a nasty error message in the logs of the member where the
function was being executed (shown below). This confuses Geode users,
developers, and operators because it looks like a problem with the system when
it is actually expected behavior. GetMemberInformationFunction should probably
catch RegionDestroyedException and remove the destroyed region from the set of
region names in ManagementUtils.getAllRegionNames.

 
{code:java}
[error 2021/06/29 23:01:38.640 GMT system-test-gemfire-server-0  tid=0x94] Unable to gather runtime information on this 
member.
org.apache.geode.cache.RegionDestroyedException: Partitioned Region @79f60edb 
[path='/region'; dataPolicy=PARTITION; prId=37; isDestroyed=true; 
isClosed=false; retryTimeout=360; serialNumber=4309; partition 
attributes=PartitionAttributes@1299510666[redundantCopies=2;localMaxMemory=594;totalMaxMemory=2147483647;totalNumBuckets=113;partitionResolver=null;colocatedWith=null;recoveryDelay=-1;startupRecoveryDelay=0;FixedPartitionAttributes=null;partitionListeners=null];
 on VM system-test-gemfire-server-0(system-test-gemfire-server-0:1):41000]
 at 
org.apache.geode.internal.cache.LocalRegion.checkRegionDestroyed(LocalRegion.java:7342)
 at 
org.apache.geode.internal.cache.LocalRegion.checkReadiness(LocalRegion.java:2757)
 at 
org.apache.geode.internal.cache.LocalRegion.subregions(LocalRegion.java:1908)
 at 
org.apache.geode.management.internal.util.ManagementUtils.getAllRegionNames(ManagementUtils.java:167)
 at 
org.apache.geode.management.internal.functions.GetMemberInformationFunction.getMemberInformation(GetMemberInformationFunction.java:131)
 at 
org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:52)
 at 
org.apache.geode.management.internal.configuration.realizers.MemberRealizer.get(MemberRealizer.java:35)
 at 
org.apache.geode.management.internal.functions.CacheRealizationFunction.executeGet(CacheRealizationFunction.java:136)
 at 
org.apache.geode.management.internal.functions.CacheRealizationFunction.execute(CacheRealizationFunction.java:92)
 at 
org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:201)
 at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
 at 
org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 at 
org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:444)
 at 
org.apache.geode.distributed.internal.ClusterOperationExecutors.doFunctionExecutionThread(ClusterOperationExecutors.java:379)
 at 
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:120)
 at java.base/java.lang.Thread.run(Thread.java:829){code}
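
The handling suggested in the description can be sketched as follows. This is a self-contained illustration, not the change that went into Geode: Region and RegionDestroyedException here are hypothetical minimal stand-ins for the Geode types, and allRegionNames only mirrors the role of ManagementUtils.getAllRegionNames. The idea is that a region destroyed mid-traversal is skipped rather than allowed to fail the whole call.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RegionNameSketch {

  /** Hypothetical stand-in for org.apache.geode.cache.RegionDestroyedException. */
  static final class RegionDestroyedException extends RuntimeException {
    RegionDestroyedException(String path) {
      super("Region destroyed: " + path);
    }
  }

  /** Hypothetical minimal region: a path, child regions, and a destroyed flag. */
  static final class Region {
    final String path;
    final List<Region> children = new ArrayList<>();
    volatile boolean destroyed;

    Region(String path) {
      this.path = path;
    }

    Region add(Region child) {
      children.add(child);
      return this;
    }

    /** Like the real subregions call in the trace, fails once the region is destroyed. */
    List<Region> subregions() {
      if (destroyed) {
        throw new RegionDestroyedException(path);
      }
      return children;
    }
  }

  /** Collects the paths of all live regions, skipping any that were destroyed. */
  static Set<String> allRegionNames(List<Region> roots) {
    Set<String> names = new LinkedHashSet<>();
    for (Region root : roots) {
      collect(root, names);
    }
    return names;
  }

  private static void collect(Region region, Set<String> names) {
    List<Region> children;
    try {
      children = region.subregions();
    } catch (RegionDestroyedException e) {
      // Expected when a destroy races with the traversal: leave the region out.
      return;
    }
    names.add(region.path);
    for (Region child : children) {
      collect(child, names);
    }
  }

  public static void main(String[] args) {
    Region live = new Region("/region").add(new Region("/region/sub"));
    Region gone = new Region("/destroyed");
    gone.destroyed = true;
    System.out.println(allRegionNames(List.of(live, gone)));
  }
}
```

Because the destroyed region is never added to the set, the caller sees only live regions and no error is logged for what is expected behavior.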





[jira] [Resolved] (GEODE-9198) Keystore/truststore file watcher does not follow symbolic links

2021-05-03 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-9198.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> Keystore/truststore file watcher does not follow symbolic links
> ---
>
> Key: GEODE-9198
> URL: https://issues.apache.org/jira/browse/GEODE-9198
> Project: Geode
> Issue Type: Bug
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> GEODE-9017 introduced a file watching key/trust manager to automatically 
> reload the key and trust store upon change. However, the file watcher was 
> configured to not follow symbolic links. Some environments such as Kubernetes 
> use symbolic links to mount files inside a container file system. In cases 
> like this (where the key and trust store are represented using symbolic 
> links) the file watcher should follow these links and reload the key and 
> trust store when the underlying target file changes.





[jira] [Assigned] (GEODE-9198) Keystore/truststore file watcher does not follow symbolic links

2021-04-28 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9198:


Assignee: Aaron Lindsey

> Keystore/truststore file watcher does not follow symbolic links
> ---
>
> Key: GEODE-9198
> URL: https://issues.apache.org/jira/browse/GEODE-9198
> Project: Geode
> Issue Type: Bug
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: pull-request-available
>





[jira] [Created] (GEODE-9198) Keystore/truststore file watcher does not follow symbolic links

2021-04-27 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-9198:


Summary: Keystore/truststore file watcher does not follow symbolic links
Key: GEODE-9198
URL: https://issues.apache.org/jira/browse/GEODE-9198
Project: Geode
Issue Type: Bug
Reporter: Aaron Lindsey


GEODE-9017 introduced a file watching key/trust manager to automatically reload 
the key and trust store upon change. However, the file watcher was configured 
to not follow symbolic links. Some environments such as Kubernetes use symbolic 
links to mount files inside a container file system. In cases like this (where 
the key and trust store are represented using symbolic links) the file watcher 
should follow these links and reload the key and trust store when the 
underlying target file changes.
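
As a rough illustration of what "following the link" means for change detection, the sketch below resolves the watched path to its real target on every check and reports a change when either the target path or the target's modification time differs. It is an assumption-laden example, not the watcher used in Geode: the class SymlinkAwareChangeDetector and its polling-style hasChanged method are hypothetical, and only standard java.nio.file calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;

final class SymlinkAwareChangeDetector {
  private final Path watched;
  private Path lastTarget;
  private FileTime lastModified;

  SymlinkAwareChangeDetector(Path watched) {
    this.watched = watched;
    snapshot();
  }

  /** Returns true if the resolved target or its modification time changed since the last check. */
  synchronized boolean hasChanged() {
    Path previousTarget = lastTarget;
    FileTime previousModified = lastModified;
    snapshot();
    return !lastTarget.equals(previousTarget) || !lastModified.equals(previousModified);
  }

  private void snapshot() {
    try {
      // toRealPath() resolves symbolic links, so a re-pointed link yields a new target.
      lastTarget = watched.toRealPath();
      lastModified = Files.getLastModifiedTime(lastTarget);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Usage: pass the path of a key or trust store, which may be a symbolic link.
    SymlinkAwareChangeDetector detector = new SymlinkAwareChangeDetector(Paths.get(args[0]));
    System.out.println("changed since start: " + detector.hasChanged());
  }
}
```

A check that inspects only the link itself (for example with LinkOption.NOFOLLOW_LINKS) would not notice a rewritten target file, which is the situation the issue describes for files mounted through symbolic links.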





[jira] [Resolved] (GEODE-9017) Reload key store and trust store upon change

2021-03-25 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-9017.
--
Fix Version/s: 1.15.0
   Resolution: Fixed

> Reload key store and trust store upon change
> 
>
> Key: GEODE-9017
> URL: https://issues.apache.org/jira/browse/GEODE-9017
> Project: Geode
> Issue Type: New Feature
> Reporter: Aaron Lindsey
> Assignee: Aaron Lindsey
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> [Link to 
> RFC|https://cwiki.apache.org/confluence/display/GEODE/Make+key+and+trust+stores+reload+automatically+upon+change]
> (The below text is copied from the RFC document.)
> h3. Problem
> Currently, in order to rotate certificates, each member of the cluster needs
> to be restarted to load the new certs and trust store. It would be preferable
> if certificates could be rotated without having to restart members.
> h3. Solution
> When starting up a cluster member, we currently read the TLS configuration,
> which, when TLS is enabled, defines key and trust store files. If those files
> are defined, they are read and the information inside them is loaded into the
> key and trust manager objects, which are in turn loaded into the SSLContext.
> This solution will introduce wrapper objects for the key and trust managers,
> plus file/directory watcher(s) that can detect changes to the key and trust
> store files. When the key or trust store files change, the watcher triggers a
> reload into new key and trust managers; through the wrapper objects, these
> new managers are injected into the SSLContext so that the context can start
> using them in process.





[jira] [Created] (GEODE-9017) Reload key store and trust store upon change

2021-03-09 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-9017:


Summary: Reload key store and trust store upon change
Key: GEODE-9017
URL: https://issues.apache.org/jira/browse/GEODE-9017
Project: Geode
Issue Type: New Feature
Reporter: Aaron Lindsey


[Link to 
RFC|https://cwiki.apache.org/confluence/display/GEODE/Make+key+and+trust+stores+reload+automatically+upon+change]

(The below text is copied from the RFC document.)
h3. Problem

Currently, in order to rotate certificates, each member of the cluster needs to
be restarted to load the new certs and trust store. It would be preferable if
certificates could be rotated without having to restart members.
h3. Solution

When starting up a cluster member, we currently read the TLS configuration,
which, when TLS is enabled, defines key and trust store files. If those files
are defined, they are read and the information inside them is loaded into the
key and trust manager objects, which are in turn loaded into the SSLContext.

This solution will introduce wrapper objects for the key and trust managers,
plus file/directory watcher(s) that can detect changes to the key and trust
store files. When the key or trust store files change, the watcher triggers a
reload into new key and trust managers; through the wrapper objects, these new
managers are injected into the SSLContext so that the context can start using
them in process.
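
The wrapper idea in the Solution section can be sketched roughly as below. This is an illustrative assumption, not the class from the RFC or from Geode: ReloadableTrustManager and its reload method are hypothetical names, and the plain X509TrustManager interface is used for brevity. The wrapper is what the SSLContext holds, and it forwards every call to whichever delegate was loaded most recently.

```java
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.Objects;
import javax.net.ssl.X509TrustManager;

final class ReloadableTrustManager implements X509TrustManager {
  // volatile so that handshakes on other threads see a swapped delegate.
  private volatile X509TrustManager delegate;

  ReloadableTrustManager(X509TrustManager initial) {
    this.delegate = Objects.requireNonNull(initial);
  }

  /** Called (for example by a file watcher callback) once a new trust store has been loaded. */
  void reload(X509TrustManager fresh) {
    this.delegate = Objects.requireNonNull(fresh);
  }

  @Override
  public void checkClientTrusted(X509Certificate[] chain, String authType)
      throws CertificateException {
    delegate.checkClientTrusted(chain, authType);
  }

  @Override
  public void checkServerTrusted(X509Certificate[] chain, String authType)
      throws CertificateException {
    delegate.checkServerTrusted(chain, authType);
  }

  @Override
  public X509Certificate[] getAcceptedIssuers() {
    return delegate.getAcceptedIssuers();
  }
}
```

Under these assumptions, the wrapper would be passed to SSLContext.init once, and each later change to the trust store file would build a new delegate (for instance through TrustManagerFactory) and hand it to reload, so the SSLContext itself never has to be recreated. A key manager wrapper could follow the same pattern.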





[jira] [Assigned] (GEODE-9017) Reload key store and trust store upon change

2021-03-09 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-9017:


Assignee: Aaron Lindsey

> Reload key store and trust store upon change
> 
>
> Key: GEODE-9017
> URL: https://issues.apache.org/jira/browse/GEODE-9017
> Project: Geode
>  Issue Type: New Feature
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> [Link to RFC|https://cwiki.apache.org/confluence/display/GEODE/Make+key+and+trust+stores+reload+automatically+upon+change]
> (The below text is copied from the RFC document.)
> h3. Problem
> Currently, rotating certificates requires restarting each member of the 
> cluster so that it loads the new key and trust stores. It would be preferable 
> if certificates could be rotated without restarting members.
> h3. Solution
> When a cluster member starts up, it reads the TLS configuration, which (when 
> TLS is enabled) names the key store and trust store files. If those files are 
> defined, they are read and their contents are loaded into the key and trust 
> manager objects that are installed in the SSLContext.
> This solution will introduce wrapper objects for the key and trust managers, 
> along with file/directory watcher(s) that detect changes to the key and trust 
> store files. When either file changes, the watcher triggers a reload into new 
> key and trust managers, and the wrapper objects inject these new managers into 
> the SSLContext so that the context starts using them in process.





[jira] [Created] (GEODE-8620) Actual redundancy of -1 in restore redundancy result

2020-10-15 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-8620:


 Summary: Actual redundancy of -1 in restore redundancy result
 Key: GEODE-8620
 URL: https://issues.apache.org/jira/browse/GEODE-8620
 Project: Geode
  Issue Type: Bug
  Components: gfsh, management
Affects Versions: 1.13.0
Reporter: Aaron Lindsey


Steps to reproduce:
 # Create a geode cluster with 1 locator and 2 servers.
 # Create a region of type PARTITION_REDUNDANT.
 # Put an entry into the region.
 # Trigger a restore redundancy operation via the management REST API or gfsh.
 # The result from the restore redundancy operation states that the actual 
redundancy for the region is -1. However, the expected redundancy at this point 
is 1 because there should be enough cache servers in the cluster to hold the 
redundant copy.
 # Stop one of the servers.
 # Trigger another restore redundancy operation via the management REST API or 
gfsh.
 # The result from the second restore redundancy operation again states that 
the actual redundancy for the region is -1. However, the region should be 
counted as having zero redundant copies at this point because there is only one 
cache server.
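
The expectation in the steps above can be written down as a small calculation. 
This is only a sketch of what the reporter expects to see, not how Geode 
computes actual redundancy internally: the class and method names are 
hypothetical, and it assumes PARTITION_REDUNDANT configures one redundant copy 
and that every running server can host a copy.

```java
/** Hypothetical helper expressing the redundancy the reproduction steps expect. */
final class ExpectedRedundancy {
  /**
   * A redundant copy needs a server other than the one holding the primary, so the
   * number of copies that can actually exist is capped at (hostingServers - 1).
   */
  static int expectedCopies(int configuredRedundantCopies, int hostingServers) {
    if (configuredRedundantCopies < 0 || hostingServers < 1) {
      throw new IllegalArgumentException("need at least one server and a non-negative copy count");
    }
    return Math.min(configuredRedundantCopies, hostingServers - 1);
  }
}
```

With one configured copy, two servers give an expected value of 1 and a single 
server gives 0; under these assumptions -1 is never expected once the region 
holds an entry.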

I encountered this issue while using the management REST API, although the same 
issue happens in the gfsh command. I assume fixing the gfsh command would also 
fix the management REST API. If not, I can break this out into two separate 
JIRAs.





[jira] [Closed] (GEODE-8241) Locator does not observe locator-wait-time

2020-06-18 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey closed GEODE-8241.


> Locator does not observe locator-wait-time
> --
>
> Key: GEODE-8241
> URL: https://issues.apache.org/jira/browse/GEODE-8241
> Project: Geode
>  Issue Type: Bug
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.14.0
>
>
> In the case where a locator starts up and is unable to connect to any other 
> locators, it may decide to become the membership coordinator even if 
> locator-wait-time has not elapsed.
> The following conditional from GMSJoinLeave.java causes the issue. There 
> should be an additional check for locator-wait-time before becoming 
> coordinator.
> {code:java}
> if (state.joinedMembersContacted <= 0 &&
>     (tries >= minimumRetriesBeforeBecomingCoordinator ||
>         state.locatorsContacted >= locators.size())) {
>   synchronized (viewInstallationLock) {
>     becomeCoordinator();
>   }
>   return true;
> }
> {code}





[jira] [Resolved] (GEODE-8241) Locator does not observe locator-wait-time

2020-06-18 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-8241.
--
Fix Version/s: 1.14.0
   Resolution: Fixed

> Locator does not observe locator-wait-time
> --
>
> Key: GEODE-8241
> URL: https://issues.apache.org/jira/browse/GEODE-8241
> Project: Geode
>  Issue Type: Bug
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.14.0
>
>
> In the case where a locator starts up and is unable to connect to any other 
> locators, it may decide to become the membership coordinator even if 
> locator-wait-time has not elapsed.
> The following conditional from GMSJoinLeave.java causes the issue. There 
> should be an additional check for locator-wait-time before becoming 
> coordinator.
> {code:java}
> if (state.joinedMembersContacted <= 0 &&
>     (tries >= minimumRetriesBeforeBecomingCoordinator ||
>         state.locatorsContacted >= locators.size())) {
>   synchronized (viewInstallationLock) {
>     becomeCoordinator();
>   }
>   return true;
> }
> {code}





[jira] [Updated] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2020-06-17 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8200:
-
Attachment: GEODE-8200-exportedLogs.zip

> Rebalance operations stuck in "IN_PROGRESS" state forever
> -
>
> Key: GEODE-8200
> URL: https://issues.apache.org/jira/browse/GEODE-8200
> Project: Geode
>  Issue Type: Bug
>  Components: management
>Reporter: Aaron Lindsey
>Assignee: Jianxia Chen
>Priority: Major
>  Labels: GeodeOperationAPI
> Attachments: GEODE-8200-exportedLogs.zip
>
>
> We use the management REST API to call rebalance immediately before stopping 
> a server to limit the possibility of data loss. In a cluster with 3 locators, 
> 3 servers, and no regions, we noticed that sometimes the rebalance operation 
> never ends if one of the locators is restarting concurrently with the 
> rebalance operation.
> More specifically, the scenario where we see this issue crop up is during an 
> automated "rolling restart" operation in a Kubernetes environment which 
> proceeds as follows:
> * At most one locator and one server are restarting at any point in time
> * Each locator/server waits until the previous locator/server is fully online 
> before restarting
> * Immediately before stopping a server, a rebalance operation is performed 
> and the server is not stopped until the rebalance operation is completed
> The impact of this issue is that the "rolling restart" operation will never 
> complete, because it cannot proceed with stopping a server until the 
> rebalance operation is completed. A human is then required to intervene and 
> manually trigger a rebalance and stop the server. This type of "rolling 
> restart" operation is triggered fairly often in Kubernetes — any time part of 
> the configuration of the locators or servers changes. 
> The following JSON is a sample response from the management REST API that 
> shows the rebalance operation stuck in "IN_PROGRESS".
> {code}
> {
>   "statusCode": "IN_PROGRESS",
>   "links": {
>     "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
>     "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>   },
>   "operationStart": "2020-05-27T22:38:30.619Z",
>   "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
>   "operation": {
>     "simulate": false
>   }
> }
> {code}





[jira] [Assigned] (GEODE-8241) Locator does not observe locator-wait-time

2020-06-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-8241:


Assignee: Aaron Lindsey

> Locator does not observe locator-wait-time
> --
>
> Key: GEODE-8241
> URL: https://issues.apache.org/jira/browse/GEODE-8241
> Project: Geode
>  Issue Type: Bug
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> In the case where a locator starts up and is unable to connect to any other 
> locators, it may decide to become the membership coordinator even if 
> locator-wait-time has not elapsed.
> The following conditional from GMSJoinLeave.java causes the issue. There 
> should be an additional check for locator-wait-time before becoming 
> coordinator.
> {code:java}
> if (state.joinedMembersContacted <= 0 &&
>     (tries >= minimumRetriesBeforeBecomingCoordinator ||
>         state.locatorsContacted >= locators.size())) {
>   synchronized (viewInstallationLock) {
>     becomeCoordinator();
>   }
>   return true;
> }
> {code}





[jira] [Created] (GEODE-8241) Locator does not observe locator-wait-time

2020-06-10 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-8241:


 Summary: Locator does not observe locator-wait-time
 Key: GEODE-8241
 URL: https://issues.apache.org/jira/browse/GEODE-8241
 Project: Geode
  Issue Type: Bug
Reporter: Aaron Lindsey


In the case where a locator starts up and is unable to connect to any other 
locators, it may decide to become the membership coordinator even if 
locator-wait-time has not elapsed.

The following conditional from GMSJoinLeave.java causes the issue. There should 
be an additional check for locator-wait-time before becoming coordinator.

{code:java}
if (state.joinedMembersContacted <= 0 &&
    (tries >= minimumRetriesBeforeBecomingCoordinator ||
        state.locatorsContacted >= locators.size())) {
  synchronized (viewInstallationLock) {
    becomeCoordinator();
  }
  return true;
}
{code}
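
One possible shape for the additional check is sketched below. It is not the 
actual fix: the class, method, and parameter names are hypothetical, the 
decision is pulled out into a standalone function so it can be read in 
isolation, and it assumes the elapsed time and the locator-wait-time setting 
are both available in milliseconds at this point.

```java
/** Hypothetical, standalone version of the decision made by the conditional above. */
final class CoordinatorDecision {
  static boolean shouldBecomeCoordinator(
      int joinedMembersContacted,
      int tries,
      int minimumRetriesBeforeBecomingCoordinator,
      int locatorsContacted,
      int locatorCount,
      long elapsedMillis,
      long locatorWaitTimeMillis) {
    // Same two conditions as the original conditional.
    boolean noJoinedMembers = joinedMembersContacted <= 0;
    boolean retriesExhausted =
        tries >= minimumRetriesBeforeBecomingCoordinator || locatorsContacted >= locatorCount;
    // Proposed extra condition: never take over before locator-wait-time has elapsed.
    boolean waitElapsed = elapsedMillis >= locatorWaitTimeMillis;
    return noJoinedMembers && retriesExhausted && waitElapsed;
  }
}
```

With this guard, a locator that cannot reach any other locator keeps retrying 
until the wait time has passed instead of becoming coordinator immediately.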






[jira] [Created] (GEODE-8200) Rebalance operations stuck in "IN_PROGRESS" state forever

2020-05-28 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-8200:


 Summary: Rebalance operations stuck in "IN_PROGRESS" state forever
 Key: GEODE-8200
 URL: https://issues.apache.org/jira/browse/GEODE-8200
 Project: Geode
  Issue Type: Bug
  Components: management
Reporter: Aaron Lindsey


We use the management REST API to call rebalance immediately before stopping a 
server to limit the possibility of data loss. In a cluster with 3 locators, 3 
servers, and no regions, we noticed that sometimes the rebalance operation 
never ends if one of the locators is restarting concurrently with the rebalance 
operation.

More specifically, the scenario where we see this issue crop up is during an 
automated "rolling restart" operation in a Kubernetes environment which 
proceeds as follows:
* At most one locator and one server are restarting at any point in time
* Each locator/server waits until the previous locator/server is fully online 
before restarting
* Immediately before stopping a server, a rebalance operation is performed and 
the server is not stopped until the rebalance operation is completed

The impact of this issue is that the "rolling restart" operation will never 
complete, because it cannot proceed with stopping a server until the rebalance 
operation is completed. A human is then required to intervene and manually 
trigger a rebalance and stop the server. This type of "rolling restart" 
operation is triggered fairly often in Kubernetes — any time part of the 
configuration of the locators or servers changes. 

The following JSON is a sample response from the management REST API that shows 
the rebalance operation stuck in "IN_PROGRESS".

{code}
{
  "statusCode": "IN_PROGRESS",
  "links": {
    "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/a47f23c8-02b3-443c-a367-636fd6921ea7",
    "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
  },
  "operationStart": "2020-05-27T22:38:30.619Z",
  "operationId": "a47f23c8-02b3-443c-a367-636fd6921ea7",
  "operation": {
    "simulate": false
  }
}
{code}
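
Until the underlying bug is fixed, a caller such as the rolling-restart 
automation could bound its wait instead of polling forever. The following is a 
minimal sketch under stated assumptions, not part of the management API: 
OperationPoller and its Outcome values are hypothetical, the Supplier stands in 
for whatever HTTP call fetches "statusCode" from the operation's "self" link, 
and any value other than "IN_PROGRESS" is treated as finished.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical client-side helper that polls an operation's status with a deadline. */
final class OperationPoller {
  enum Outcome { FINISHED, TIMED_OUT, INTERRUPTED }

  static Outcome awaitCompletion(
      Supplier<String> statusCode, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      // Assumption: only "IN_PROGRESS" means the operation is still running.
      if (!"IN_PROGRESS".equals(statusCode.get())) {
        return Outcome.FINISHED;
      }
      if (System.nanoTime() - deadline >= 0) {
        return Outcome.TIMED_OUT;
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return Outcome.INTERRUPTED;
      }
    }
  }
}
```

On TIMED_OUT the automation can raise an alert or start a new rebalance 
rather than blocking the restart indefinitely.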





[jira] [Closed] (GEODE-8075) Geek squad tech support

2020-05-06 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey closed GEODE-8075.


> Geek squad tech support
> ---
>
> Key: GEODE-8075
> URL: https://issues.apache.org/jira/browse/GEODE-8075
> Project: Geode
>  Issue Type: Test
>Reporter: Jacks martin
>Priority: Major
>
> [Geek Squad Tech Support|https://igeektechs.org/] gives you on-demand 
> solutions, with highly accurate results. Best Buy offers repair services for 
> most major home appliances including refrigerators, freezers, washers, 
> dryers, dishwashers, stoves, and more.





[jira] [Resolved] (GEODE-8075) Geek squad tech support

2020-05-06 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-8075.
--
Resolution: Won't Do

This looks like spam to me. If I misunderstood, please re-open the ticket.

> Geek squad tech support
> ---
>
> Key: GEODE-8075
> URL: https://issues.apache.org/jira/browse/GEODE-8075
> Project: Geode
>  Issue Type: Test
>Reporter: Jacks martin
>Priority: Major
>
> [Geek Squad Tech Support|https://igeektechs.org/] gives you on-demand 
> solutions, with highly accurate results. Best Buy offers repair services for 
> most major home appliances including refrigerators, freezers, washers, 
> dryers, dishwashers, stoves, and more.





[jira] [Updated] (GEODE-8078) Exceptions in locator logs when hitting members REST endpoint

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8078:
-
Description: 
I'm seeing the following exceptions in locator logs when I try to hit the REST 
endpoint /management/v1/members/\{id} before the member has finished starting 
up. The reason I need to do this is because I have a program that is polling 
that endpoint to wait until the member is online. Ideally these errors would 
not show up in logs, but instead be reflected in the status code of the REST 
response.

{quote}[error 2020/04/06 22:05:59.086 UTC  tid=0x31] class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')
java.lang.ClassCastException: class org.apache.geode.cache.CacheClosedException 
cannot be cast to class org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')
at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.list(LocatorClusterManagementService.java:417)
at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.get(LocatorClusterManagementService.java:434)
at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController.getMember(MemberManagementController.java:50)
at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$FastClassBySpringCGLIB$$3634e452.invoke()
at 
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:769)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
at 
org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$EnhancerBySpringCGLIB$$2893b195.getMember()
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
at 
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
at 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)
at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)
at 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
at 
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)
at 
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
at 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
at 
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at 
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:760)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1617)
at 
org.apache.geode.management.internal.rest.ManagementLoggingFilter.doFilterInternal(ManagementLoggingFilter.java:44)
 

[jira] [Updated] (GEODE-8078) Exceptions in locator logs when hitting members REST endpoint

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8078:
-
Description: 
I'm seeing the following exceptions in locator logs when I try to hit the REST 
endpoint /management/v1/members/\{id} before the member has finished starting 
up. The reason I need to do this is because I have a program that is polling 
that endpoint to wait until the member is online. Ideally these errors would 
not show up in logs, but instead be reflected in the status code of the REST 
response.

{{ [error 2020/04/06 22:05:59.086 UTC  tid=0x31] class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{ java.lang.ClassCastException: class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.list(LocatorClusterManagementService.java:417)}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.get(LocatorClusterManagementService.java:434)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController.getMember(MemberManagementController.java:50)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$FastClassBySpringCGLIB$$3634e452.invoke()}}
{{ at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:769)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$EnhancerBySpringCGLIB$$2893b195.getMember()}}
{{ at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)}}
{{ at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
{{ at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
{{ at java.base/java.lang.reflect.Method.invoke(Method.java:566)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)}}
{{ at 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)}}
{{ at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:760)}}
{{ at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1617)}}
{{ at 
org.apache.geode.management.internal.rest.ManagementLoggingFilter.doFilterInternal(ManagementLoggingFilter.java:44)}}
{{ at 
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:119)}}

[jira] [Updated] (GEODE-8078) Exceptions in locator logs when hitting members REST endpoint

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8078:
-
Description: 
I'm seeing the following exceptions in locator logs when I try to hit the REST 
endpoint /management/v1/members/\{id} before the member has finished starting 
up. The reason I need to do this is because I have a program that is polling 
that endpoint to wait until the member is online. Ideally these errors would 
not show up in logs, but instead be reflected in the status code of the REST 
response.

{{[error 2020/04/06 22:05:59.086 UTC  tid=0x31] class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{java.lang.ClassCastException: class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.list(LocatorClusterManagementService.java:417)}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.get(LocatorClusterManagementService.java:434)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController.getMember(MemberManagementController.java:50)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$FastClassBySpringCGLIB$$3634e452.invoke()}}
{{ at 
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:769)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$EnhancerBySpringCGLIB$$2893b195.getMember()}}
{{ at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)}}
{{ at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
{{ at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
{{ at java.base/java.lang.reflect.Method.invoke(Method.java:566)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)}}
{{ at 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)}}
{{ at 

[jira] [Created] (GEODE-8078) Exceptions in locator logs when hitting members REST endpoint

2020-05-05 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-8078:


 Summary: Exceptions in locator logs when hitting members REST 
endpoint
 Key: GEODE-8078
 URL: https://issues.apache.org/jira/browse/GEODE-8078
 Project: Geode
  Issue Type: Bug
  Components: management
Reporter: Aaron Lindsey


I'm seeing the following exceptions in locator logs when I try to hit the REST 
endpoint /management/v1/members/\{id} before the member has finished starting 
up. The reason I need to do this is because I have a program that is polling 
that endpoint to wait until the member is online. Ideally these errors would 
not show up in logs, but instead be reflected in the status code of the REST 
response.

{{[error 2020/04/06 22:05:59.086 UTC  tid=0x31] class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{java.lang.ClassCastException: class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.list(LocatorClusterManagementService.java:417)}}
{{ at 
org.apache.geode.management.internal.api.LocatorClusterManagementService.get(LocatorClusterManagementService.java:434)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController.getMember(MemberManagementController.java:50)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$FastClassBySpringCGLIB$$3634e452.invoke()}}
{{ at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:769)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69)}}
{{ at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)}}
{{ at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)}}
{{ at 
org.apache.geode.management.internal.rest.controllers.MemberManagementController$$EnhancerBySpringCGLIB$$2893b195.getMember()}}
{{ at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)}}
{{ at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
{{ at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
{{ at java.base/java.lang.reflect.Method.invoke(Method.java:566)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)}}
{{ at 
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)}}
{{ at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)}}
{{ at 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)}}
{{ at 
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)}}
{{ at 
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)}}
{{ at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:760)}}
{{ at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1617)}}
{{ at 

[jira] [Updated] (GEODE-8078) Exceptions in locator logs when hitting members REST endpoint

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8078:
-
Description: 
I'm seeing the following exceptions in locator logs when I hit the REST 
endpoint /management/v1/members/\{id} before the member has finished starting 
up. I need to do this because I have a program that polls that endpoint to 
wait until the member is online. Ideally these errors would not show up in 
logs, but would instead be reflected in the status code of the REST 
response.
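
For illustration, the polling program mentioned above could look roughly like the sketch below. This is only a sketch under assumptions: the class and method names are hypothetical, the probe stands in for an HTTP GET against /management/v1/members/\{id}, and it assumes the endpoint reports a non-200 status while the member is still starting, which is the behavior this ticket asks for rather than what currently happens.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

/** Hypothetical helper, not part of Geode: polls a status probe until it reports HTTP 200. */
final class MemberReadinessPoller {

  /**
   * Calls the probe up to maxAttempts times, sleeping between attempts.
   * Returns true as soon as the probe reports 200, false if it never does
   * or if the waiting thread is interrupted.
   */
  static boolean waitUntilOnline(IntSupplier statusProbe, int maxAttempts, Duration delay) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (statusProbe.getAsInt() == 200) {
        return true;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
          // Restore the interrupt flag and give up.
          Thread.currentThread().interrupt();
          return false;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Stand-in probe: two "not ready" status codes (assumed values), then 200.
    int[] responses = {404, 500, 200};
    int[] next = {0};
    boolean online = waitUntilOnline(() -> responses[next[0]++], 5, Duration.ofMillis(10));
    System.out.println("online = " + online);
  }
}
```

Taking the probe as an IntSupplier keeps the retry logic independent of the HTTP client, so the caller decides how the status code is obtained.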

[error 2020/04/06 22:05:59.086 UTC  tid=0x31] class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')
java.lang.ClassCastException: class 
org.apache.geode.cache.CacheClosedException cannot be cast to class 
org.apache.geode.management.runtime.RuntimeInfo 
(org.apache.geode.cache.CacheClosedException and 
org.apache.geode.management.runtime.RuntimeInfo are in unnamed module of loader 
'app')
 at org.apache.geode.management.internal.api.LocatorClusterManagementService.list(LocatorClusterManagementService.java:417)
 at org.apache.geode.management.internal.api.LocatorClusterManagementService.get(LocatorClusterManagementService.java:434)
 at org.apache.geode.management.internal.rest.controllers.MemberManagementController.getMember(MemberManagementController.java:50)
 at org.apache.geode.management.internal.rest.controllers.MemberManagementController$$FastClassBySpringCGLIB$$3634e452.invoke()
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:769)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
 at org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69)
 at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
 at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
 at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
 at org.apache.geode.management.internal.rest.controllers.MemberManagementController$$EnhancerBySpringCGLIB$$2893b195.getMember()
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566)
 at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
 at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
 at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
 at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)
 at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)
 at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
 at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)
 at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
 at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
 at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:898)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
 at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:760)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1617)
 at 

[jira] [Updated] (GEODE-8077) Logging to Standard Out

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8077:
-
Description: 
The description below is from RFC [Logging to Standard 
Out|https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]
{quote}
h2. Problem

Currently logging to stdout is not consistent between client, server and 
locator. If {{log-file}} is {{null}} on a client then it will log to stdout by 
default, but on servers and locators it will log to a file named after the 
member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
result in logging to stdout, but on a locator it is treated like the {{null}} 
case and logs to a file. The only way to get the locator to log to stdout is to 
override the log4j.xml file.
h3. Anti-Goals

Do not change the current default behavior in client, server, or locators when 
handling {{null}} or {{""}} (empty string).
h2. Solution

Introduce a new {{log-file}} value of "-" (dash) to indicate standard out, 
which is a common standard across most applications. When the logger is 
configured and the {{log-file}} value is "-", the logger will log to 
standard out and not to any files.
h3. Changes and Additions to Public Interface

Changes will be needed in documentation to reference this new value for logging 
to standard out.
h3. Performance Impact

As no changes will be made to logging itself, there is no impact on performance.
h3. Backwards Compatibility and Upgrade Path

Since no changes are being made to the current behaviors there should be no 
impact to rolling upgrades and backwards compatibility.
{quote}

  was:
The description below is from RFC [Logging to Standard Out 
|[https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]]
{quote}
h2. Problem

Currently logging to stdout is not consistent between client, server and 
locator. If {{log-file}} is {{null}} on a client then it will log to stdout by 
default, but on servers and locators it will log to a file named after the 
member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
result in logging to stdout, but on a locator it is treated like the {{null}} 
case and logs to a file. The only way to get the locator to log to stdout is to 
override the log4j.xml file.
h3. Anti-Goals

Do not change the current default behavior in client, server, or locators when 
handling {{null}} or {{""}} (empty string).
h2. Solution

Introduce a new {{log-file}} value of "-" (dash) to indicate standard out, 
which is a common standard across most applications. When the logger is 
configured and the {{log-file}} value is "-", the logger will log to 
standard out and not to any files.
h3. Changes and Additions to Public Interface

Changes will be needed in documentation to reference this new value for logging 
to standard out.
h3. Performance Impact

As no changes will be made to logging itself, there is no impact on performance.
h3. Backwards Compatibility and Upgrade Path

Since no changes are being made to the current behaviors there should be no 
impact to rolling upgrades and backwards compatibility.
{quote}


> Logging to Standard Out
> ---
>
> Key: GEODE-8077
> URL: https://issues.apache.org/jira/browse/GEODE-8077
> Project: Geode
>  Issue Type: Improvement
>  Components: logging
>Reporter: Aaron Lindsey
>Priority: Major
>
> The description below is from RFC [Logging to Standard 
> Out|https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]
> {quote}
> h2. Problem
> Currently logging to stdout is not consistent between client, server and 
> locator. If {{log-file}} is {{null}} on a client then it will log to stdout 
> by default, but on servers and locators it will log to a file named after the 
> member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
> result in logging to stdout, but on a locator it is treated like the {{null}} 
> case and logs to a file. The only way to get the locator to log to stdout is 
> to override the log4j.xml file.
> h3. Anti-Goals
> Do not change the current default behavior in client, server, or locators 
> when handling {{null}} or {{""}} (empty string).
> h2. Solution
> Introduce a new {{log-file}} value of "-" (dash) to indicate standard out, 
> which is a common standard across most applications. When the logger is 
> configured and the {{log-file}} value is "-", the logger will log to 
> standard out and not to any files.
> h3. Changes and Additions to Public Interface
> Changes will be needed in documentation to reference this new value for 
> logging to standard out.
> h3. Performance Impact
> As no changes will be made to logging itself, there is no impact on 
> performance.
> h3. Backwards Compatibility and Upgrade Path
> Since no changes are being 

[jira] [Updated] (GEODE-8077) Logging to Standard Out

2020-05-05 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-8077:
-
Description: 
The description below is from RFC [Logging to Standard Out 
|[https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]]
{quote}
h2. Problem

Currently logging to stdout is not consistent between client, server and 
locator. If {{log-file}} is {{null}} on a client then it will log to stdout by 
default, but on servers and locators it will log to a file named after the 
member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
result in logging to stdout, but on a locator it is treated like the {{null}} 
case and logs to a file. The only way to get the locator to log to stdout is to 
override the log4j.xml file.
h3. Anti-Goals

Do not change the current default behavior in client, server, or locators when 
handling {{null}} or {{""}} (empty string).
h2. Solution

Introduce a new {{log-file}} value of "-" (dash) to indicate standard out, 
which is a common standard across most applications. When the logger is 
configured and the {{log-file}} value is "-", the logger will log to 
standard out and not to any files.
h3. Changes and Additions to Public Interface

Changes will be needed in documentation to reference this new value for logging 
to standard out.
h3. Performance Impact

As no changes will be made to logging itself, there is no impact on performance.
h3. Backwards Compatibility and Upgrade Path

Since no changes are being made to the current behaviors there should be no 
impact to rolling upgrades and backwards compatibility.
{quote}

  was:
The description below is from RFC [Logging to Standard Out 
|[https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]]
{quote}
h2. Problem

Currently logging to stdout is not consistent between client, server and 
locator. If {{log-file}} is {{null}} on a client then it will log to stdout by 
default, but on servers and locators it will log to a file named after the 
member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
result in logging to stdout, but on a locator it is treated like the {{null}} 
case and logs to a file. The only way to get the locator to log to stdout is to 
override the log4j.xml file.
h3. Anti-Goals

Do not change the current default behavior in client, server, or locators when 
handling {{null}} or {{""}} (empty string).
h2. Solution

Introduce a new value {{log-file}} of {{"-"-}} (dash) to indicate standard out, 
which is a common standard across most applications. When the logger is 
configured and the {{log-file}} value is {{"-"}} then the logger will log to 
standard out and not to any files.
h3. Changes and Additions to Public Interface

Changes will be needed in documentation to reference this new value for logging 
to standard out.
h3. Performance Impact

As no changes will be made to logging itself, there is no impact on performance.
h3. Backwards Compatibility and Upgrade Path

Since no changes are being made to the current behaviors there should be no 
impact to rolling upgrades and backwards compatibility.
{quote}


> Logging to Standard Out
> ---
>
> Key: GEODE-8077
> URL: https://issues.apache.org/jira/browse/GEODE-8077
> Project: Geode
>  Issue Type: Improvement
>  Components: logging
>Reporter: Aaron Lindsey
>Priority: Major
>
> The description below is from RFC [Logging to Standard Out 
> |[https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]]
> {quote}
> h2. Problem
> Currently logging to stdout is not consistent between client, server and 
> locator. If {{log-file}} is {{null}} on a client then it will log to stdout 
> by default, but on servers and locators it will log to a file named after the 
> member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
> result in logging to stdout, but on a locator it is treated like the {{null}} 
> case and logs to a file. The only way to get the locator to log to stdout is 
> to override the log4j.xml file.
> h3. Anti-Goals
> Do not change the current default behavior in client, server, or locators 
> when handling {{null}} or {{""}} (empty string).
> h2. Solution
> Introduce a new {{log-file}} value of "-" (dash) to indicate standard out, 
> which is a common standard across most applications. When the logger is 
> configured and the {{log-file}} value is "-", the logger will log to 
> standard out and not to any files.
> h3. Changes and Additions to Public Interface
> Changes will be needed in documentation to reference this new value for 
> logging to standard out.
> h3. Performance Impact
> As no changes will be made to logging itself, there is no impact on 
> performance.
> h3. Backwards Compatibility and Upgrade Path
> Since no changes 

[jira] [Created] (GEODE-8077) Logging to Standard Out

2020-05-05 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-8077:


 Summary: Logging to Standard Out
 Key: GEODE-8077
 URL: https://issues.apache.org/jira/browse/GEODE-8077
 Project: Geode
  Issue Type: Improvement
  Components: logging
Reporter: Aaron Lindsey


The description below is from RFC [Logging to Standard Out 
|[https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out]]
{quote}
h2. Problem

Currently logging to stdout is not consistent between client, server and 
locator. If {{log-file}} is {{null}} on a client then it will log to stdout by 
default, but on servers and locators it will log to a file named after the 
member. Setting the {{log-file}} to {{""}} (empty string) on the server will 
result in logging to stdout, but on a locator it is treated like the {{null}} 
case and logs to a file. The only way to get the locator to log to stdout is to 
override the log4j.xml file.
h3. Anti-Goals

Do not change the current default behavior in client, server, or locators when 
handling {{null}} or {{""}} (empty string).
h2. Solution

Introduce a new {{log-file}} value of {{"-"}} (dash) to indicate standard out, 
which is a common standard across most applications. When the logger is 
configured and the {{log-file}} value is {{"-"}}, the logger will log to 
standard out and not to any files.
h3. Changes and Additions to Public Interface

Changes will be needed in documentation to reference this new value for logging 
to standard out.
h3. Performance Impact

As no changes will be made to logging itself, there is no impact on performance.
h3. Backwards Compatibility and Upgrade Path

Since no changes are being made to the current behaviors there should be no 
impact to rolling upgrades and backwards compatibility.
{quote}
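
The rules in the RFC text above can be summarized in a minimal sketch, with the proposed "-" value added. It is an illustration only, not Geode's logging code: the class, the enum, the "<stdout>" marker and the ".log" suffix are hypothetical, and the client behavior for an empty string is an assumption because the description does not state it.

```java
/** Hypothetical sketch of the log destination rules described in the RFC; not Geode code. */
final class LogDestinationSketch {

  enum MemberType { CLIENT, SERVER, LOCATOR }

  /** Marker used by this sketch to mean "log to standard out". */
  static final String STDOUT = "<stdout>";

  /** Returns STDOUT or the name of the file the member would log to. */
  static String resolve(MemberType type, String memberName, String logFile) {
    if ("-".equals(logFile)) {
      // Proposed new value: standard out on every member type.
      return STDOUT;
    }
    if (logFile == null) {
      // Existing default: clients log to stdout, servers and locators to a
      // file named after the member (the ".log" suffix is an assumption).
      return type == MemberType.CLIENT ? STDOUT : memberName + ".log";
    }
    if (logFile.isEmpty()) {
      // Existing behavior: a locator treats "" like null, a server logs to stdout.
      // Treating "" on a client as stdout is an assumption of this sketch.
      return type == MemberType.LOCATOR ? memberName + ".log" : STDOUT;
    }
    // Any other value is taken as an explicit file name.
    return logFile;
  }

  public static void main(String[] args) {
    for (MemberType type : MemberType.values()) {
      System.out.println(type + " with \"-\" -> " + resolve(type, "member1", "-"));
      System.out.println(type + " with \"\"  -> " + resolve(type, "member1", ""));
      System.out.println(type + " with null -> " + resolve(type, "member1", null));
    }
  }
}
```

Checking for "-" first leaves the existing null and empty-string branches untouched, which matches the anti-goal of not changing current defaults.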



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-7830) Management REST API rebalance endpoints return confusing operationResults

2020-03-05 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052311#comment-17052311
 ] 

Aaron Lindsey commented on GEODE-7830:
--

The "operator" in my case is actually a Kubernetes controller. It calls 
rebalance each time Kubernetes tries to stop a Geode server to ensure data is 
not lost. In this case it is very common to call rebalance when there are no 
regions, e.g. during a scaling operation before the user has created any 
regions. Right now we have to parse the status message to determine if the 
rebalance failed due to the no-op error, and then ignore it.

Do you know if having no regions is the only reason the rebalance API will 
return the no-op error? If we were sure of that, then we could call list 
regions to make sure regions exist before calling rebalance.

FWIW, I think it would be best to assume that consumers of this REST API will 
be programs, not humans, and therefore we should design it so that it is easy 
to consume programmatically. It's much more reliable to programmatically check 
the size of an array than to parse a status message to determine whether the 
rebalance succeeded.
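
As a rough sketch of what I mean, a consumer could classify a result from the success flag and the size of rebalanceRegionResults alone. This assumes the behavior requested in this ticket (success=true for the no-op case), not what the API does today; the class, enum and method names are hypothetical, and the list of strings is a simplified stand-in for the real result entries.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical consumer-side helper; assumes the no-op case reports success=true. */
final class RebalanceOutcomeSketch {

  enum Outcome { REBALANCED, NO_OP, FAILED }

  /** Classifies a result without looking at the statusMessage text. */
  static Outcome classify(boolean success, List<String> rebalanceRegionResults) {
    if (!success) {
      return Outcome.FAILED;
    }
    // An empty list means no region was actually rebalanced.
    return rebalanceRegionResults.isEmpty() ? Outcome.NO_OP : Outcome.REBALANCED;
  }

  public static void main(String[] args) {
    System.out.println(classify(true, Collections.<String>emptyList()));
    System.out.println(classify(true, Arrays.asList("regionA")));
    System.out.println(classify(false, Collections.<String>emptyList()));
  }
}
```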

> Management REST API rebalance endpoints return confusing operationResults
> -
>
> Key: GEODE-7830
> URL: https://issues.apache.org/jira/browse/GEODE-7830
> Project: Geode
>  Issue Type: Bug
>  Components: management
>Reporter: Aaron Lindsey
>Assignee: Darrel Schneider
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We observed odd behavior regarding the operationResult object returned in the 
> rebalance API:
> # It contains success=false if the cluster has no regions or has no servers. 
> This is confusing because the rebalance didn't fail — it just didn't have 
> anything to rebalance so it was basically a no-op. As a consumer of this API, 
> I need to be able to distinguish between "real" failures and this "no-op" 
> failure, and I should not have to write code to parse the "statusMessage" to 
> do that.
> # Sometimes, success=true and other times success=false for the same 
> statusMessage: "Distributed system has no regions that can be rebalanced." 
> This is confusing because I don't know why it sometimes considers this a 
> failure and other times considers it a success. If #1 above is fixed, then 
> this would not be an issue because it would always return success=true for 
> this particular statusMessage.
> Here is an example of two confusing operationResults we observed:
> {code:json}
> {
>   "result": [
>     {
>       "statusCode": "OK",
>       "links": {
>         "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
>         "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>       },
>       "operationStart": "2020-02-25T18:53:34.058Z",
>       "operationEnd": "2020-02-25T18:53:34.063Z",
>       "operationId": "15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
>       "operation": {
>         "simulate": false
>       },
>       "operationResult": {
>         "statusMessage": "Distributed system has no regions that can be rebalanced.",
>         "success": true
>       }
>     },
>     {
>       "statusCode": "OK",
>       "links": {
>         "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/8218ce0d-e3b8-4c49-b925-665a28e821c3",
>         "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>       },
>       "operationStart": "2020-02-25T18:53:45.650Z",
>       "operationEnd": "2020-02-25T18:53:45.654Z",
>       "operationId": "8218ce0d-e3b8-4c49-b925-665a28e821c3",
>       "operation": {
>         "simulate": false
>       },
>       "operationResult": {
>         "statusMessage": "Distributed system has no regions that can be rebalanced.",
>         "success": false
>       }
>     }
>   ],
>   "statusCode": "OK"
> }
> {code}





[jira] [Commented] (GEODE-7830) Management REST API rebalance endpoints return confusing operationResults

2020-03-04 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051486#comment-17051486
 ] 

Aaron Lindsey commented on GEODE-7830:
--

[~dschneider] Thanks for that perspective. I agree that there might be a 
situation where an operator tries to perform a rebalance, but the rebalance 
does not do what they expect because regions/servers are not online yet. In 
that specific case, however, the operator could check the 
rebalanceRegionResults, which is included in the operationResult, and see that 
it is empty, which means that no regions were actually rebalanced. 
Then they would know that the rebalance didn't do what they intended.

> Management REST API rebalance endpoints return confusing operationResults
> -
>
> Key: GEODE-7830
> URL: https://issues.apache.org/jira/browse/GEODE-7830
> Project: Geode
>  Issue Type: Bug
>  Components: management
>Reporter: Aaron Lindsey
>Assignee: Darrel Schneider
>Priority: Major
>
> We observed odd behavior regarding the operationResult object returned in the 
> rebalance API:
> # It contains success=false if the cluster has no regions or has no servers. 
> This is confusing because the rebalance didn't fail — it just didn't have 
> anything to rebalance so it was basically a no-op. As a consumer of this API, 
> I need to be able to distinguish between "real" failures and this "no-op" 
> failure, and I should not have to write code to parse the "statusMessage" to 
> do that.
> # Sometimes, success=true and other times success=false for the same 
> statusMessage: "Distributed system has no regions that can be rebalanced." 
> This is confusing because I don't know why it sometimes considers this a 
> failure and other times considers it a success. If #1 above is fixed, then 
> this would not be an issue because it would always return success=true for 
> this particular statusMessage.
> Here is an example of two confusing operationResults we observed:
> {code:json}
> {
>   "result": [
>     {
>       "statusCode": "OK",
>       "links": {
>         "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
>         "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>       },
>       "operationStart": "2020-02-25T18:53:34.058Z",
>       "operationEnd": "2020-02-25T18:53:34.063Z",
>       "operationId": "15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
>       "operation": {
>         "simulate": false
>       },
>       "operationResult": {
>         "statusMessage": "Distributed system has no regions that can be rebalanced.",
>         "success": true
>       }
>     },
>     {
>       "statusCode": "OK",
>       "links": {
>         "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/8218ce0d-e3b8-4c49-b925-665a28e821c3",
>         "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
>       },
>       "operationStart": "2020-02-25T18:53:45.650Z",
>       "operationEnd": "2020-02-25T18:53:45.654Z",
>       "operationId": "8218ce0d-e3b8-4c49-b925-665a28e821c3",
>       "operation": {
>         "simulate": false
>       },
>       "operationResult": {
>         "statusMessage": "Distributed system has no regions that can be rebalanced.",
>         "success": false
>       }
>     }
>   ],
>   "statusCode": "OK"
> }
> {code}





[jira] [Created] (GEODE-7830) Management REST API rebalance endpoints return confusing operationResults

2020-02-28 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7830:


 Summary: Management REST API rebalance endpoints return confusing 
operationResults
 Key: GEODE-7830
 URL: https://issues.apache.org/jira/browse/GEODE-7830
 Project: Geode
  Issue Type: Bug
  Components: management
Reporter: Aaron Lindsey


We observed odd behavior regarding the operationResult object returned in the 
rebalance API:
# It contains success=false if the cluster has no regions or has no servers. 
This is confusing because the rebalance didn't fail — it just didn't have 
anything to rebalance so it was basically a no-op. As a consumer of this API, I 
need to be able to distinguish between "real" failures and this "no-op" 
failure, and I should not have to write code to parse the "statusMessage" to do 
that.
# Sometimes, success=true and other times success=false for the same 
statusMessage: "Distributed system has no regions that can be rebalanced." This 
is confusing because I don't know why it sometimes considers this a failure and 
other times considers it a success. If #1 above is fixed, then this would not 
be an issue because it would always return success=true for this particular 
statusMessage.

Here is an example of two confusing operationResults we observed:
{code:json}
{
  "result": [
    {
      "statusCode": "OK",
      "links": {
        "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
        "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
      },
      "operationStart": "2020-02-25T18:53:34.058Z",
      "operationEnd": "2020-02-25T18:53:34.063Z",
      "operationId": "15dfe6ef-acaf-4a45-9b55-1d855a977ba8",
      "operation": {
        "simulate": false
      },
      "operationResult": {
        "statusMessage": "Distributed system has no regions that can be rebalanced.",
        "success": true
      }
    },
    {
      "statusCode": "OK",
      "links": {
        "self": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances/8218ce0d-e3b8-4c49-b925-665a28e821c3",
        "list": "http://geodecluster-sample-locator.default/management/v1/operations/rebalances"
      },
      "operationStart": "2020-02-25T18:53:45.650Z",
      "operationEnd": "2020-02-25T18:53:45.654Z",
      "operationId": "8218ce0d-e3b8-4c49-b925-665a28e821c3",
      "operation": {
        "simulate": false
      },
      "operationResult": {
        "statusMessage": "Distributed system has no regions that can be rebalanced.",
        "success": false
      }
    }
  ],
  "statusCode": "OK"
}
{code}






[jira] [Resolved] (GEODE-7091) Add Micrometer binders to default meter registry

2020-01-06 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-7091.
--
Resolution: Fixed

> Add Micrometer binders to default meter registry
> 
>
> Key: GEODE-7091
> URL: https://issues.apache.org/jira/browse/GEODE-7091
> Project: Geode
>  Issue Type: Improvement
>  Components: statistics
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> As a user, I want access to specific JVM metrics, GC metrics, Uptime, and 
> FileDescriptor metrics that help indicate and track down issues, so that I 
> can understand the health of my cluster.
> Add the following Micrometer binders:
> * JvmGcMetrics
> * ProcessorMetrics
> * JvmThreadMetrics
> * UptimeMetrics
> * FileDescriptorMetrics





[jira] [Resolved] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2020-01-06 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-7237.
--
Resolution: Fixed

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest
>  > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at 
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Resolved] (GEODE-7326) Add cache gets timers

2019-12-19 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-7326.
--
Fix Version/s: 1.11.0
   Resolution: Fixed

> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> h3. Why
> Users want to understand the performance of their operations within the 
> server.
> h3. Acceptance Criteria
> Type: timer
> Name: geode.cache.gets
> Tags: region, result=hit/miss
> Lifecycle of meter: The hit/miss meter for each region is created when the 
> region is created. The meter(s) are removed when the region is 
> destroyed/closed.
> Description for meter: The total time and count for GET requests from clients.
> Thing to measure: A count and total time for GET operations that didn't 
> error, by this specific Server (1 or many cacheservers) in the geode cluster 
> from when the server receives the request to when it sends the response.
> Business Rule for this measurement: This meter records any operation sent 
> through a CacheServer
> h3. Scenarios
> *Scenario: Java client hits*
> Given a cluster with a Server1 and a Locator1 with time statistics enabled
> When the oldest supported java client issues 5 get operations using the 
> region.get(key) command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=hit
> *Scenario: Java client misses*
> Given a cluster with a Server1 and a Locator1 with time statistics enabled
> When the oldest supported java client issues 5 get operations where the user 
> is getting a key that doesn't exist in the region using the region.get(key) 
> command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=miss
> *Scenario: Java client hits with time stats disabled*
> Given a cluster with a Server1 and a Locator1 with time statistics disabled
> When a java client issues a get operation using the region.get(key) command 
> where the key exists
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 1
> - Total Time = 0
> - Tag: region = region that the 'get' method was called against
> - Tag: result=hit
> *Scenario: Java client error response*
> Given a cluster with a Server1
> And a RegionA exists with NO entry with a Key="1"
> And the client is unauthorized for Key="1"
> When the client issues a region.get(1) request
> Then no meter on Server1 should exist like:
> - Meter Name = 'geode.cache.gets'
> - Tag: region = RegionA
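
The acceptance criteria above can be illustrated with a small, stdlib-only
Java sketch. It is not Geode's implementation and does not use Micrometer;
the class GetTimers and all of its methods are hypothetical names invented
for this example. It models one timer per (region, result) pair, creates the
hit and miss timers when the region is created, removes them when the region
is destroyed, and leaves the total time at 0 when time statistics are
disabled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a registry of 'geode.cache.gets' timers. */
final class GetTimers {
  /** Count and total time for one (region, result) tag combination. */
  static final class Timer {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  private final Map<String, Timer> timers = new ConcurrentHashMap<>();
  private final boolean timeStatsEnabled;

  GetTimers(boolean timeStatsEnabled) {
    this.timeStatsEnabled = timeStatsEnabled;
  }

  private static String key(String region, String result) {
    return "geode.cache.gets{region=" + region + ",result=" + result + "}";
  }

  /** Creates the hit and miss timers when the region is created. */
  void regionCreated(String region) {
    timers.putIfAbsent(key(region, "hit"), new Timer());
    timers.putIfAbsent(key(region, "miss"), new Timer());
  }

  /** Removes both timers when the region is destroyed or closed. */
  void regionDestroyed(String region) {
    timers.remove(key(region, "hit"));
    timers.remove(key(region, "miss"));
  }

  /** Records one get that did not error; failed gets are never passed in. */
  void recordGet(String region, boolean hit, long elapsedNanos) {
    Timer timer = timers.get(key(region, hit ? "hit" : "miss"));
    if (timer == null) {
      return; // region unknown or already destroyed
    }
    timer.count.increment();
    if (timeStatsEnabled) {
      timer.totalNanos.add(elapsedNanos);
    }
  }

  /** Returns the timer for the given tags, or null if no such meter exists. */
  Timer find(String region, String result) {
    return timers.get(key(region, result));
  }

  public static void main(String[] args) {
    GetTimers timers = new GetTimers(true);
    timers.regionCreated("RegionA");
    for (int i = 0; i < 5; i++) {
      timers.recordGet("RegionA", true, 1_000L);
    }
    Timer hits = timers.find("RegionA", "hit");
    System.out.println("count=" + hits.count.sum() + " totalNanos=" + hits.totalNanos.sum());
  }
}
```

Keying the map by name plus tags keeps hits and misses in separate timers,
which is what lets the scenarios assert on result=hit and result=miss
independently.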





[jira] [Resolved] (GEODE-7184) Add function execution timers

2019-12-19 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-7184.
--
Fix Version/s: 1.11.0
   Resolution: Fixed

> Add function execution timers
> -
>
> Key: GEODE-7184
> URL: https://issues.apache.org/jira/browse/GEODE-7184
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Developers often deploy their own functions to the system to enable a 
> decorator pattern for caching that adds information to specific key/value 
> pairs. In doing so, they can introduce bottlenecks where server-side 
> functions cause issues or make things slower than intended. We want a way 
> for users to view the functions they create and see what the average 
> execution time looks like.
>  * *Meter Type*: Timer
>  * *Name*: geode.function.executions
>  * *Description*: TBD
>  * *Tags*: function (getId on the function; if it does not exist, 
> getClass().getName() of the deployed function), succeeded (true/false)
> h3. Acceptance Criteria
> *Meter creation/deletion*: Create meter on function execution
> *Measurement*: On an individual server, start the timer when a *USER* 
> function is invoked/executed, and stop the timer when the user function 
> completes OR errors. If it throws a function execution error or any other 
> error, then the timer is tagged succeeded=false
> Details on Functions and their execution: 
> [https://geode.apache.org/docs/guide/110/developing/function_exec/function_execution.html]
> h3. Scenarios
> *Scenario: The timers are created when the function is first executed*
> Given a user executed a function with ID functionToTime on a cluster with 1 
> locator/1 server
> And functionToTime has not been executed previously
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count > 1
> - totalTime >= 5,000,000,000ns
> And the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = false
> - count = 0
> - totalTime = 0
> *Scenario: Successful singular function execution (registered execution)*
> Given a user registers a function with ID functionToTime (that waits for 5 
> seconds) on a cluster with 1 locator/1 server
> When functionToTime is triggered using gfsh command: "execute function 
> --id=functionToTime"
> And the function completes without error
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Successful singular function execution (unregistered execution)*
> Given an unregistered function with ID functionToTime (that waits for 5 
> seconds) exists 
> When triggered on a client using  
> "FunctionService.onServers(cache).execute(new FunctionToTime())"
> And the function completes without error
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Singular function execution with Any Exception*
> Given an unregistered function with ID functionToTime (that waits for 5 
> seconds) exists 
> When triggered on a client using  
> "FunctionService.onServers(cache).execute(new FunctionToTime())"
> And the function exits with any exception after running for 5 seconds
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = false
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Function execution onRegion multi-server*
> Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
> And a region called RR1 that is a replicate region
> When a function execution is triggered against that replicate region using  
> "FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
> Then one server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> And the other server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 0
> - totalTime = 0
> *Scenario: Function execution onRegion with partition region multiple times*
> Given a cluster with 1 locator (named L1) as well as 2 servers (named S1, S2)
> And a partition region called PR1 that only exists on S1
> When a function execution is triggered 10 times against that partition region 
> using  
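
The measurement rule described for GEODE-7184 (start the timer when a user
function is invoked, stop it when the function completes or errors, and tag
the outcome) might be sketched as follows. This is a stdlib-only illustration
under stated assumptions, not Geode's or Micrometer's code: FunctionTimers,
timer(), and timed() are hypothetical names, and the tag names follow the
"function" and "succeeded" tags listed in the issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical sketch of 'geode.function.executions' timers. */
final class FunctionTimers {
  /** Count and total time for one (function, succeeded) tag combination. */
  static final class Timer {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  private final Map<String, Timer> timers = new ConcurrentHashMap<>();

  /** Returns the timer for the given tags, creating it on first use. */
  Timer timer(String functionId, boolean succeeded) {
    String key = "geode.function.executions{function=" + functionId
        + ",succeeded=" + succeeded + "}";
    return timers.computeIfAbsent(key, k -> new Timer());
  }

  /** Runs the function body and times it whether it completes or throws. */
  <T> T timed(String functionId, Supplier<T> body) {
    // Create both timers on first execution so the unused one reports count = 0.
    timer(functionId, true);
    timer(functionId, false);
    long start = System.nanoTime();
    boolean succeeded = false;
    try {
      T result = body.get();
      succeeded = true;
      return result;
    } finally {
      Timer t = timer(functionId, succeeded);
      t.count.increment();
      t.totalNanos.add(System.nanoTime() - start);
    }
  }

  public static void main(String[] args) {
    FunctionTimers timers = new FunctionTimers();
    timers.timed("functionToTime", () -> "done");
    System.out.println("succeeded count = "
        + timers.timer("functionToTime", true).count.sum());
  }
}
```

Recording in a finally block is one way to guarantee that a failing function
still contributes a count and a duration, which is what the exception
scenario expects from the succeeded = false timer.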

[jira] [Updated] (GEODE-7164) IntelliJ IDEA 2019 error: the output path is not specified for modules

2019-12-19 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7164:
-
Fix Version/s: 1.11.0

> IntelliJ IDEA 2019 error: the output path is not specified for modules
> --
>
> Key: GEODE-7164
> URL: https://issues.apache.org/jira/browse/GEODE-7164
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
> Attachments: Screen Shot 2019-09-04 at 10.45.47 AM.png, Screen Shot 
> 2019-09-04 at 10.51.04 AM.png, image-2019-10-25-16-54-38-061.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When delegating build/run actions to IntelliJ IDEA instead of Gradle, 
> IntelliJ IDEA 2019 fails to build geode with an error similar to the one 
> shown in the screenshot below:
>  !Screen Shot 2019-09-04 at 10.45.47 AM.png|width=370,height=298!
> h4. Steps to Reproduce:
> (Tested on IntelliJ IDEA CE versions 2019.1.4 and 2019.2.1)
>  # Make sure Gradle delegation is disabled for build/run
>  ** Instructions for 2019.1.4:
>  *** Go to Preferences | Build, Execution, Deployment | Build Tools | Gradle 
> | Runner
>  *** Make sure "Delegate build/run actions to Gradle" is unchecked
>  ** Instructions for 2019.2.1:
>  *** Go to Preferences | Build, Execution, Deployment | Build Tools | Gradle
>  *** Make sure "Build and Run using:" is set to "IntelliJ IDEA"
>  # Clone geode into an empty directory
>  # Follow the instructions 
> [here|https://github.com/apache/geode/blob/develop/BUILDING.md] to import and 
> build geode using IntelliJ IDEA
>  # Enable Gradle build/run delegation
>  ** Instructions for 2019.1.4:
>  *** Go to Preferences | Build, Execution, Deployment | Build Tools | Gradle 
> | Runner
>  *** Check "Delegate build/run actions to Gradle"
>  ** Instructions for 2019.2.1:
>  *** Go to Preferences | Build, Execution, Deployment | Build Tools | Gradle
>  *** Set "Build and Run using:" to "Gradle"
>  # Select "Build Project" from the Build menu to build geode
>  # After the build succeeds, revert the change from step 4 to switch back to 
> the IntelliJ build runner
>  # Repeat step 5 to build the project again
>  # The popup error message shown in the screenshot should appear and IntelliJ 
> will not initiate the build





[jira] [Updated] (GEODE-7390) Add Micrometer metrics example to geode-examples

2019-11-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7390:
-
Fix Version/s: 1.11.0

> Add Micrometer metrics example to geode-examples
> 
>
> Key: GEODE-7390
> URL: https://issues.apache.org/jira/browse/GEODE-7390
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h3. Why
> We want users of Geode to be able to spin up a Prometheus/Grafana setup along 
> with a simple meter registry that exposes Prometheus endpoints to be scraped 
> with all of the meters in the registry.
> h3. Acceptance Criteria
> Add Micrometer metrics example to geode-examples repository.





[jira] [Resolved] (GEODE-7390) Add Micrometer metrics example to geode-examples

2019-11-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-7390.
--
Resolution: Done

> Add Micrometer metrics example to geode-examples
> 
>
> Key: GEODE-7390
> URL: https://issues.apache.org/jira/browse/GEODE-7390
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h3. Why
> We want users of Geode to be able to spin up a Prometheus/Grafana setup along 
> with a simple meter registry that exposes Prometheus endpoints to be scraped 
> with all of the meters in the registry.
> h3. Acceptance Criteria
> Add Micrometer metrics example to geode-examples repository.





[jira] [Assigned] (GEODE-7400) Prevent RejectedExecutionException in FederatingManager

2019-11-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7400:


Assignee: Aaron Lindsey

> Prevent RejectedExecutionException in FederatingManager
> ---
>
> Key: GEODE-7400
> URL: https://issues.apache.org/jira/browse/GEODE-7400
> Project: Geode
>  Issue Type: Bug
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The fix for GEODE-7330 changed the {{FederatingManager}} class so that it 
> reuses the same {{ExecutorService}} between restarts. Now, if we start the 
> manager after previously starting and stopping it, we get 
> {{RejectedExecutionException}} because it tries to invoke a task on the same 
> {{ExecutorService}} which has been shut down.
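
The failure mode described above can be reproduced with the JDK alone: once
an ExecutorService has been shut down, submitting another task to it throws
RejectedExecutionException. The sketch below shows that behavior and one
possible way to avoid it, namely creating a fresh executor on each start.
RestartableManager is a hypothetical class written for this example; it is
not FederatingManager and is not necessarily how the issue was fixed in Geode.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical manager that can be started and stopped repeatedly. */
final class RestartableManager {
  private ExecutorService executor;

  /** Creates a fresh executor on every start instead of reusing a shut-down one. */
  synchronized void start() {
    if (executor == null || executor.isShutdown()) {
      executor = Executors.newSingleThreadExecutor();
    }
    executor.execute(() -> {
      // Placeholder for the work the manager would schedule on startup.
    });
  }

  /** Shuts the current executor down; a later start() replaces it. */
  synchronized void stop() {
    if (executor != null) {
      executor.shutdownNow();
    }
  }

  /** Demonstrates the bug pattern: a shut-down executor rejects new tasks. */
  static boolean rejectsAfterShutdown() {
    ExecutorService reused = Executors.newSingleThreadExecutor();
    reused.shutdown();
    try {
      reused.execute(() -> { });
      return false;
    } catch (RejectedExecutionException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("reused executor rejects tasks: " + rejectsAfterShutdown());
    RestartableManager manager = new RestartableManager();
    manager.start();
    manager.stop();
    manager.start(); // succeeds because start() replaced the shut-down executor
    manager.stop();
  }
}
```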





[jira] [Created] (GEODE-7400) Prevent RejectedExecutionException in FederatingManager

2019-11-01 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7400:


 Summary: Prevent RejectedExecutionException in FederatingManager
 Key: GEODE-7400
 URL: https://issues.apache.org/jira/browse/GEODE-7400
 Project: Geode
  Issue Type: Bug
Reporter: Aaron Lindsey


The fix for GEODE-7330 changed the {{FederatingManager}} class so that it 
reuses the same {{ExecutorService}} between restarts. Now, if we start the 
manager after previously starting and stopping it, we get 
{{RejectedExecutionException}} because it tries to invoke a task on the same 
{{ExecutorService}} which has been shut down.





[jira] [Created] (GEODE-7390) Add Micrometer metrics example to geode-examples

2019-10-30 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7390:


 Summary: Add Micrometer metrics example to geode-examples
 Key: GEODE-7390
 URL: https://issues.apache.org/jira/browse/GEODE-7390
 Project: Geode
  Issue Type: Improvement
Reporter: Aaron Lindsey


h3. Why
We want users of Geode to be able to spin up a Prometheus/Grafana setup along 
with a simple meter registry that exposes Prometheus endpoints to be scraped 
with all of the meters in the registry.

h3. Acceptance Criteria
Add Micrometer metrics example to geode-examples repository.





[jira] [Assigned] (GEODE-7390) Add Micrometer metrics example to geode-examples

2019-10-30 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7390:


Assignee: Aaron Lindsey

> Add Micrometer metrics example to geode-examples
> 
>
> Key: GEODE-7390
> URL: https://issues.apache.org/jira/browse/GEODE-7390
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> h3. Why
> We want users of Geode to be able to spin up a Prometheus/Grafana setup along 
> with a simple meter registry that exposes Prometheus endpoints to be scraped 
> with all of the meters in the registry.
> h3. Acceptance Criteria
> Add Micrometer metrics example to geode-examples repository.





[jira] [Updated] (GEODE-7326) Add cache gets timers

2019-10-22 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7326:
-
Description: 
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
region is created. The meter(s) are removed when the region is destroyed/closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure: A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java client hits*
Given a cluster with a Server1 and a Locator1 with time statistics enabled
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java client misses*
Given a cluster with a Server1 and a Locator1 with time statistics enabled
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client hits with time stats disabled*
Given a cluster with a Server1 and a Locator1 with time statistics disabled
When a java client issues a get operation using the region.get(key) command 
where the key exists
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 1
- Total Time = 0
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And the client is unauthorized for Key="1"
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA


  was:
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
region is created. The meter(s) are removed when the region is destroyed/closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure : A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And the client is unauthorized for Key="1"
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA



> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>

[jira] [Updated] (GEODE-7326) Add cache gets timers

2019-10-22 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7326:
-
Description: 
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
region is created. The meter(s) are removed when the region is destroyed/closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure: A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And a CacheLoader that throws an exception when invoked
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA


  was:
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
first GET on that user region happens. The meter(s) are only removed when the 
cache is closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure : A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And a CacheLoader that throws an exception when invoked
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA



> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> h3. Why
> Users want to understand the performance of their operations within the 
> server.
> h3. Acceptance Criteria
> Type: timer
> Name: geode.cache.gets
> Tags: region, result=hit/miss
> Lifecycle of meter: The hit/miss meter for each region is created when the 
> region is created. The meter(s) are removed when the region is 
> destroyed/closed.
> Description 

[jira] [Updated] (GEODE-7326) Add cache gets timers

2019-10-22 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7326:
-
Description: 
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
region is created. The meter(s) are removed when the region is destroyed/closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure: A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And the client is unauthorized for Key="1"
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA


  was:
h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
region is created. The meter(s) are removed when the region is destroyed/closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure : A count and total time for GET operations that didn't error, 
by this specific Server (1 or many cacheservers) in the geode cluster from when 
the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And a CacheLoader that throws an exception when invoked
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA



> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> h3. Why
> Users want to understand the performance of their operations within the 
> server.
> h3. Acceptance Criteria
> Type: timer
> Name: geode.cache.gets
> Tags: region, result=hit/miss
> Lifecycle of meter: The hit/miss meter for each region is created when the 
> region is created. The meter(s) are removed when the region is 
> destroyed/closed.
> Description for meter: The total time and 

[jira] [Updated] (GEODE-7326) Add cache gets timers

2019-10-22 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7326:
-
Summary: Add cache gets timers  (was: Add cache gets timer)

> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Priority: Major
>
> h3. Why
> Users want to understand the performance of their operations within the 
> server.
> h3. Acceptance Criteria
> Type: timer
> Name: geode.cache.gets
> Tags: region, result=hit/miss
> Lifecycle of meter: The hit/miss meter for each region is created when the 
> first GET on that user region happens. The meter(s) are only removed when the 
> cache is closed.
> Description for meter: The total time and count for GET requests from clients.
> Thing to measure: A count and total time for GET operations that didn't 
> error, by this specific Server (1 or many cacheservers) in the geode cluster 
> from when the server receives the request to when it sends the response.
> Business Rule for this measurement: This meter records any operation sent 
> through a CacheServer
> h3. Scenarios
> *Scenario: Java Client Hits*
> Given a cluster with a Server1 and a Locator1 
> When the oldest supported java client issues 5 get operations using the 
> region.get(key) command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=hit
> *Scenario: Java Client Misses*
> Given a cluster with a Server1 and a Locator1 
> When the oldest supported java client issues 5 get operations where the user 
> is getting a key that doesn't exist in the region using the region.get(key) 
> command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=miss
> *Scenario: Java client error response*
> Given a cluster with a Server1
> And a RegionA exists with NO entry with a Key="1"
> And a CacheLoader that throws an exception when invoked
> When the client issues a region.get(1) request
> Then no meter on Server1 should exist like:
> - Meter Name = 'geode.cache.gets'
> - Tag: region = RegionA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-7326) Add cache gets timers

2019-10-22 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7326:


Assignee: Aaron Lindsey

> Add cache gets timers
> -
>
> Key: GEODE-7326
> URL: https://issues.apache.org/jira/browse/GEODE-7326
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> h3. Why
> Users want to understand the performance of their operations within the 
> server.
> h3. Acceptance Criteria
> Type: timer
> Name: geode.cache.gets
> Tags: region, result=hit/miss
> Lifecycle of meter: The hit/miss meter for each region is created when the 
> first GET on that user region happens. The meter(s) are only removed when the 
> cache is closed.
> Description for meter: The total time and count for GET requests from clients.
> Thing to measure: A count and total time for GET operations that didn't 
> error, on this specific server (one or more CacheServers) in the Geode 
> cluster, from when the server receives the request to when it sends the response.
> Business Rule for this measurement: This meter records any operation sent 
> through a CacheServer
> h3. Scenarios
> *Scenario: Java Client Hits*
> Given a cluster with a Server1 and a Locator1 
> When the oldest supported java client issues 5 get operations using the 
> region.get(key) command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=hit
> *Scenario: Java Client Misses*
> Given a cluster with a Server1 and a Locator1 
> When the oldest supported java client issues 5 get operations where the user 
> is getting a key that doesn't exist in the region using the region.get(key) 
> command
> Then a meter on Server1 exists such that:
> - Meter Name = 'geode.cache.gets'
> - Count = 5
> - Total Time = total time spent from received request to response to client 
> for these 5 requests
> - Tag: region = region that the 'get' method was called against
> - Tag: result=miss
> *Scenario: Java client error response*
> Given a cluster with a Server1
> And a RegionA exists with NO entry with a Key="1"
> And a CacheLoader that throws an exception when invoked
> When the client issues a region.get(1) request
> Then no meter on Server1 should exist like:
> - Meter Name = 'geode.cache.gets'
> - Tag: region = RegionA





[jira] [Created] (GEODE-7326) Add cache gets timer

2019-10-21 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7326:


 Summary: Add cache gets timer
 Key: GEODE-7326
 URL: https://issues.apache.org/jira/browse/GEODE-7326
 Project: Geode
  Issue Type: Improvement
Reporter: Aaron Lindsey


h3. Why
Users want to understand the performance of their operations within the server.

h3. Acceptance Criteria
Type: timer
Name: geode.cache.gets
Tags: region, result=hit/miss
Lifecycle of meter: The hit/miss meter for each region is created when the 
first GET on that user region happens. The meter(s) are only removed when the 
cache is closed.
Description for meter: The total time and count for GET requests from clients.
Thing to measure: A count and total time for GET operations that didn't error, 
on this specific server (one or more CacheServers) in the Geode cluster, from 
when the server receives the request to when it sends the response.

Business Rule for this measurement: This meter records any operation sent 
through a CacheServer

h3. Scenarios

*Scenario: Java Client Hits*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations using the 
region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=hit

*Scenario: Java Client Misses*
Given a cluster with a Server1 and a Locator1 
When the oldest supported java client issues 5 get operations where the user is 
getting a key that doesn't exist in the region using the region.get(key) command
Then a meter on Server1 exists such that:
- Meter Name = 'geode.cache.gets'
- Count = 5
- Total Time = total time spent from received request to response to client for 
these 5 requests
- Tag: region = region that the 'get' method was called against
- Tag: result=miss

*Scenario: Java client error response*
Given a cluster with a Server1
And a RegionA exists with NO entry with a Key="1"
And a CacheLoader that throws an exception when invoked
When the client issues a region.get(1) request
Then no meter on Server1 should exist like:
- Meter Name = 'geode.cache.gets'
- Tag: region = RegionA
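
The acceptance criteria above can be sketched in plain Java. This is only an illustration under stated assumptions, not Geode's implementation: the classes SketchTimer and GetTimersSketch are hypothetical, a map stands in for a real meter registry, and each timer is created lazily the first time a GET with that outcome completes (the issue does not say whether the hit and miss timers are created together).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical stand-in for a metrics timer: a count plus a total time in nanoseconds. */
final class SketchTimer {
  private final LongAdder count = new LongAdder();
  private final LongAdder totalNanos = new LongAdder();

  void record(long nanos) {
    count.increment();
    totalNanos.add(nanos);
  }

  long count() {
    return count.sum();
  }

  long totalNanos() {
    return totalNanos.sum();
  }
}

/** Hypothetical registry of 'geode.cache.gets' timers, keyed by region and result tags. */
final class GetTimersSketch {
  static final String METER_NAME = "geode.cache.gets";

  private final Map<String, SketchTimer> timers = new ConcurrentHashMap<>();

  private static String key(String region, String result) {
    return METER_NAME + "{region=" + region + ",result=" + result + "}";
  }

  /** Times one GET. The supplier returns null on a miss and may throw on error. */
  <V> V timeGet(String region, Supplier<V> get) {
    long start = System.nanoTime();
    V value = get.get(); // if this throws, no timer is created or updated
    long elapsed = System.nanoTime() - start;
    String result = (value != null) ? "hit" : "miss";
    timers.computeIfAbsent(key(region, result), k -> new SketchTimer()).record(elapsed);
    return value;
  }

  /** Returns the timer, or null if no GET with that outcome has completed yet. */
  SketchTimer find(String region, String result) {
    return timers.get(key(region, result));
  }

  /** Timers are only removed when the cache is closed. */
  void close() {
    timers.clear();
  }
}
```

With this shape, five successful gets of an existing key leave a result=hit timer with count 5, five gets of a missing key leave a result=miss timer with count 5, and a get whose loader throws leaves no timer for that region.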






[jira] [Created] (GEODE-7325) Ignore or tag internal function executions

2019-10-18 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7325:


 Summary: Ignore or tag internal function executions
 Key: GEODE-7325
 URL: https://issues.apache.org/jira/browse/GEODE-7325
 Project: Geode
  Issue Type: Bug
Reporter: Aaron Lindsey


Currently, the geode.function.executions timer records execution times for 
both internal functions (those extending InternalEntity or InternalFunction) 
and other functions. Typical users do not want to see metrics for these 
internal functions, as they are used to implement internal operations in Geode.

We should either ignore internal functions (i.e. don't create meters for them), 
or add a tag to identify which functions are internal functions so the user can 
filter them out.
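
A minimal sketch of the two options, in plain Java. Everything here is hypothetical and for illustration only: InternalMarker stands in for Geode's InternalEntity/InternalFunction types, SketchFunction is a stripped-down function contract, and the "internal" tag name is an assumption that does not come from this issue.

```java
import java.util.Optional;

/** Hypothetical marker standing in for Geode's InternalEntity/InternalFunction. */
interface InternalMarker {
}

/** Hypothetical minimal function contract; only the ID matters for this sketch. */
interface SketchFunction {
  String getId();
}

final class InternalFunctionPolicy {
  /** IGNORE creates no meter for internal functions; TAG marks every meter instead. */
  enum Mode {
    IGNORE, TAG
  }

  /**
   * Returns the meter key to use for the function, or empty if no meter should be
   * created for it under the given mode.
   */
  static Optional<String> meterKey(SketchFunction function, Mode mode) {
    boolean internal = function instanceof InternalMarker;
    if (internal && mode == Mode.IGNORE) {
      return Optional.empty();
    }
    StringBuilder key = new StringBuilder("geode.function.executions{function=");
    key.append(function.getId());
    if (mode == Mode.TAG) {
      // Assumed tag name; users could filter on it to hide internal functions.
      key.append(",internal=").append(internal);
    }
    return Optional.of(key.append('}').toString());
  }
}
```

Ignoring keeps the registry smaller, while tagging keeps the data available to anyone debugging Geode itself; the sketch leaves that choice to the caller.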





[jira] [Commented] (GEODE-6616) Flaky: AutoConnectionSourceDUnitTest > testClientDynamicallyDropsStoppedLocator FAILED

2019-10-11 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949845#comment-16949845
 ] 

Aaron Lindsey commented on GEODE-6616:
--

Failed again: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1158

{noformat}
org.apache.geode.cache.client.internal.AutoConnectionSourceDUnitTest > testClientDynamicallyDropsStoppedLocator FAILED
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in log4j at line 1894

java.net.ConnectException: Connection refused (Connection refused)
    at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
    at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
    at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
    at java.base/java.net.Socket.connect(Socket.java:591)
    at org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:898)
    at org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:839)
    at org.apache.geode.internal.net.SocketCreator.connect(SocketCreator.java:828)
    at org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:205)
    at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:202)
    at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:192)
    at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:274)
    at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:63)
    at org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:477)
    at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1304)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:276)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}



> Flaky: AutoConnectionSourceDUnitTest > 
> testClientDynamicallyDropsStoppedLocator FAILED
> --
>
> Key: GEODE-6616
> URL: https://issues.apache.org/jira/browse/GEODE-6616
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Mark Hanson
>Priority: Minor
>
> Failed connection..
> {noformat}
> [vm3] [info 2019/04/09 06:48:44.919 UTC  
> tid=0x20] Got result: EXCEPTION_OCCURRED
> [vm3] org.apache.geode.cache.client.ServerOperationException: remote server on 16f27a14ad79(255:loner):52816:5f2bdb00: : While performing a remote put
> [vm3] at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processAck(PutOp.java:389)
> [vm3] at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processResponse(PutOp.java:313)
> [vm3] at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.attemptReadResponse(PutOp.java:454)
> [vm3] at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:387)
> [vm3] at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:289)
> [vm3] at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:351)
> [vm3] at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:908)
> [vm3] at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:172)
> [vm3] at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:130)
> [vm3] at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:792)
> [vm3] at 
> 

[jira] [Updated] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-11 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7237:
-
Labels:   (was: observability)

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Updated] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-11 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7237:
-
Labels: observability  (was: )

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Kirk Lund
>Priority: Major
>  Labels: observability
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Assigned] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-11 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7237:


Assignee: Aaron Lindsey  (was: Kirk Lund)

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: observability
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Commented] (GEODE-7096) GemcachedBinaryClientJUnitTest > testExpiration timed out waiting for operation

2019-10-11 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949825#comment-16949825
 ] 

Aaron Lindsey commented on GEODE-7096:
--

Failed again: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/IntegrationTestOpenJDK8/builds/1143

> GemcachedBinaryClientJUnitTest > testExpiration timed out waiting for 
> operation
> ---
>
> Key: GEODE-7096
> URL: https://issues.apache.org/jira/browse/GEODE-7096
> Project: Geode
>  Issue Type: Bug
>Reporter: Bill Burcham
>Priority: Major
>
> In this CI build:
> https://concourse.gemfire-ci.info/teams/main/pipelines/gemfire-develop-main/jobs/WindowsIntegrationTestOpenJDK8/builds/668
> {code}
> org.apache.geode.memcached.GemcachedBinaryClientJUnitTest > testExpiration FAILED
> java.lang.RuntimeException: Timed out waiting for operation
> Caused by:
> net.spy.memcached.internal.CheckedOperationTimeoutException: Timed out waiting for operation - failing node: packer-5d4b73b4-8160-44b6-cb53-08c803700c64/10.0.0.64:28196
> {code}





[jira] [Commented] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-11 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949679#comment-16949679
 ] 

Aaron Lindsey commented on GEODE-7237:
--

Failed again: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1147

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Kirk Lund
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Reopened] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-11 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reopened GEODE-7237:
--

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Kirk Lund
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Commented] (GEODE-7189) CI Failure: ServerLauncherTest > startWaitsForStartupTasksToComplete failed

2019-10-11 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949659#comment-16949659
 ] 

Aaron Lindsey commented on GEODE-7189:
--

Failed again: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UnitTestOpenJDK11/builds/1158

> CI Failure: ServerLauncherTest > startWaitsForStartupTasksToComplete failed
> ---
>
> Key: GEODE-7189
> URL: https://issues.apache.org/jira/browse/GEODE-7189
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: Barrett Oglesby
>Priority: Major
>
> {noformat}
> org.apache.geode.distributed.ServerLauncherTest > startWaitsForStartupTasksToComplete FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.distributed.ServerLauncherTest that uses java.util.concurrent.CompletableFuture 
> Wanted but not invoked:
> completableFuture.thenRun();
> -> at org.apache.geode.distributed.ServerLauncherTest.lambda$startWaitsForStartupTasksToComplete$14(ServerLauncherTest.java:428)
> Actually, there were zero interactions with this mock.
>  within 300 seconds.
> at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:145)
> at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:122)
> at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
> at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:902)
> at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
> at org.apache.geode.distributed.ServerLauncherTest.startWaitsForStartupTasksToComplete(ServerLauncherTest.java:428)
> Caused by:
> Wanted but not invoked:
> completableFuture.thenRun();
> -> at org.apache.geode.distributed.ServerLauncherTest.lambda$startWaitsForStartupTasksToComplete$14(ServerLauncherTest.java:428)
> Actually, there were zero interactions with this mock.
> at org.apache.geode.distributed.ServerLauncherTest.lambda$startWaitsForStartupTasksToComplete$14(ServerLauncherTest.java:428)
> {noformat}
> UnitTestOpenJDK11 #943:
> https://concourse.gemfire-ci.info/teams/main/pipelines/gemfire-develop-main/jobs/UnitTestOpenJDK11/builds/943
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> gs://gemfire-test-artifacts/builds/gemfire-develop-main/9.10.0-build.0108/test-results/test/1568154432/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> gs://gemfire-test-artifacts/builds/gemfire-develop-main/9.10.0-build.0108/test-artifacts/1568154432/unittestfiles-OpenJDK11-9.10.0-build.0108.tgz





[jira] [Commented] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-10-10 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948844#comment-16948844
 ] 

Aaron Lindsey commented on GEODE-7237:
--

Failed again: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1146

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Kirk Lund
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Updated] (GEODE-7184) Add function execution timers

2019-10-08 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7184:
-
Summary: Add function execution timers  (was: Add function executions timer)

> Add function execution timers
> -
>
> Key: GEODE-7184
> URL: https://issues.apache.org/jira/browse/GEODE-7184
> Project: Geode
>  Issue Type: Improvement
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Developers often deploy their own functions to the system to apply the 
> decorator pattern to caching, adding information to specific key/value pairs. 
> In doing so, they can introduce bottlenecks where server-side functions cause 
> issues or make things slower than intended. We want a way for users to view 
> the functions that they create and see what the average execution time 
> looks like.
>  * *Meter Type*: Timer
>  * *Name*: geode.function.executions
>  * *Description*: TBD
>  * *Tags*: , function (getId on function, if DNE present 
> getClass.getname of deployed function), succeeded (true/false)
> h3. Acceptance Criteria
> *Meter creation/deletion*: Create meter on function execution
> *Measurement*: On an individual server, start the timer when a *USER* 
> function is invoked/executed, and stop the timer when the user function 
> completes OR errors. If it throws a function execution exception or any 
> other error, the timer is tagged succeeded=false.
> Details on Functions and their execution: 
> [https://geode.apache.org/docs/guide/110/developing/function_exec/function_execution.html]
> h3. Scenarios
> *Scenario: The timers are created when the function is first executed*
> Given a user executed a function with ID functionToTime on a cluster with 1 
> locator/1 server
> And functionToTime has not been executed previously
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count > 1
> - totalTime >= 5,000,000,000ns
> And the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = false
> - count = 0
> - totalTime = 0
> *Scenario: Successful singular function execution (registered execution)*
> Given a user registers a function with ID functionToTime (that waits for 5 
> seconds) on a cluster with 1 locator/1 server
> When functionToTime is triggered using gfsh command: "execute function 
> --id=functionToTime"
> And the function completes without error
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Successful singular function execution (unregistered execution)*
> Given an unregistered function with ID functionToTime (that waits for 5 
> seconds) exists 
> When triggered on a client using  
> "FunctionService.onServers(cache).execute(new FunctionToTime())"
> And the function completes without error
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Singular function execution with Any Exception*
> Given an unregistered function with ID functionToTime (that waits for 5 
> seconds) exists 
> When triggered on a client using  
> "FunctionService.onServers(cache).execute(new FunctionToTime())"
> And the function exits with any exception after running for 5 seconds
> Then the server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = false
> - count = 1
> - totalTime >= 5,000,000,000ns
> *Scenario: Function execution onRegion multi-server*
> Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
> And a region called RR1 that is a replicate region
> When a function execution is triggered against that replicate region using  
> "FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
> Then one server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 1
> - totalTime >= 5,000,000,000ns
> And the other server has the following timer:
> - name: geode.function.executions
> - tag: id = functionToTime
> - tag: succeeded = true
> - count = 0
> - totalTime = 0
> *Scenario: Function execution onRegion with partition region multiple times*
> Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
> And a partition region called PR1 that only exists on S1
> When a function execution is triggered 10 times against that partition region 
> using  

[jira] [Updated] (GEODE-7184) Add function executions timer

2019-10-08 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7184:
-
Description: 
Developers often deploy their own functions to the system to apply the 
decorator pattern to caching, adding information to specific key/value pairs. 
In doing so, they can introduce bottlenecks where server-side functions cause 
issues or make things slower than intended. We want a way for users to view 
the functions that they create and see what the average execution time looks 
like.
 * *Meter Type*: Timer
 * *Name*: geode.function.executions
 * *Description*: TBD
 * *Tags*: , function (getId on function, if DNE present 
getClass.getname of deployed function), succeeded (true/false)

h3. Acceptance Criteria

*Meter creation/deletion*: Create meter on function execution
*Measurement*: On an individual server, start the timer when a *USER* function 
is invoked/executed, and stop the timer when the user function completes OR 
errors. If it throws a function execution exception or any other error, the 
timer is tagged succeeded=false.

Details on Functions and their execution: 
[https://geode.apache.org/docs/guide/110/developing/function_exec/function_execution.html]
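
The measurement rule above can be sketched in plain Java. This is an illustration under stated assumptions, not Geode's implementation: FunctionTimersSketch and its nested Stats class are hypothetical, a map stands in for a real meter registry, and the tag names "function" and "succeeded" follow the Tags line of this description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical registry of 'geode.function.executions' timers. */
final class FunctionTimersSketch {
  static final String METER_NAME = "geode.function.executions";

  /** Count and total time for one (function, succeeded) tag combination. */
  static final class Stats {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  private final Map<String, Stats> timers = new ConcurrentHashMap<>();

  private static String key(String functionId, boolean succeeded) {
    return METER_NAME + "{function=" + functionId + ",succeeded=" + succeeded + "}";
  }

  private Stats timer(String functionId, boolean succeeded) {
    return timers.computeIfAbsent(key(functionId, succeeded), k -> new Stats());
  }

  /** Runs the function body, recording its duration under succeeded=true or false. */
  <T> T time(String functionId, Supplier<T> body) {
    // Create both timers on first execution so the unused one reports count = 0.
    timer(functionId, true);
    timer(functionId, false);
    long start = System.nanoTime();
    boolean succeeded = false;
    try {
      T result = body.get();
      succeeded = true;
      return result;
    } finally {
      // Runs on both normal completion and on any thrown exception.
      Stats stats = timer(functionId, succeeded);
      stats.count.increment();
      stats.totalNanos.add(System.nanoTime() - start);
    }
  }

  boolean exists(String functionId, boolean succeeded) {
    return timers.containsKey(key(functionId, succeeded));
  }

  long count(String functionId, boolean succeeded) {
    Stats stats = timers.get(key(functionId, succeeded));
    return stats == null ? 0 : stats.count.sum();
  }
}
```

Recording in the finally block means the exception still propagates to the caller, so timing does not change how a failing function behaves.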
h3. Scenarios

*Scenario: The timers are created when the function is first executed*
Given a user executed a function with ID functionToTime on a cluster with 1 
locator/1 server
And functionToTime has not been executed previously
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count > 1
- totalTime >= 5,000,000,000ns

And the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 0
- totalTime = 0

*Scenario: Successful singular function execution (registered execution)*
Given a user registers a function with ID functionToTime (that waits for 5 
seconds) on a cluster with 1 locator/1 server
When functionToTime is triggered using gfsh command: "execute function 
--id=functionToTime"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Successful singular function execution (unregistered execution)*
Given an unregistered function with ID functionToTime (that waits for 5 
seconds) exists 
When triggered on a client using  "FunctionService.onServers(cache).execute(new 
FunctionToTime())"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Singular function execution with Any Exception*
Given an unregistered function with ID functionToTime (that waits for 5 
seconds) exists 
When triggered on a client using  "FunctionService.onServers(cache).execute(new 
FunctionToTime())"
And the function exits with any exception after running for 5 seconds
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Function execution onRegion multi-server*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a region called RR1 that is a replicate region
When a function execution is triggered against that replicate region using  
"FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
Then one server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

And the other server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0
- totalTime = 0

*Scenario: Function execution onRegion with partition region multiple times*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a partition region called PR1 that only exists on S1
When a function execution is triggered 10 times against that partition region 
using  "FunctionService.onRegion(regionPR1).execute(new FunctionToTime())"
Then S1 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 10

And S2 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0

*Scenario: Function execution onRegion with replicate region multiple times*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a replicate region called RR1 exists
When a function execution is triggered 10 times against that 

[jira] [Updated] (GEODE-7184) Add function executions timer

2019-10-08 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-7184:
-
Description: 
Developers often deploy their own functions to the system, for example to 
apply a decorator pattern for caching that adds information to specific 
key/value pairs. In doing so, they can introduce bottlenecks: server-side 
functions can cause issues or make things slower than intended. We want a way 
for users to view the functions they create and see what the average 
execution time looks like.
 * *Meter Type*: Timer
 * *Name*: geode.function.executions
 * *Description*: TBD
 * *Tags*: , function (the function's getId(); if that does not exist, the 
getClass().getName() of the deployed function), succeeded (true/false)
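
The fallback rule for the function tag can be sketched in plain Java as 
follows. This is only an illustration under stated assumptions: TimedFunction 
is a hypothetical stand-in interface (not Geode's Function interface), and 
treating a null or empty ID as "does not exist" is an assumption.

```java
public class FunctionTagSketch {
  /** Hypothetical stand-in for a deployed function; not Geode's Function interface. */
  interface TimedFunction {
    String getId(); // assumed to return null (or "") when no ID was assigned
  }

  /** Returns the function's ID if present, otherwise the function's class name. */
  static String functionTag(TimedFunction function) {
    String id = function.getId();
    return (id != null && !id.isEmpty()) ? id : function.getClass().getName();
  }

  /** Example function that reports an explicit ID. */
  static class WithId implements TimedFunction {
    public String getId() {
      return "functionToTime";
    }
  }

  /** Example function without an ID, so the class name is used instead. */
  static class WithoutId implements TimedFunction {
    public String getId() {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(functionTag(new WithId()));
    System.out.println(functionTag(new WithoutId()));
  }
}
```

The first call yields the explicit ID; the second falls back to the class name 
of the deployed function.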

h3. Acceptance Criteria

*Meter creation/deletion*: Create meter on function execution
*Measurement*: On an individual server, start the timer when a *USER* function 
is invoked/executed, and stop the timer when the user function completes OR 
errors. If it throws a function execution exception or any other error, the 
timer is tagged succeeded=false.
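
A minimal sketch of the measurement rule above, using only the Java standard 
library. It is not the Geode or Micrometer implementation: FunctionTimerSketch, 
its Timer class, and the execute helper are hypothetical names, and creating 
both the succeeded=true and succeeded=false timers on first execution is an 
assumption drawn from the first scenario below.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class FunctionTimerSketch {
  /** Count and total time for one (id, succeeded) tag combination. */
  static final class Timer {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  private final Map<String, Timer> timers = new ConcurrentHashMap<>();

  /** Looks up, or lazily creates, the timer for the given tag values. */
  Timer timer(String id, boolean succeeded) {
    return timers.computeIfAbsent(id + "|" + succeeded, key -> new Timer());
  }

  /** Runs the function body and records its duration under succeeded=true or false. */
  void execute(String id, Runnable body) {
    // Create both timers up front so the unused one reads count = 0, totalTime = 0.
    timer(id, true);
    timer(id, false);
    long start = System.nanoTime();
    boolean succeeded = false;
    try {
      body.run();
      succeeded = true;
    } finally {
      // Stop the timer whether the body completed or threw.
      Timer t = timer(id, succeeded);
      t.count.increment();
      t.totalNanos.add(System.nanoTime() - start);
    }
  }

  public static void main(String[] args) {
    FunctionTimerSketch registry = new FunctionTimerSketch();
    registry.execute("functionToTime", () -> {});
    try {
      registry.execute("functionToTime", () -> {
        throw new IllegalStateException("simulated failure");
      });
    } catch (IllegalStateException expected) {
      // The exception still reaches the caller after the timer is updated.
    }
    System.out.println(registry.timer("functionToTime", true).count.sum());
    System.out.println(registry.timer("functionToTime", false).count.sum());
  }
}
```

Recording in a finally block keeps the success and failure paths symmetric: 
the elapsed time is captured either way, and only the tag value differs.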

Details on Functions and their execution: 
[https://gemfire.docs.pivotal.io/97/geode/developing/function_exec/function_execution.html]
h3. Scenarios

*Scenario: The timers are created when the function is first executed*
Given a user executed a function with ID functionToTime on a cluster with 1 
locator/1 server
And functionToTime has not been executed previously
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count >= 1
- totalTime >= 5,000,000,000ns

And the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 0
- totalTime = 0

*Scenario: Successful singular function execution (registered execution)*
Given a user registers a function with ID functionToTime (that waits for 5 
seconds) on a cluster with 1 locator/1 server
When functionToTime is triggered using gfsh command: "execute function 
--id=functionToTime"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Successful singular function execution (unregistered execution)*
Given an unregistered function with ID functionToTime (that waits for 5 
seconds) exists 
When triggered on a client using  "FunctionService.onServers(cache).execute(new 
FunctionToTime())"
And the function completes without error
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Singular function execution with Any Exception*
Given an unregistered function with ID functionToTime (that waits for 5 
seconds) exists 
When triggered on a client using  "FunctionService.onServers(cache).execute(new 
FunctionToTime())"
And the function exits with an exception (of any type) after running for 5 seconds
Then the server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = false
- count = 1
- totalTime >= 5,000,000,000ns

*Scenario: Function execution onRegion multi-server*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a region called RR1 that is a replicate region
When a function execution is triggered against that replicate region using  
"FunctionService.onRegion(regionRR1).execute(new FunctionToTime())"
Then one server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 1
- totalTime >= 5,000,000,000ns

And the other server has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0
- totalTime = 0

*Scenario: Function execution onRegion with partition region multiple times*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a partition region called PR1 that only exists on S1
When a function execution is triggered 10 times against that partition region 
using  "FunctionService.onRegion(regionPR1).execute(new FunctionToTime())"
Then S1 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 10

And S2 has the following timer:
- name: geode.function.executions
- tag: id = functionToTime
- tag: succeeded = true
- count = 0

*Scenario: Function execution onRegion with replicate region multiple times*
Given a cluster with 1 locator (named L1) as well as 2 servers (named S1,S2)
And a replicate region called RR1 exists
When a function execution is triggered 10 times against that 

[jira] [Commented] (GEODE-6070) CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll

2019-10-02 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942872#comment-16942872
 ] 

Aaron Lindsey commented on GEODE-6070:
--

I looked at this yesterday. [~upthewaterspout]'s idea seems plausible, but 
there is something I don't understand: the code in ShutdownFunction.execute 
seems to _wait_ until the cache is shut down and only _then_ send the reply. 
So it seems that the shutdown would _always_ happen before the reply. Given 
that, I don't understand why this test doesn't fail every time. 

> CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll
> --
>
> Key: GEODE-6070
> URL: https://issues.apache.org/jira/browse/GEODE-6070
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Priority: Major
>  Labels: GeodeCommons
>
> Failed with stacktrace:
> {noformat}
> org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest
>  > testShutdownAll FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 302
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Distribution manager on 172.17.0.3(server-1:496):41002 started at Thu Nov 
> 15 19:47:58 UTC 2018: Message distribution has terminated
> {noformat}
> Test results can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-results/distributedTest/1542315851/classes/org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest.html#testShutdownAll
>  
> Test Artifacts can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-artifacts/1542315851/distributedtestfiles-OpenJDK8-1.9.0-build.158.tgz



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (GEODE-6070) CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll

2019-10-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-6070:


Assignee: (was: Aaron Lindsey)

> CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll
> --
>
> Key: GEODE-6070
> URL: https://issues.apache.org/jira/browse/GEODE-6070
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Priority: Major
>  Labels: GeodeCommons
>
> Failed with stacktrace:
> {noformat}
> org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest
>  > testShutdownAll FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 302
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Distribution manager on 172.17.0.3(server-1:496):41002 started at Thu Nov 
> 15 19:47:58 UTC 2018: Message distribution has terminated
> {noformat}
> Test results can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-results/distributedTest/1542315851/classes/org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest.html#testShutdownAll
>  
> Test Artifacts can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-artifacts/1542315851/distributedtestfiles-OpenJDK8-1.9.0-build.158.tgz





[jira] [Assigned] (GEODE-6070) CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll

2019-10-01 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-6070:


Assignee: Aaron Lindsey

> CI Failure: ShutdownCommandOverHttpDUnitTest > testShutdownAll
> --
>
> Key: GEODE-6070
> URL: https://issues.apache.org/jira/browse/GEODE-6070
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: GeodeCommons
>
> Failed with stacktrace:
> {noformat}
> org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest
>  > testShutdownAll FAILED
> java.lang.AssertionError: Suspicious strings were written to the log 
> during this run.
> Fix the strings or use IgnoredException.addIgnoredException to ignore.
> ---
> Found suspect string in log4j at line 302
> org.apache.geode.distributed.DistributedSystemDisconnectedException: 
> Distribution manager on 172.17.0.3(server-1:496):41002 started at Thu Nov 
> 15 19:47:58 UTC 2018: Message distribution has terminated
> {noformat}
> Test results can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-results/distributedTest/1542315851/classes/org.apache.geode.management.internal.cli.commands.ShutdownCommandOverHttpDUnitTest.html#testShutdownAll
>  
> Test Artifacts can be found here:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.9.0-build.158/test-artifacts/1542315851/distributedtestfiles-OpenJDK8-1.9.0-build.158.tgz





[jira] [Commented] (GEODE-6183) CI Failure: LocatorLauncherRemoteFileIntegrationTest.startDeletesStaleControlFiles failed with ConditionTimeoutException

2019-09-24 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937218#comment-16937218
 ] 

Aaron Lindsey commented on GEODE-6183:
--

Failed again: 
[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/WindowsCoreIntegrationTestOpenJDK11/builds/349]

> CI Failure: 
> LocatorLauncherRemoteFileIntegrationTest.startDeletesStaleControlFiles failed 
> with ConditionTimeoutException
> 
>
> Key: GEODE-6183
> URL: https://issues.apache.org/jira/browse/GEODE-6183
> Project: Geode
>  Issue Type: Bug
>  Components: ci, gfsh
>Reporter: Eric Shu
>Assignee: Kirk Lund
>Priority: Major
> Fix For: 1.10.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Test failed in 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/IntegrationTestOpenJDK8/builds/223
> org.apache.geode.distributed.LocatorLauncherRemoteFileIntegrationTest > 
> startDeletesStaleControlFiles FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.distributed.LocatorLauncherRemoteIntegrationTestCase that 
> uses org.apache.geode.distributed.LocatorLauncher expected:<[online]> but 
> was:<[not responding]> within 300 seconds.
> Caused by:
> org.junit.ComparisonFailure: expected:<[online]> but was:<[not 
> responding]>





[jira] [Commented] (GEODE-7026) LocatorLauncherRemoteFileIntegrationTest > startDeletesStaleControlFiles fails on Windows

2019-09-24 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937214#comment-16937214
 ] 

Aaron Lindsey commented on GEODE-7026:
--

Failed again: 
[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/WindowsCoreIntegrationTestOpenJDK11/builds/349]

> LocatorLauncherRemoteFileIntegrationTest > startDeletesStaleControlFiles 
> fails on Windows
> -
>
> Key: GEODE-7026
> URL: https://issues.apache.org/jira/browse/GEODE-7026
> Project: Geode
>  Issue Type: Bug
>  Components: tests
>Reporter: Donal Evans
>Priority: Major
>  Labels: IntegrationTest, Windows, flaky
>
> CI build: 
> https://concourse.gemfire-ci.info/teams/main/pipelines/gemfire-develop-main/jobs/WindowsCoreIntegrationTestOpenJDK11/builds/159
> org.apache.geode.distributed.LocatorLauncherRemoteFileIntegrationTest > 
> startDeletesStaleControlFiles FAILED
>  org.awaitility.core.ConditionTimeoutException: Assertion condition defined 
> as a lambda expression in 
> org.apache.geode.distributed.LocatorLauncherRemoteIntegrationTestCase 
> expected:<[online]> but was:<[not responding]> within 300 seconds.
>  
>  Caused by:
>  org.junit.ComparisonFailure: expected:<[online]> but was:<[not responding]>
>  





[jira] [Assigned] (GEODE-6601) CI Failure: Timeout in LuceneIndexDestroyDUnitTest verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK)

2019-09-24 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-6601:


Assignee: (was: xiaojian zhou)

> CI Failure: Timeout in LuceneIndexDestroyDUnitTest 
> verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK)
> 
>
> Key: GEODE-6601
> URL: https://issues.apache.org/jira/browse/GEODE-6601
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Priority: Major
>
> DistributedTestJDK11 failed due to timeout with a hang in 
> org.apache.geode.cache.lucene.LuceneIndexDestroyDUnitTest 
> verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK).
> CI Failure here: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/579
> Test results here: 
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0137/test-results/distributedTest/1554409876/
> Test artifacts here: 
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0137/test-artifacts/1554409876/distributedtestfiles-OpenJDK11-1.10.0-SNAPSHOT.0137.tgz





[jira] [Commented] (GEODE-6601) CI Failure: Timeout in LuceneIndexDestroyDUnitTest verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK)

2019-09-24 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937143#comment-16937143
 ] 

Aaron Lindsey commented on GEODE-6601:
--

I saw tests in LuceneIndexDestroyDUnitTest fail again today due to OOM errors. 
The SHA was dffcb9446aef09c7bf6e626121f4d2ec5c74586f.

> CI Failure: Timeout in LuceneIndexDestroyDUnitTest 
> verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK)
> 
>
> Key: GEODE-6601
> URL: https://issues.apache.org/jira/browse/GEODE-6601
> Project: Geode
>  Issue Type: Bug
>Reporter: Helena Bales
>Assignee: xiaojian zhou
>Priority: Major
>
> DistributedTestJDK11 failed due to timeout with a hang in 
> org.apache.geode.cache.lucene.LuceneIndexDestroyDUnitTest 
> verifyDestroyAllIndexesWhileDoingPuts(PARTITION_OVERFLOW_TO_DISK).
> CI Failure here: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/579
> Test results here: 
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0137/test-results/distributedTest/1554409876/
> Test artifacts here: 
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0137/test-artifacts/1554409876/distributedtestfiles-OpenJDK11-1.10.0-SNAPSHOT.0137.tgz





[jira] [Commented] (GEODE-5407) CI failure: JMXMBeanReconnectDUnitTest.testRemoteBeanKnowledge_MaintainServerAndCrashLocator and testLocalBeans_MaintainServerAndCrashLocator

2019-09-24 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937136#comment-16937136
 ] 

Aaron Lindsey commented on GEODE-5407:
--

Failed again: 
[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/1121]

> CI failure: 
> JMXMBeanReconnectDUnitTest.testRemoteBeanKnowledge_MaintainServerAndCrashLocator
>  and testLocalBeans_MaintainServerAndCrashLocator
> -
>
> Key: GEODE-5407
> URL: https://issues.apache.org/jira/browse/GEODE-5407
> Project: Geode
>  Issue Type: Bug
>Reporter: Jinmei Liao
>Priority: Major
>  Labels: flaky, pull-request-available, swat
> Attachments: Test results - Class 
> org.apache.geode.management.JMXMBeanReconnectDUnitTest.html
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> CI build: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/103
> org.apache.geode.management.JMXMBeanReconnectDUnitTest > 
> testRemoteBeanKnowledge_MaintainServerAndCrashLocator FAILED
>  org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.rules.MemberVM$$Lambda$73/2140274979.run in VM 0 
> running on Host 640ab3da6905 with 4 VMs
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:348)
>  at 
> org.apache.geode.test.dunit.rules.MemberVM.waitTilLocatorFullyReconnected(MemberVM.java:113)
>  at 
> org.apache.geode.management.JMXMBeanReconnectDUnitTest.testRemoteBeanKnowledge_MaintainServerAndCrashLocator(JMXMBeanReconnectDUnitTest.java:161)
>  
>  Caused by:
>  org.awaitility.core.ConditionTimeoutException: Condition with 
> org.apache.geode.test.dunit.rules.MemberVM was not fulfilled within 30 
> seconds.
>  
> org.apache.geode.management.JMXMBeanReconnectDUnitTest > 
> testLocalBeans_MaintainServerAndCrashLocator FAILED
>  org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.test.dunit.rules.MemberVM$$Lambda$73/2140274979.run in VM 0 
> running on Host 640ab3da6905 with 4 VMs
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:348)
>  at 
> org.apache.geode.test.dunit.rules.MemberVM.waitTilLocatorFullyReconnected(MemberVM.java:113)
>  at 
> org.apache.geode.management.JMXMBeanReconnectDUnitTest.testLocalBeans_MaintainServerAndCrashLocator(JMXMBeanReconnectDUnitTest.java:112)
>  
>  Caused by:
>  org.awaitility.core.ConditionTimeoutException: Condition with 
> org.apache.geode.test.dunit.rules.MemberVM was not fulfilled within 30 
> seconds.
>  





[jira] [Assigned] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-09-24 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7237:


Assignee: Kirk Lund  (was: Aaron Lindsey)

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Kirk Lund
>Priority: Major
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest
>  > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at 
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Assigned] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-09-24 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-7237:


Assignee: Aaron Lindsey

> CI failure: ConnectCommandAcceptanceTest.invalidHostname
> 
>
> Key: GEODE-7237
> URL: https://issues.apache.org/jira/browse/GEODE-7237
> Project: Geode
>  Issue Type: Bug
>  Components: ci, tests
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>
> [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
> {code:java}
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest
>  > invalidHostname FAILED
> java.lang.AssertionError: 
> Expecting:
>  <"">
> to contain:
>  <"can't be reached. Hostname or IP address could not be found."> 
> at 
> org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
> {code}





[jira] [Created] (GEODE-7237) CI failure: ConnectCommandAcceptanceTest.invalidHostname

2019-09-24 Thread Aaron Lindsey (Jira)
Aaron Lindsey created GEODE-7237:


 Summary: CI failure: ConnectCommandAcceptanceTest.invalidHostname
 Key: GEODE-7237
 URL: https://issues.apache.org/jira/browse/GEODE-7237
 Project: Geode
  Issue Type: Bug
  Components: ci, tests
Reporter: Aaron Lindsey


[https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK8/builds/1101]
{code:java}
org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest 
> invalidHostname FAILED
java.lang.AssertionError: 
Expecting:
 <"">
to contain:
 <"can't be reached. Hostname or IP address could not be found."> 
at 
org.apache.geode.management.internal.cli.commands.ConnectCommandAcceptanceTest.invalidHostname(ConnectCommandAcceptanceTest.java:59)
{code}





[jira] [Commented] (GEODE-6903) CI Failure: GemFireTransactionDataSourceIntegrationTest.testExceptionHandlingGetConnection failed with Assertion

2019-09-24 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937014#comment-16937014
 ] 

Aaron Lindsey commented on GEODE-6903:
--

Failed again today with SHA 802687154131af16d350bf39d152feb4685ba7e6
{code:java}
org.apache.geode.internal.datasource.GemFireTransactionDataSourceIntegrationTest
 > testExceptionHandlingGetConnection FAILED
 org.junit.ComparisonFailure: expected:<[0]> but was:<[2]>
 at sun.reflect.GeneratedConstructorAccessor26.newInstance(Unknown Source)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at 
org.apache.geode.internal.datasource.GemFireTransactionDataSourceIntegrationTest.testExceptionHandlingGetConnection(GemFireTransactionDataSourceIntegrationTest.java:149){code}

> CI Failure: 
> GemFireTransactionDataSourceIntegrationTest.testExceptionHandlingGetConnection
>  failed with Assertion
> 
>
> Key: GEODE-6903
> URL: https://issues.apache.org/jira/browse/GEODE-6903
> Project: Geode
>  Issue Type: Bug
>  Components: transactions
>Reporter: Eric Shu
>Assignee: Mark Hanson
>Priority: Major
>
> org.apache.geode.internal.datasource.GemFireTransactionDataSourceIntegrationTest
>  > testExceptionHandlingGetConnection FAILED
> org.junit.ComparisonFailure: expected:<[0]> but was:<[2]>
> at sun.reflect.GeneratedConstructorAccessor26.newInstance(Unknown 
> Source)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at 
> org.apache.geode.internal.datasource.GemFireTransactionDataSourceIntegrationTest.testExceptionHandlingGetConnection(GemFireTransactionDataSourceIntegrationTest.java:141)
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results URI 
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0399/test-results/integrationTest/1561170841/
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Test report artifacts from this job are available at:
> http://files.apachegeode-ci.info/builds/apache-develop-main/1.10.0-SNAPSHOT.0399/test-artifacts/1561170841/integrationtestfiles-OpenJDK8-1.10.0-SNAPSHOT.0399.tgz





[jira] [Commented] (GEODE-7224) Add metrics for FunctionExecution

2019-09-20 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934506#comment-16934506
 ] 

Aaron Lindsey commented on GEODE-7224:
--

[~mhansonp] Is this a duplicate of GEODE-7184?

> Add metrics for FunctionExecution
> -
>
> Key: GEODE-7224
> URL: https://issues.apache.org/jira/browse/GEODE-7224
> Project: Geode
>  Issue Type: Improvement
>  Components: statistics
>Reporter: Mark Hanson
>Priority: Major
>
> Add metrics so that FunctionExecutions information is available to micrometer.





[jira] [Commented] (GEODE-7177) Move membership's logging dependencies to its own module

2019-09-11 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927705#comment-16927705
 ] 

Aaron Lindsey commented on GEODE-7177:
--

There is an open PR to move classes that depend on log4j-core to a submodule: 
[https://github.com/apache/geode/pull/4003]. That might be relevant to this 
issue. You may want to coordinate with [~klund] as he is in the process of 
refactoring a large amount of logging code.

> Move membership's logging dependencies to its own module
> 
>
> Key: GEODE-7177
> URL: https://issues.apache.org/jira/browse/GEODE-7177
> Project: Geode
>  Issue Type: Improvement
>  Components: logging, membership
>Reporter: Ryan McMahon
>Assignee: Ryan McMahon
>Priority: Major
>
> As part of eliminating membership's dependency on geode-core, we want to move 
> LogService and some other supporting classes to its own module which 
> membership can depend on.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (GEODE-5790) gfsh command alter runtime --log-level has no effect

2019-09-10 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927079#comment-16927079
 ] 

Aaron Lindsey commented on GEODE-5790:
--

It looks like there is a DUnit test that uses this option: 
{{geode/geode-web/src/distributedTest/java/org/apache/geode/management/internal/cli/commands/AlterRuntimeCommandDUnitTest.java}}

> gfsh command alter runtime --log-level has no effect
> 
>
> Key: GEODE-5790
> URL: https://issues.apache.org/jira/browse/GEODE-5790
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>
> The gfsh command alter runtime --log-level has no effect. It looks to me like 
> this option was never implemented in the code.





[jira] [Updated] (GEODE-5790) gfsh command alter runtime --log-level has no effect

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-5790:
-
Component/s: (was: logging)

> gfsh command alter runtime --log-level has no effect
> 
>
> Key: GEODE-5790
> URL: https://issues.apache.org/jira/browse/GEODE-5790
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>
> The gfsh command alter runtime --log-level has no effect. It looks to me like 
> this option was never implemented in the code.





[jira] [Updated] (GEODE-5785) Add file permissions support to LogWriter appender

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-5785:
-
Labels: observability  (was: )

> Add file permissions support to LogWriter appender
> --
>
> Key: GEODE-5785
> URL: https://issues.apache.org/jira/browse/GEODE-5785
> Project: Geode
>  Issue Type: New Feature
>  Components: logging
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Minor
>  Labels: observability
>
> Log4J2 FileAppender supports specifying file permissions during 
> configuration. It may be valuable to provide similar support for LogWriter 
> appender, especially the SecurityLogWriter.





[jira] [Closed] (GEODE-5692) ClassNotFound error with fine level loggging

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey closed GEODE-5692.


> ClassNotFound error with fine level loggging
> 
>
> Key: GEODE-5692
> URL: https://issues.apache.org/jira/browse/GEODE-5692
> Project: Geode
>  Issue Type: Bug
>  Components: logging
>Reporter: Gregory Green
>Priority: Major
>
> When performing a CacheTransactionManager.commit() from client-side code 
> with fine-level logging enabled on the server, I observed a 
> ClassNotFoundException on the server side.
>  





[jira] [Resolved] (GEODE-5692) ClassNotFound error with fine level loggging

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey resolved GEODE-5692.
--
Resolution: Incomplete

Closing because there is no way to assign this back to the reporter for further 
details. Please open a separate ticket with a stack trace and the Geode version number.

> ClassNotFound error with fine level loggging
> 
>
> Key: GEODE-5692
> URL: https://issues.apache.org/jira/browse/GEODE-5692
> Project: Geode
>  Issue Type: Bug
>  Components: logging
>Reporter: Gregory Green
>Priority: Major
>
> When performing a CacheTransactionManager.commit() from client-side code 
> with fine-level logging enabled on the server, I observed a 
> ClassNotFoundException on the server side.
>  





[jira] [Commented] (GEODE-4837) Remove unneeded JNA code

2019-09-10 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927070#comment-16927070
 ] 

Aaron Lindsey commented on GEODE-4837:
--

[~jens.deppe], could you specify which classes contain the unused code and 
address Alberto's question?

> Remove unneeded JNA code
> 
>
> Key: GEODE-4837
> URL: https://issues.apache.org/jira/browse/GEODE-4837
> Project: Geode
>  Issue Type: Improvement
>  Components: core, gfsh, statistics
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>
> We have quite a bit of JNA code but it seems that a lot of it is unused now. 
> Let's clean that up so we only keep what's actually being run.





[jira] [Updated] (GEODE-4837) Remove unneeded JNA code

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey updated GEODE-4837:
-
Component/s: (was: statistics)

> Remove unneeded JNA code
> 
>
> Key: GEODE-4837
> URL: https://issues.apache.org/jira/browse/GEODE-4837
> Project: Geode
>  Issue Type: Improvement
>  Components: core, gfsh
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>
> We have quite a bit of JNA code but it seems that a lot of it is unused now. 
> Let's clean that up so we only keep what's actually being run.





[jira] [Assigned] (GEODE-4837) Remove unneeded JNA code

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-4837:


Assignee: Jens Deppe  (was: Aaron Lindsey)

> Remove unneeded JNA code
> 
>
> Key: GEODE-4837
> URL: https://issues.apache.org/jira/browse/GEODE-4837
> Project: Geode
>  Issue Type: Improvement
>  Components: core, gfsh, statistics
>Reporter: Jens Deppe
>Assignee: Jens Deppe
>Priority: Major
>
> We have quite a bit of JNA code but it seems that a lot of it is unused now. 
> Let's clean that up so we only keep what's actually being run.





[jira] [Assigned] (GEODE-4837) Remove unneeded JNA code

2019-09-10 Thread Aaron Lindsey (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Lindsey reassigned GEODE-4837:


Assignee: Aaron Lindsey

> Remove unneeded JNA code
> 
>
> Key: GEODE-4837
> URL: https://issues.apache.org/jira/browse/GEODE-4837
> Project: Geode
>  Issue Type: Improvement
>  Components: core, gfsh, statistics
>Reporter: Jens Deppe
>Assignee: Aaron Lindsey
>Priority: Major
>
> We have quite a bit of JNA code but it seems that a lot of it is unused now. 
> Let's clean that up so we only keep what's actually being run.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

