[jira] [Updated] (GEODE-5522) Upgrade to Gradle 4.9
[ https://issues.apache.org/jira/browse/GEODE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5522: -- Labels: pull-request-available (was: ) > Upgrade to Gradle 4.9 > - > > Key: GEODE-5522 > URL: https://issues.apache.org/jira/browse/GEODE-5522 > Project: Geode > Issue Type: Improvement > Components: build >Reporter: Jacob S. Barrett >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GEODE-5518) some records in the region are not fetched when executing fetch query
[ https://issues.apache.org/jira/browse/GEODE-5518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567802#comment-16567802 ] Nilkanth Patel commented on GEODE-5518:
---
Since you said putAll with 1 record works fine, one possible cause is that you are hitting duplicate records while building the map for putAll. Even though you make a batch of 1000, a record with a duplicate map key overwrites the existing record, so the effective ingestion into the region is smaller. Please check whether you are hitting this case.
> some records in the region are not fetched when executing fetch query
> -
>
> Key: GEODE-5518
> URL: https://issues.apache.org/jira/browse/GEODE-5518
> Project: Geode
> Issue Type: Bug
> Components: core, querying
> Reporter: yossi reginiano
> Priority: Major
>
> Hi all,
> We are using Geode 1.4 and facing the following: we have started to adopt the putAll function, which accepts a bulk of records and persists them into the region.
> We have noticed that the process that fetches records from the region (executing a fetch command in bulks of 1000) from time to time misses a record or two, which causes these records to be left in the region as "zombies", because the current index is now greater than those records' indexes.
> This started to happen only when we began using putAll; prior to that we did not face any such issue. Also, when we use putAll with only 1 record at a time it works fine.
> Has anybody faced this? Is there some constraint on the number of records that can be sent to the putAll function?
> Thanks in advance
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
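The duplicate-key scenario described in the comment above can be reproduced with plain java.util.Map semantics (illustrative sketch only; no Geode API involved, and the record keys and values are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Shows why a putAll batch built from 1000 input records can insert fewer
// than 1000 entries: a repeated map key silently overwrites the earlier
// record before region.putAll(batch) ever runs.
public class DuplicateKeyBatch {
  public static Map<String, String> buildBatch(String[][] records) {
    Map<String, String> batch = new HashMap<>();
    for (String[] record : records) {
      batch.put(record[0], record[1]); // duplicate key -> earlier record lost
    }
    return batch;
  }

  public static void main(String[] args) {
    String[][] records = {{"k1", "v1"}, {"k2", "v2"}, {"k1", "v1-dup"}};
    // 3 input records collapse to 2 map entries, so only 2 would be ingested.
    System.out.println(buildBatch(records).size()); // prints 2
  }
}
```

Counting the batch size before calling putAll, and comparing it to the number of input records, is one quick way to confirm whether this case is being hit.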
[jira] [Created] (GEODE-5522) Upgrade to Gradle 4.9
Jacob S. Barrett created GEODE-5522: --- Summary: Upgrade to Gradle 4.9 Key: GEODE-5522 URL: https://issues.apache.org/jira/browse/GEODE-5522 Project: Geode Issue Type: Improvement Components: build Reporter: Jacob S. Barrett -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5503) Reduce CI overhead on pull request to decrease feedback time.
[ https://issues.apache.org/jira/browse/GEODE-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacob S. Barrett resolved GEODE-5503. - Resolution: Fixed Fix Version/s: 1.8.0 > Reduce CI overhead on pull request to decrease feedback time. > - > > Key: GEODE-5503 > URL: https://issues.apache.org/jira/browse/GEODE-5503 > Project: Geode > Issue Type: Improvement > Components: ci >Reporter: Jacob S. Barrett >Priority: Major > Fix For: 1.8.0 > > > Reorganize and reduce overhead in the CI pull request pipeline to give faster > feedback. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5497) Generated restore.sh script fails when incremental backups are restored
[ https://issues.apache.org/jira/browse/GEODE-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5497. --- Resolution: Fixed Fix Version/s: 1.7.0 > Generated restore.sh script fails when incremental backups are restored > > > Key: GEODE-5497 > URL: https://issues.apache.org/jira/browse/GEODE-5497 > Project: Geode > Issue Type: Bug > Components: regions >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Add a test which validates the generated {{restore.sh}} script for > incremental backups. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5468) Make line separators platform independent
[ https://issues.apache.org/jira/browse/GEODE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5468. --- Resolution: Fixed Fix Version/s: 1.7.0 > Make line separators platform independent > - > > Key: GEODE-5468 > URL: https://issues.apache.org/jira/browse/GEODE-5468 > Project: Geode > Issue Type: Bug > Components: tests >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5474) Fix test flakiness in DiskRegionAsyncRecoveryJUnitTest and DiskOldAPIsJUnitTest
[ https://issues.apache.org/jira/browse/GEODE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5474. --- Resolution: Fixed Fix Version/s: 1.7.0 > Fix test flakiness in DiskRegionAsyncRecoveryJUnitTest and > DiskOldAPIsJUnitTest > --- > > Key: GEODE-5474 > URL: https://issues.apache.org/jira/browse/GEODE-5474 > Project: Geode > Issue Type: Bug > Components: core >Reporter: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 50m > Remaining Estimate: 0h > > These tests will fail intermittently with: > {noformat} > java.lang.AssertionError > at > org.apache.geode.internal.cache.AbstractDiskRegion.markInitialized(AbstractDiskRegion.java:417) > at > org.apache.geode.internal.cache.DiskInitFile.cmnMarkInitialized(DiskInitFile.java:1181) > at > org.apache.geode.internal.cache.DiskInitFile.markInitialized(DiskInitFile.java:2273) > at > org.apache.geode.internal.cache.DiskStoreImpl.setInitialized(DiskStoreImpl.java:3059) > at > org.apache.geode.internal.cache.AbstractDiskRegion.setInitialized(AbstractDiskRegion.java:594) > at > org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.setOnline(PersistenceAdvisorImpl.java:376) > at > org.apache.geode.internal.cache.DistributedRegion.cleanUpDestroyedTokensAndMarkGIIComplete(DistributedRegion.java:1506) > at > org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1288) > at > org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1082) > at > org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3087) > at > org.apache.geode.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2982) > at > org.apache.geode.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2970) > at > org.apache.geode.internal.cache.DiskOldAPIsJUnitTest.doSyncBitTest(DiskOldAPIsJUnitTest.java:107) > at > 
org.apache.geode.internal.cache.DiskOldAPIsJUnitTest.testSyncBit(DiskOldAPIsJUnitTest.java:86) > at sun.reflect.GeneratedMethodAccessor333.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:67) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) > at > 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5463) The generated diskstore restore script is incorrect on Windows
[ https://issues.apache.org/jira/browse/GEODE-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5463. --- Resolution: Fixed Fix Version/s: 1.7.0 > The generated diskstore restore script is incorrect on Windows > -- > > Key: GEODE-5463 > URL: https://issues.apache.org/jira/browse/GEODE-5463 > Project: Geode > Issue Type: Bug > Components: persistence >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The auto-generated {{restore.bat}} script is incorrect on Windows as there > are slight differences in the implementations of {{ScriptGenerator}} between > Windows and Unix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5512) Do not test for OS process stats on Windows
[ https://issues.apache.org/jira/browse/GEODE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5512. --- Resolution: Fixed Fix Version/s: 1.7.0 > Do not test for OS process stats on Windows > --- > > Key: GEODE-5512 > URL: https://issues.apache.org/jira/browse/GEODE-5512 > Project: Geode > Issue Type: Bug > Components: statistics >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We don't gather OS level stats on Windows and tests which check those fail > when run on Windows. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5508) Fix build on Windows - encoding issue
[ https://issues.apache.org/jira/browse/GEODE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5508. --- Resolution: Fixed Fix Version/s: 1.7.0 > Fix build on Windows - encoding issue > - > > Key: GEODE-5508 > URL: https://issues.apache.org/jira/browse/GEODE-5508 > Project: Geode > Issue Type: Bug > Components: build >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Building on windows fails with: > {noformat} > > Task :geode-junit:compileJava > > Task :geode-junit:processResources NO-SOURCE > > Task :geode-junit:classes > > Task :geode-junit:jar > C:\var\vcap\data\houdini-windows\containers\3g7fag8\tmp\build\28c4f8c4\built-geode\test\geode-core\src\test\java\org\apache\geode\cache\query\data\Portfolio.java:152: > error: unmappable character for encoding Cp1252 > unicodeṤtring = i % 2 == 0 ? "ṤṶ��?" : "Ṥ��?Ṷ"; > ^ > C:\var\vcap\data\houdini-windows\containers\3g7fag8\tmp\build\28c4f8c4\built-geode\test\geode-core\src\test\java\org\apache\geode\cache\query\data\Portfolio.java:152: > error: unmappable character for encoding Cp1252 > unicodeṤtring = i % 2 == 0 ? "ṤṶ��?" : "Ṥ��?Ṷ"; > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5517) Have StatArchiveReader implement AutoCloseable
[ https://issues.apache.org/jira/browse/GEODE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jens Deppe resolved GEODE-5517. --- Resolution: Fixed Fix Version/s: 1.7.0 > Have StatArchiveReader implement AutoCloseable > -- > > Key: GEODE-5517 > URL: https://issues.apache.org/jira/browse/GEODE-5517 > Project: Geode > Issue Type: Improvement > Components: statistics >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-5521) After an exception is received from a remote server function execution, local threads should not send back result to client later
[ https://issues.apache.org/jira/browse/GEODE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5521:
-- Labels: pull-request-available (was: )
> After an exception is received from a remote server function execution, local
> threads should not send back result to client later
> -
>
> Key: GEODE-5521
> URL: https://issues.apache.org/jira/browse/GEODE-5521
> Project: Geode
> Issue Type: Bug
> Components: functions
> Reporter: nabarun
> Priority: Major
> Labels: pull-request-available
>
> In the method cmdExecute(), if the local coordinator receives a FunctionException whose cause is a FunctionInvocationTargetException or QueryInvocationTargetException from the remote server, setException is called, which sets the lastResultReceived flag. This flag prevents results from other threads from being sent to the client, as the client may have moved on.
> For any other function exception, the exception is sent but the flag is not set.
> {code:java}
> if (cause instanceof FunctionInvocationTargetException
>     || cause instanceof QueryInvocationTargetException) {
>   if (cause instanceof InternalFunctionInvocationTargetException) {
>     // Fix for #44709: User should not be aware of
>     // InternalFunctionInvocationTargetException. No instance of
>     // InternalFunctionInvocationTargetException gives the user useful
>     // information to take any corrective action, hence logging
>     // this at fine-level logging:
>     // 1> When a bucket is moved
>     // 2> In case of HA, a FunctionInvocationTargetException is thrown. Since
>     // it is HA, the function will be re-executed on the right node
>     // 3> Multiple target nodes found for a single-hop operation
>     // 4> In case of HA, a member departed
>     if (logger.isDebugEnabled()) {
>       logger.debug(LocalizedMessage.create(
>           LocalizedStrings.ExecuteFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
>           new Object[] {function}), fe);
>     }
>   } else if (functionObject.isHA()) {
>     logger.warn(LocalizedMessage.create(
>         LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
>         function + " :" + message));
>   } else {
>     logger.warn(LocalizedMessage.create(
>         LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
>         function), fe);
>   }
>   resultSender.setException(fe);
> } else {
>   if (!resultSender.isLastResultReceived()) {
>     resultSender.setLastResultReceived(true);
>     logger.warn(LocalizedMessage.create(
>         LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
>         function), fe);
>     sendException(hasResult, clientMessage, message, serverConnection, fe);
>   }
> }
> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
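The guard behavior this issue describes can be sketched with a shared flag (hypothetical names; this is a minimal sketch, not Geode's actual ServerToClientFunctionResultSender implementation): once an exception has been forwarded to the client, later results from other local threads are suppressed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of a result sender that stops sending once an exception
// has been returned to the client (which may already have retried elsewhere).
public class GuardedResultSender {
  private final AtomicBoolean lastResultReceived = new AtomicBoolean(false);
  final List<Object> sentToClient = new ArrayList<>();

  public void setException(Throwable t) {
    // First caller wins; the flag blocks everything that follows.
    if (lastResultReceived.compareAndSet(false, true)) {
      sentToClient.add(t);
    }
  }

  public void sendResult(Object result) {
    // A local thread finishing after the exception must not reach the client.
    if (!lastResultReceived.get()) {
      sentToClient.add(result);
    }
  }
}
```

The AtomicBoolean makes the "set flag then send exception" step race-free across the local threads the issue title mentions.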
[jira] [Commented] (GEODE-5055) CI failure: LuceneSearchWithRollingUpgradeDUnit.luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion
[ https://issues.apache.org/jira/browse/GEODE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567739#comment-16567739 ] ASF subversion and git services commented on GEODE-5055:
Commit 06124e0c4191a90fa384a1368fc59a7d8dfdc0c8 in geode's branch refs/heads/develop from [~nabarunnag]
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=06124e0 ]
GEODE-5055: Handle index in progress for old clients (#1961)
* If the Lucene query function is executed by an old client (< 1.6.0) on a new server, it will wait for the index to be created.
* The server won't return a LuceneIndexCreationInProgressException back to the old client, which would result in a ClassNotFoundException.
* LuceneIndexCreationInProgressException is wrapped in a FunctionException and sent to the calling function.
* The caller unwraps it and sends a LuceneQueryException back to the user.
> CI failure:
> LuceneSearchWithRollingUpgradeDUnit.luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion
> -
>
> Key: GEODE-5055
> URL: https://issues.apache.org/jira/browse/GEODE-5055
> Project: Geode
> Issue Type: Sub-task
> Components: lucene
> Reporter: Jason Huynh
> Assignee: nabarun
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> This is related to changes for GEODE-3926. The roll is occurring, but because we added the new complete file, the new server is probably "reindexing" at the moment.
> org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit > > luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion[from_v140] > FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit$$Lambda$27/1996438298.run > in VM 0 running on Host eada542bc38a with 4 VMs > at org.apache.geode.test.dunit.VM.invoke(VM.java:436) > at org.apache.geode.test.dunit.VM.invoke(VM.java:405) > at org.apache.geode.test.dunit.VM.invoke(VM.java:348) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.verifyLuceneQueryResultInEachVM(LuceneSearchWithRollingUpgradeDUnit.java:678) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.executeLuceneQueryWithServerRollOvers(LuceneSearchWithRollingUpgradeDUnit.java:572) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion(LuceneSearchWithRollingUpgradeDUnit.java:134) > Caused by: > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.verifyLuceneQueryResults(LuceneSearchWithRollingUpgradeDUnit.java:661) > at > org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.lambda$verifyLuceneQueryResultInEachVM$b83f705c$2(LuceneSearchWithRollingUpgradeDUnit.java:678) > Caused by: > > org.apache.geode.cache.lucene.internal.LuceneIndexCreationInProgressException: > Lucene Index creation in progress for bucket: 1 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
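The unwrap-and-translate pattern the commit message describes can be sketched as follows. The exception classes here are local stand-ins, not the real Geode or Lucene types, and the method name is hypothetical:

```java
// Sketch: a server-internal exception is never sent to an old client that
// cannot deserialize it; the caller unwraps the FunctionException cause and
// returns a type the client does know.
public class QueryExceptionUnwrap {
  static class IndexInProgressException extends RuntimeException {}

  static class FunctionException extends RuntimeException {
    FunctionException(Throwable cause) { super(cause); }
  }

  static class QueryException extends RuntimeException {
    QueryException(String msg) { super(msg); }
  }

  // Translate only the internal cause; pass everything else through unchanged.
  public static RuntimeException translate(FunctionException fe) {
    if (fe.getCause() instanceof IndexInProgressException) {
      return new QueryException("index creation in progress; retry the query");
    }
    return fe;
  }
}
```

This keeps the wire format backward compatible: old clients receive an exception class that existed before 1.6.0 instead of hitting a ClassNotFoundException.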
[jira] [Created] (GEODE-5521) After an exception is received from a remote server function execution, local threads should not send back result to client later
nabarun created GEODE-5521:
--
Summary: After an exception is received from a remote server function execution, local threads should not send back result to client later
Key: GEODE-5521
URL: https://issues.apache.org/jira/browse/GEODE-5521
Project: Geode
Issue Type: Bug
Components: functions
Reporter: nabarun

In the method cmdExecute(), if the local coordinator receives a FunctionException whose cause is a FunctionInvocationTargetException or QueryInvocationTargetException from the remote server, setException is called, which sets the lastResultReceived flag. This flag prevents results from other threads from being sent to the client, as the client may have moved on. For any other function exception, the exception is sent but the flag is not set.
{code:java}
if (cause instanceof FunctionInvocationTargetException
    || cause instanceof QueryInvocationTargetException) {
  if (cause instanceof InternalFunctionInvocationTargetException) {
    // Fix for #44709: User should not be aware of
    // InternalFunctionInvocationTargetException. No instance of
    // InternalFunctionInvocationTargetException gives the user useful
    // information to take any corrective action, hence logging
    // this at fine-level logging:
    // 1> When a bucket is moved
    // 2> In case of HA, a FunctionInvocationTargetException is thrown. Since
    // it is HA, the function will be re-executed on the right node
    // 3> Multiple target nodes found for a single-hop operation
    // 4> In case of HA, a member departed
    if (logger.isDebugEnabled()) {
      logger.debug(LocalizedMessage.create(
          LocalizedStrings.ExecuteFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
          new Object[] {function}), fe);
    }
  } else if (functionObject.isHA()) {
    logger.warn(LocalizedMessage.create(
        LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
        function + " :" + message));
  } else {
    logger.warn(LocalizedMessage.create(
        LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
        function), fe);
  }
  resultSender.setException(fe);
} else {
  if (!resultSender.isLastResultReceived()) {
    resultSender.setLastResultReceived(true);
    logger.warn(LocalizedMessage.create(
        LocalizedStrings.ExecuteRegionFunction_EXCEPTION_ON_SERVER_WHILE_EXECUTIONG_FUNCTION_0,
        function), fe);
    sendException(hasResult, clientMessage, message, serverConnection, fe);
  }
}
{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5493) Client Statistics fail to publish on clusters with security enabled
[ https://issues.apache.org/jira/browse/GEODE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Bales resolved GEODE-5493. - Resolution: Fixed Resolved by this PR: https://github.com/apache/geode/pull/2218 Client health statistics are now published regardless of whether security is enabled on the cluster, the ID used to publish the statistics is now consistent throughout the system (see GEODE-5157), and the logging for null old stats has been removed. > Client Statistics fail to publish on clusters with security enabled > --- > > Key: GEODE-5493 > URL: https://issues.apache.org/jira/browse/GEODE-5493 > Project: Geode > Issue Type: Bug >Reporter: Helena Bales >Assignee: Helena Bales >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > In a cluster with security enabled and a pool with a small statistics > interval (like 1000ms), errors are logged for "Failed to publish" the > statistics, and "got oldStats null" for attempting to retrieve the previous > statistics during publication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (GEODE-5493) Client Statistics fail to publish on clusters with security enabled
[ https://issues.apache.org/jira/browse/GEODE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Bales reassigned GEODE-5493: --- Assignee: Helena Bales > Client Statistics fail to publish on clusters with security enabled > --- > > Key: GEODE-5493 > URL: https://issues.apache.org/jira/browse/GEODE-5493 > Project: Geode > Issue Type: Bug >Reporter: Helena Bales >Assignee: Helena Bales >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > In a cluster with security enabled and a pool with a small statistics > interval (like 1000ms), errors are logged for "Failed to publish" the > statistics, and "got oldStats null" for attempting to retrieve the previous > statistics during publication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5157) ClientHealthStats may not be propagated when system has a hostname
[ https://issues.apache.org/jira/browse/GEODE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Bales resolved GEODE-5157. - Resolution: Fixed Fix Version/s: 1.8.0 > ClientHealthStats may not be propagated when system has a hostname > -- > > Key: GEODE-5157 > URL: https://issues.apache.org/jira/browse/GEODE-5157 > Project: Geode > Issue Type: Bug > Components: cq, jmx >Reporter: Jens Deppe >Assignee: Helena Bales >Priority: Major > Fix For: 1.8.0 > > > For CQs, the client publishes stats from > {{ClientStatsManager.publishClientStats}}. Here the client memberId is used > as a key for putting the stats into an admin region. If the client has a > valid hostname then the memberId contains the hostname. If there is no valid > hostname, then the memberId is just the IP address. > On the server side, clientIDs are determined from {{CacheClientProxy}} > objects - see {{CacheServerBridge.getUniqueClientIds}}. It appears that these > IDs are always IP-address based. > Thus if there is this mismatch then ClientHealthStats are not published > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
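The ID mismatch this issue describes boils down to a failed map lookup. The memberId strings below are made-up illustrations of a hostname-qualified id versus an IP-only id, not Geode's exact format:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: stats published under a hostname-qualified memberId are
// never found by a server that looks them up under an IP-only id.
public class MemberIdMismatch {
  public static void main(String[] args) {
    Map<String, String> statsRegion = new HashMap<>();
    // Client side: the memberId includes the hostname when one resolves.
    statsRegion.put("client(host1.example.com)<v1>:41001", "health-stats");
    // Server side: the CacheClientProxy-derived id is IP-based only.
    String serverSideId = "client(10.0.0.5)<v1>:41001";
    System.out.println(statsRegion.get(serverSideId)); // prints null
  }
}
```

Because the keys never match, the server reads nothing and the stats appear unpublished, exactly the symptom described above.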
[jira] [Updated] (GEODE-5499) SUPERFLAKY: DistributedNoAckRegionOffHeapDUnitTest.testTXRmtMirror
[ https://issues.apache.org/jira/browse/GEODE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5499: -- Labels: pull-request-available swat (was: swat) > SUPERFLAKY: DistributedNoAckRegionOffHeapDUnitTest.testTXRmtMirror > -- > > Key: GEODE-5499 > URL: https://issues.apache.org/jira/browse/GEODE-5499 > Project: Geode > Issue Type: Bug >Reporter: Dan Smith >Assignee: Dan Smith >Priority: Major > Labels: pull-request-available, swat > Attachments: Test results - Class > org.apache.geode.cache30.DistributedNoAckRegionOffHeapDUnitTest.html > > > This test is failing fairly frequently in CI > {noformat} > org.apache.geode.cache30.DistributedNoAckRegionOffHeapDUnitTest: 6 failures > (98.101% success rate) > | .testTXRmtMirror: 6 failures (98.101% success rate) > | | Failed build 315 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/315 > | | Failed build 296 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/296 > | | Failed build 233 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/233 > | | Failed build 123 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/123 > | | Failed build 84 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/84 > | | Failed build 51 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/51 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-5520) Inconsistency created by delta-propagation interaction with concurrency control
[ https://issues.apache.org/jira/browse/GEODE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5520: -- Labels: pull-request-available (was: ) > Inconsistency created by delta-propagation interaction with concurrency > control > --- > > Key: GEODE-5520 > URL: https://issues.apache.org/jira/browse/GEODE-5520 > Project: Geode > Issue Type: Bug > Components: client/server, messaging, regions, serialization >Reporter: Bruce Schuchardt >Assignee: Bruce Schuchardt >Priority: Major > Labels: pull-request-available > > I tracked a cache inconsistency down to a delta propagation operation that > failed over from one server to another and then failed to apply the delta on > the new server. > When the full value is sent from the client the message is not marked as a > possible-duplicate. Because it was missing this flag the server did not try > to recover a concurrency version tag for the operation but instead generated > a new version. > When this server propagated the operation to another server it was rejected > by that server because it had already processed the operation from the > client's original attempt. It recognized this by way of the operation's > EventID being recorded in its EventTracker. > This resulted in one server having version X and the other having version X+1 > for the entry. > The client then destroyed the same entry with the same server, generating > version X+1 in that server. When the server propagated the operation the > other server already had X+1 and threw a > ConcurrentCacheModificationException. The result was one server having a > tombstone for the entry and the other having the value from the > delta-propagation operation. > This can be fixed by setting the possible-duplicate flag on the message from > the client that contains the full value. The server will then search for a > concurrency version tag and use it instead of generating a new one. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (GEODE-5499) SUPERFLAKY: DistributedNoAckRegionOffHeapDUnitTest.testTXRmtMirror
[ https://issues.apache.org/jira/browse/GEODE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith reassigned GEODE-5499: Assignee: Dan Smith > SUPERFLAKY: DistributedNoAckRegionOffHeapDUnitTest.testTXRmtMirror > -- > > Key: GEODE-5499 > URL: https://issues.apache.org/jira/browse/GEODE-5499 > Project: Geode > Issue Type: Bug >Reporter: Dan Smith >Assignee: Dan Smith >Priority: Major > Labels: swat > Attachments: Test results - Class > org.apache.geode.cache30.DistributedNoAckRegionOffHeapDUnitTest.html > > > This test is failing fairly frequently in CI > {noformat} > org.apache.geode.cache30.DistributedNoAckRegionOffHeapDUnitTest: 6 failures > (98.101% success rate) > | .testTXRmtMirror: 6 failures (98.101% success rate) > | | Failed build 315 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/315 > | | Failed build 296 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/296 > | | Failed build 233 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/233 > | | Failed build 123 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/123 > | | Failed build 84 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/84 > | | Failed build 51 at > https://concourse.apachegeode-ci.info/teams/staging/pipelines/mhansonp-pipelinework/jobs/DistributedTest/builds/51 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GEODE-5520) Inconsistency created by delta-propagation interaction with concurrency control
Bruce Schuchardt created GEODE-5520: --- Summary: Inconsistency created by delta-propagation interaction with concurrency control Key: GEODE-5520 URL: https://issues.apache.org/jira/browse/GEODE-5520 Project: Geode Issue Type: Bug Components: client/server, messaging, regions, serialization Reporter: Bruce Schuchardt I tracked a cache inconsistency down to a delta propagation operation that failed over from one server to another and then failed to apply the delta on the new server. When the full value is sent from the client the message is not marked as a possible-duplicate. Because it was missing this flag the server did not try to recover a concurrency version tag for the operation but instead generated a new version. When this server propagated the operation to another server it was rejected by that server because it had already processed the operation from the client's original attempt. It recognized this by way of the operation's EventID being recorded in its EventTracker. This resulted in one server having version X and the other having version X+1 for the entry. The client then destroyed the same entry with the same server, generating version X+1 in that server. When the server propagated the operation the other server already had X+1 and threw a ConcurrentCacheModificationException. The result was one server having a tombstone for the entry and the other having the value from the delta-propagation operation. This can be fixed by setting the possible-duplicate flag on the message from the client that contains the full value. The server will then search for a concurrency version tag and use it instead of generating a new one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
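The fix described above can be illustrated conceptually. The class and method names here are hypothetical stand-ins, not Geode's actual EventTracker or VersionTag types: when a retried message carries the possible-duplicate flag, the server reuses the version recorded for that EventID instead of generating a new one, so both servers converge on the same version.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual sketch of possible-duplicate handling with version-tag recovery.
public class VersionTagRecovery {
  private final Map<String, Integer> versionByEventId = new HashMap<>();
  private int currentVersion = 0;

  public int applyOperation(String eventId, boolean possibleDuplicate) {
    if (possibleDuplicate && versionByEventId.containsKey(eventId)) {
      // Retry of an operation we already processed: reuse the original
      // version rather than minting version X+1.
      return versionByEventId.get(eventId);
    }
    currentVersion++;
    versionByEventId.put(eventId, currentVersion);
    return currentVersion;
  }
}
```

Without the flag, the retried operation would take the `currentVersion++` path and produce the X versus X+1 divergence described in the report.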
[jira] [Assigned] (GEODE-5520) Inconsistency created by delta-propagation interaction with concurrency control
[ https://issues.apache.org/jira/browse/GEODE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Schuchardt reassigned GEODE-5520: --- Assignee: Bruce Schuchardt > Inconsistency created by delta-propagation interaction with concurrency > control > --- > > Key: GEODE-5520 > URL: https://issues.apache.org/jira/browse/GEODE-5520 > Project: Geode > Issue Type: Bug > Components: client/server, messaging, regions, serialization >Reporter: Bruce Schuchardt >Assignee: Bruce Schuchardt >Priority: Major > > I tracked a cache inconsistency down to a delta propagation operation that > failed over from one server to another and then failed to apply the delta on > the new server. > When the full value is sent from the client the message is not marked as a > possible-duplicate. Because it was missing this flag the server did not try > to recover a concurrency version tag for the operation but instead generated > a new version. > When this server propagated the operation to another server it was rejected > by that server because it had already processed the operation from the > client's original attempt. It recognized this by way of the operation's > EventID being recorded in its EventTracker. > This resulted in one server having version X and the other having version X+1 > for the entry. > The client then destroyed the same entry with the same server, generating > version X+1 in that server. When the server propagated the operation the > other server already had X+1 and threw a > ConcurrentCacheModificationException. The result was one server having a > tombstone for the entry and the other having the value from the > delta-propagation operation. > This can be fixed by setting the possible-duplicate flag on the message from > the client that contains the full value. The server will then search for a > concurrency version tag and use it instead of generating a new one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
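[Editorial note] The fix described above hinges on the possible-duplicate flag: when it is set, the server recovers the version tag recorded for the client's original attempt instead of minting a new one. A minimal, self-contained sketch of that idea follows; the class and method names are illustrative stand-ins, not Geode's actual internal API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a server tracks, per EventID, the version it generated for an
// operation. A retried message marked possible-duplicate recovers that
// version; an unmarked retry incorrectly generates a fresh one (the bug).
public class VersionTagRecoverySketch {
    static class Server {
        // EventTracker stand-in: event id -> version already generated for it
        final Map<String, Integer> eventTracker = new HashMap<>();
        int versionCounter = 0;

        int apply(String eventId, boolean possibleDuplicate) {
            if (possibleDuplicate && eventTracker.containsKey(eventId)) {
                // Recover the version tag from the original attempt.
                return eventTracker.get(eventId);
            }
            int version = ++versionCounter;   // generate a new version
            eventTracker.put(eventId, version);
            return version;
        }
    }

    public static void main(String[] args) {
        Server buggy = new Server();
        int x = buggy.apply("client-1#evt-42", false);  // original attempt
        int y = buggy.apply("client-1#evt-42", false);  // retry, flag missing: new version
        Server fixed = new Server();
        int a = fixed.apply("client-1#evt-42", false);  // original attempt
        int b = fixed.apply("client-1#evt-42", true);   // retry, flag set: version recovered
        System.out.println(x + " " + y + " " + a + " " + b); // 1 2 1 1
    }
}
```

With the flag missing the retry produces version 2 for an operation another server already recorded as version 1, which is exactly the divergence the report describes.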
[jira] [Commented] (GEODE-5517) Have StatArchiveReader implement AutoCloseable
[ https://issues.apache.org/jira/browse/GEODE-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567499#comment-16567499 ] ASF subversion and git services commented on GEODE-5517: Commit c5b923b0d83d00d5fa2bbcd84d783d51600e899b in geode's branch refs/heads/develop from [~jens.deppe] [ https://gitbox.apache.org/repos/asf?p=geode.git;h=c5b923b ] GEODE-5517: Have StatArchiveReader implement AutoCloseable (#2249) - This commit also fixes a resource leak in StatTypesAreRolledOverRegressionTest which was causing the test to fail on Windows. > Have StatArchiveReader implement AutoCloseable > -- > > Key: GEODE-5517 > URL: https://issues.apache.org/jira/browse/GEODE-5517 > Project: Geode > Issue Type: Improvement > Components: statistics >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
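[Editorial note] What implementing AutoCloseable buys the reader class is try-with-resources: the underlying file is closed even when a read throws, which is the kind of leak that blocks file deletion/rollover on Windows. A self-contained sketch, with ArchiveReader as a hypothetical stand-in rather than Geode's actual StatArchiveReader API:

```java
// Stand-in reader; the real StatArchiveReader would wrap an open stats file.
class ArchiveReader implements AutoCloseable {
    String readStats() {
        return "stats";
    }

    @Override
    public void close() {
        // Release the underlying file handle here.
        System.out.println("closed");
    }
}

public class TryWithResourcesDemo {
    public static void main(String[] args) {
        // close() runs automatically when the block exits, normally or
        // via an exception, so no handle is leaked.
        try (ArchiveReader reader = new ArchiveReader()) {
            System.out.println(reader.readStats());
        }
    }
}
```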
[jira] [Commented] (GEODE-5493) Client Statistics fail to publish on clusters with security enabled
[ https://issues.apache.org/jira/browse/GEODE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567493#comment-16567493 ] ASF subversion and git services commented on GEODE-5493: Commit cd3783a6a0c5597ccd6799d3833e864b05719f15 in geode's branch refs/heads/develop from Helena A. Bales [ https://gitbox.apache.org/repos/asf?p=geode.git;h=cd3783a ] GEODE-5493: fix publication of client statistics Fix publication of client statistics with security enabled. Remove dead code related to retrieving old statistics that never worked. Add test for fix. Signed-off-by: Dan Smith > Client Statistics fail to publish on clusters with security enabled > --- > > Key: GEODE-5493 > URL: https://issues.apache.org/jira/browse/GEODE-5493 > Project: Geode > Issue Type: Bug >Reporter: Helena Bales >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > In a cluster with security enabled and a pool with a small statistics > interval (like 1000ms), errors are logged for "Failed to publish" the > statistics, and "got oldStats null" for attempting to retrieve the previous > statistics during publication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GEODE-5157) ClientHealthStats may not be propagated when system has a hostname
[ https://issues.apache.org/jira/browse/GEODE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567494#comment-16567494 ] ASF subversion and git services commented on GEODE-5157: Commit 555149dc9f6c9c283e3eac96d3656842ee188723 in geode's branch refs/heads/develop from Helena A. Bales [ https://gitbox.apache.org/repos/asf?p=geode.git;h=555149d ] GEODE-5157: use distributedMember as client ID Changed the client to put stats information in the monitoring region using the DistributedMember instead of the toString of DistributedMember. On the server, if that fails, the toString of DistributedMember is used, for backward compatibility. This closes #2218 > ClientHealthStats may not be propagated when system has a hostname > -- > > Key: GEODE-5157 > URL: https://issues.apache.org/jira/browse/GEODE-5157 > Project: Geode > Issue Type: Bug > Components: cq, jmx >Reporter: Jens Deppe >Assignee: Helena Bales >Priority: Major > > For CQs, the client publishes stats from > {{ClientStatsManager.publishClientStats}}. Here the client memberId is used > as a key for putting the stats into an admin region. If the client has a > valid hostname then the memberId contains the hostname. If there is no valid > hostname, then the memberId is just the IP address. > On the server side, clientIDs are determined from {{CacheClientProxy}} > objects - see {{CacheServerBridge.getUniqueClientIds}}. It appears that these > IDs are always IP-address based. > Thus if there is this mismatch then ClientHealthStats are not published > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (GEODE-5417) All UITest tests are flaky and failing in CI
[ https://issues.apache.org/jira/browse/GEODE-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-5417: -- Assignee: (was: Mark Hanson) > All UITest tests are flaky and failing in CI > > > Key: GEODE-5417 > URL: https://issues.apache.org/jira/browse/GEODE-5417 > Project: Geode > Issue Type: Bug > Components: pulse, tests >Reporter: Dan Smith >Priority: Major > > All of these tests with the UITest category seem to be flaky, with the > exception of those that extend BaseServiceTest. I've attached the results of > the metrics job against the current concourse pipeline with 144 runs. > We should fix the systematic failures in these tests, or replace/remove them > if we have coverage elsewhere. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-5417) All UITest tests are flaky and failing in CI
[ https://issues.apache.org/jira/browse/GEODE-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-5417: --- Labels: (was: swat) > All UITest tests are flaky and failing in CI > > > Key: GEODE-5417 > URL: https://issues.apache.org/jira/browse/GEODE-5417 > Project: Geode > Issue Type: Bug > Components: pulse, tests >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > > All of these tests with the UITest category seem to be flaky, with the > exception of those that extend BaseServiceTest. I've attached the results of > the metrics job against the current concourse pipeline with 144 runs. > We should fix the systematic failures in these tests, or replace/remove them > if we have coverage elsewhere. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5513) Clients may miss PR region events due to race during registerInterest
[ https://issues.apache.org/jira/browse/GEODE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Gingade resolved GEODE-5513. -- Resolution: Fixed Fix Version/s: 1.7.0 > Clients may miss PR region events due to race during registerInterest > - > > Key: GEODE-5513 > URL: https://issues.apache.org/jira/browse/GEODE-5513 > Project: Geode > Issue Type: Bug > Components: client queues >Affects Versions: 1.6.0 >Reporter: Kenneth Howe >Assignee: Kenneth Howe >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Here is the scenario: > Consider two servers and a client: > - Server1 hosting the primary bucket > - Server2 hosting the secondary bucket and also the primary queue for Client2 > - Client1 doing a remove operation > - Client2 doing a register interest > - Client1 starts a remove-all operation > - At the same time Client2 is registering interest > - Server1 receives the remove-all operation, processes it, and sends the > adjunct message to Server2 (it has still not received the interest info > for Client2) > - While the remove-all to Server2 is in flight > - Server2 sends the interest profile info to Server1 for Client2; and then > Server2 (as it is hosting the primary queue) starts building the initial > image snapshot for the interest. When building the initial image for a PR, > preference is given to collecting data from the local node. During this time > the removal message is still in flight and hasn't been applied on Server2. > The initial image for the interest registration calculates the snapshot from > local data and sends it to the client, missing the remove-all op. > This could happen with non-bulk ops, but it gets worse with bulk ops as > replicating them takes more time. > The solution is to build the initial register interest response by getting > the data from the primary bucket. This adds a little overhead in building the > interest response, but considering that the register response will usually > involve a remote node, this may be negligible. > Clients registering interest in a region -- This message was sent by Atlassian JIRA (v7.6.3#76005)
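[Editorial note] The race above can be condensed into a toy model: the primary bucket has applied a remove-all, the secondary (hosting the client's queue) has not yet, and the register-interest snapshot is built from one or the other. The sketch below uses plain maps as stand-ins for buckets; it is not Geode's internal API, only an illustration of why snapshotting from the primary (the fix) avoids the stale entry.

```java
import java.util.HashMap;
import java.util.Map;

public class PrimarySnapshotSketch {
    public static void main(String[] args) {
        Map<String, String> primary = new HashMap<>();
        Map<String, String> secondary = new HashMap<>();
        primary.put("k1", "v1"); secondary.put("k1", "v1");
        primary.put("k2", "v2"); secondary.put("k2", "v2");

        // remove-all applied on the primary; replication to the
        // secondary is still in flight
        primary.remove("k2");

        // old behavior: snapshot built from local (secondary) data
        Map<String, String> localSnapshot = new HashMap<>(secondary);
        // fixed behavior: snapshot built from the primary bucket
        Map<String, String> primarySnapshot = new HashMap<>(primary);

        System.out.println(localSnapshot.containsKey("k2"));   // stale entry leaks
        System.out.println(primarySnapshot.containsKey("k2")); // consistent
    }
}
```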
[jira] [Commented] (GEODE-5513) Clients may miss PR region events due to race during registerInterest
[ https://issues.apache.org/jira/browse/GEODE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567457#comment-16567457 ] ASF subversion and git services commented on GEODE-5513: Commit 4abb1d024977fa62bb617facd40bc899f9ebf1bc in geode's branch refs/heads/develop from agingade [ https://gitbox.apache.org/repos/asf?p=geode.git;h=4abb1d0 ] GEODE-5513: Changes to build register interest initial snapshot from primary bucket (#2246) * Changes to build register interest initial snapshot from primary bucket > Clients may miss PR region events due to race during registerInterest > - > > Key: GEODE-5513 > URL: https://issues.apache.org/jira/browse/GEODE-5513 > Project: Geode > Issue Type: Bug > Components: client queues >Affects Versions: 1.6.0 >Reporter: Kenneth Howe >Assignee: Kenneth Howe >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Here is the scenario: > Consider two servers and client: > - Server1 hosting the primary bucket > - Server2 hosting secondary bucket and also primary queue for the Client2 > - Client1 Doing remove operation > - Client2 doing register interest > - The Client1 starts remove-all operation > - At the same time Client2 is registering interest > - Server1 receives the remove-all operation processes it, and sends the > adjunct message to the Server2 (Its still not yet received the interest info > from server1) > - While the remove-all to server2 in flight > - Server2 sends interest profile info to Server1 for client2; and then > Server2 (as it is hosting the primary queue) starts building the initial > image snapshot for the interest. When building initial image for PR > preference is given to collect data from local node. During this time the > removal message is still in flight and hasn't applied on Server2. The initial > image for interest registration calculates the snapshot from local data, and > sends it to client, missing the remove-all op. 
> This could happen with non-bulk ops; but it gets worse with bulk ops as the > time taken to replicate the bulk ops will take more time. > The solution is to build the initial register interest response by getting > the data from primary bucket. This will add little overhead in building the > interest response; but considering that most or always the register response > will involve remote node, this may be negligible. > Clients registering interest in a region -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GEODE-5519) Add stackdriver monitoring to heavy-lifters
[ https://issues.apache.org/jira/browse/GEODE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567401#comment-16567401 ] ASF subversion and git services commented on GEODE-5519: Commit c431d24be1c00e25feefbeca26268079882c464f in geode's branch refs/heads/develop from [~smgoller] [ https://gitbox.apache.org/repos/asf?p=geode.git;h=c431d24 ] [GEODE-5519] Add stackdriver monitoring to instances. * Add stackdriver agent to google-geode-builder. * Add labels identifying heavy-lifters Signed-off-by: Patrick Rhomberg > Add stackdriver monitoring to heavy-lifters > --- > > Key: GEODE-5519 > URL: https://issues.apache.org/jira/browse/GEODE-5519 > Project: Geode > Issue Type: Improvement > Components: ci >Reporter: Sean Goller >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > In order to properly gauge resource requirements implement stackdriver > monitoring. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-4928) DistributedLockService doesn't work as expected while the dlock grantor is initialized
[ https://issues.apache.org/jira/browse/GEODE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith updated GEODE-4928: - Priority: Minor (was: Major) > DistributedLockService doesn't work as expected while the dlock grantor is > initialized > -- > > Key: GEODE-4928 > URL: https://issues.apache.org/jira/browse/GEODE-4928 > Project: Geode > Issue Type: New Feature > Components: distributed lock service >Reporter: Bruce Schuchardt >Priority: Minor > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > I wrote a function that obtained a dlock and then performed a transaction. > It always operates on the same dlock key and the same keys in my region. > That protects against getting a commit conflict exception BUT this sometimes > fails if the JVM holding the lock crashes. One of the functions appears to > get the dlock okay but then its transaction fails when it goes to commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-4928) DistributedLockService doesn't work as expected while the dlock grantor is initialized
[ https://issues.apache.org/jira/browse/GEODE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith updated GEODE-4928: - Fix Version/s: (was: 1.6.0) > DistributedLockService doesn't work as expected while the dlock grantor is > initialized > -- > > Key: GEODE-4928 > URL: https://issues.apache.org/jira/browse/GEODE-4928 > Project: Geode > Issue Type: New Feature > Components: distributed lock service >Reporter: Bruce Schuchardt >Priority: Minor > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > I wrote a function that obtained a dlock and then performed a transaction. > It always operates on the same dlock key and the same keys in my region. > That protects against getting a commit conflict exception BUT this sometimes > fails if the JVM holding the lock crashes. One of the functions appears to > get the dlock okay but then its transaction fails when it goes to commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (GEODE-4928) DistributedLockService doesn't work as expected while the dlock grantor is initialized
[ https://issues.apache.org/jira/browse/GEODE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith reopened GEODE-4928: -- Assignee: (was: Bruce Schuchardt) The fix attempted for this doesn't quite work, and here's why: When a member/client leaves, the change causes both the transaction lock and DLock to clean up whatever locks that member had. DLocks clean up rather quickly, whereas transaction lock cleanup happens asynchronously in the background. The fix was to have the transaction lock grantor set some state letting the DLock grantor know that the transaction lock is cleaning up locks, and have the DLock grantor wait. However, the DLock grantor can get beyond that check without the Transaction grantor having notified, meaning that the window is reduced but not closed. In addition to the above problem, if the lock grantors are actually different members, making the dlock grantor wait on the transaction lock grantor in the same process doesn't help. The test written for this issue is demonstrating these issues sporadically, see GEODE-5470. I don't think we ever really designed these two services to work together that way, so I think this would be a feature to be implemented, not a bug to be fixed. Maybe we could have DLock and Transaction locks synchronize on a view number, rather than a boolean flag, but that requires new messages and significant reworking of how views are processed. Maybe we could mark certain DLock services as having to be synchronized with transaction locks? 
> DistributedLockService doesn't work as expected while the dlock grantor is > initialized > -- > > Key: GEODE-4928 > URL: https://issues.apache.org/jira/browse/GEODE-4928 > Project: Geode > Issue Type: Bug > Components: distributed lock service >Reporter: Bruce Schuchardt >Priority: Major > Labels: pull-request-available > Fix For: 1.6.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > I wrote a function that obtained a dlock and then performed a transaction. > It always operates on the same dlock key and the same keys in my region. > That protects against getting a commit conflict exception BUT this sometimes > fails if the JVM holding the lock crashes. One of the functions appears to > get the dlock okay but then its transaction fails when it goes to commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-4928) DistributedLockService doesn't work as expected while the dlock grantor is initialized
[ https://issues.apache.org/jira/browse/GEODE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Smith updated GEODE-4928: - Issue Type: New Feature (was: Bug) > DistributedLockService doesn't work as expected while the dlock grantor is > initialized > -- > > Key: GEODE-4928 > URL: https://issues.apache.org/jira/browse/GEODE-4928 > Project: Geode > Issue Type: New Feature > Components: distributed lock service >Reporter: Bruce Schuchardt >Priority: Major > Labels: pull-request-available > Fix For: 1.6.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > I wrote a function that obtained a dlock and then performed a transaction. > It always operates on the same dlock key and the same keys in my region. > That protects against getting a commit conflict exception BUT this sometimes > fails if the JVM holding the lock crashes. One of the functions appears to > get the dlock okay but then its transaction fails when it goes to commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (GEODE-5519) Add stackdriver monitoring to heavy-lifters
[ https://issues.apache.org/jira/browse/GEODE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5519: -- Labels: pull-request-available (was: ) > Add stackdriver monitoring to heavy-lifters > --- > > Key: GEODE-5519 > URL: https://issues.apache.org/jira/browse/GEODE-5519 > Project: Geode > Issue Type: Improvement > Components: ci >Reporter: Sean Goller >Priority: Major > Labels: pull-request-available > > In order to properly gauge resource requirements implement stackdriver > monitoring. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GEODE-5519) Add stackdriver monitoring to heavy-lifters
Sean Goller created GEODE-5519: -- Summary: Add stackdriver monitoring to heavy-lifters Key: GEODE-5519 URL: https://issues.apache.org/jira/browse/GEODE-5519 Project: Geode Issue Type: Improvement Components: ci Reporter: Sean Goller In order to properly gauge resource requirements implement stackdriver monitoring. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (GEODE-5515) Transaction originated from peer sets the onBehalfOfClientMember on remote transaction host
[ https://issues.apache.org/jira/browse/GEODE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Shu resolved GEODE-5515. - Resolution: Fixed Fix Version/s: 1.7.0 > Transaction originated from peer sets the onBehalfOfClientMember on remote > transaction host > --- > > Key: GEODE-5515 > URL: https://issues.apache.org/jira/browse/GEODE-5515 > Project: Geode > Issue Type: Bug > Components: transactions >Affects Versions: 1.4.0, 1.5.0, 1.6.0 >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently a transaction originated from a peer and if the transaction host is > on another server, the transaction onBehalfOfClientMember is incorrectly set > on the transaction host. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GEODE-5515) Transaction originated from peer sets the onBehalfOfClientMember on remote transaction host
[ https://issues.apache.org/jira/browse/GEODE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567300#comment-16567300 ] ASF subversion and git services commented on GEODE-5515: Commit a3debf771b753061d8df8854ee4940e09495e8de in geode's branch refs/heads/develop from pivotal-eshu [ https://gitbox.apache.org/repos/asf?p=geode.git;h=a3debf7 ] GEODE-5515: Only initTXMemberId in PartitionMessage if transaction is originated from client. (#2242) * Only initTXMemberId in PartitionMessage if transaction is originated from client. * Returns client member id in TXState.getOriginatingMember() if it is a client originated transaction so it can be correctly set in the PartitionMessage. > Transaction originated from peer sets the onBehalfOfClientMember on remote > transaction host > --- > > Key: GEODE-5515 > URL: https://issues.apache.org/jira/browse/GEODE-5515 > Project: Geode > Issue Type: Bug > Components: transactions >Affects Versions: 1.4.0, 1.5.0, 1.6.0 >Reporter: Eric Shu >Assignee: Eric Shu >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently a transaction originated from a peer and if the transaction host is > on another server, the transaction onBehalfOfClientMember is incorrectly set > on the transaction host. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (GEODE-5157) ClientHealthStats may not be propagated when system has a hostname
[ https://issues.apache.org/jira/browse/GEODE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Bales reassigned GEODE-5157: --- Assignee: Helena Bales > ClientHealthStats may not be propagated when system has a hostname > -- > > Key: GEODE-5157 > URL: https://issues.apache.org/jira/browse/GEODE-5157 > Project: Geode > Issue Type: Bug > Components: cq, jmx >Reporter: Jens Deppe >Assignee: Helena Bales >Priority: Major > > For CQs, the client publishes stats from > {{ClientStatsManager.publishClientStats}}. Here the client memberId is used > as a key for putting the stats into an admin region. If the client has a > valid hostname then the memberId contains the hostname. If there is no valid > hostname, then the memberId is just the IP address. > On the server side, clientIDs are determined from {{CacheClientProxy}} > objects - see {{CacheServerBridge.getUniqueClientIds}}. It appears that these > IDs are always IP-address based. > Thus if there is this mismatch then ClientHealthStats are not published > correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GEODE-5502) Cluster configuration can contain member-specific gateway receiver definitions which cause members to fail to start during rolling
[ https://issues.apache.org/jira/browse/GEODE-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567005#comment-16567005 ] ASF subversion and git services commented on GEODE-5502: Commit 6a7ac34c9707d61969757a6556f49b9dba3d9341 in geode's branch refs/heads/feature/GEODE-5502 from [~barry.oglesby] [ https://gitbox.apache.org/repos/asf?p=geode.git;h=6a7ac34 ] GEODE-5502: Force CI to re-run > Cluster configuration can contain member-specific gateway receiver > definitions which cause members to fail to start during rolling > -- > > Key: GEODE-5502 > URL: https://issues.apache.org/jira/browse/GEODE-5502 > Project: Geode > Issue Type: Bug > Components: configuration >Reporter: Barry Oglesby >Assignee: Barry Oglesby >Priority: Major > Labels: pull-request-available, swat > Time Spent: 0.5h > Remaining Estimate: 0h > > In versions before 1.4.0, cluster configuration could contain multiple > member-specific gateway receiver definitions like: > {noformat} > > > > > {noformat} > Starting in 1.4.0, multiple receivers are no longer allowed, so a > configuration like this causes the member to throw an exception and fail to > start. > These member-specific receivers should be removed before sending the cluster > configuration to new members to avoid attempting to create multiple receivers > in a single member. > Note: In this case, the member will start with no receivers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (GEODE-5518) some records in the region are not fetched when executing fetch query
yossi reginiano created GEODE-5518: -- Summary: some records in the region are not fetched when executing fetch query Key: GEODE-5518 URL: https://issues.apache.org/jira/browse/GEODE-5518 Project: Geode Issue Type: Bug Components: core, querying Reporter: yossi reginiano hi all, we are using geode 1.4 and facing the following: we are starting to adopt the putAll function, which accepts a bulk of records and persists them into the region. we have noticed that the process that fetches the records from the region (executing the fetch command in bulks of 1000) is, from time to time, missing a record or two, which causes these records to be left in the region as "zombies" - because the current index is now greater than those records' index. this started to happen only when we started to use the putAll function - prior to that we did not face any such issue. also, when we are using putAll with only 1 record at a time it works fine. has anybody faced this? is there some constraint on the number of records that can be sent to the putAll function? thanks in advance -- This message was sent by Atlassian JIRA (v7.6.3#76005)
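[Editorial note] As the follow-up comment on this issue points out, one thing worth checking when a 1000-record batch "loses" records is duplicate keys in the Map handed to putAll: a Map keeps exactly one value per key, so duplicates silently collapse before the region ever sees them. A stand-alone illustration with a plain HashMap (no Geode dependency; key scheme is made up for the demo):

```java
import java.util.HashMap;
import java.util.Map;

public class PutAllBatchDemo {
    public static void main(String[] args) {
        // Build a "batch of 1000" where a few records share a key,
        // e.g. because a non-unique field was used as the map key.
        Map<String, String> batch = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            String key = "rec-" + (i % 998);  // two keys repeat
            batch.put(key, "payload-" + i);   // duplicate keys overwrite
        }
        // region.putAll(batch) would ingest only batch.size() entries,
        // not 1000 - the collapsed duplicates look like missing records.
        System.out.println(batch.size());
    }
}
```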
[jira] [Updated] (GEODE-5314) MBeanStatsMonitor child classes should use atomics instead of volatiles to avoid data race
[ https://issues.apache.org/jira/browse/GEODE-5314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated GEODE-5314: -- Labels: pull-request-available (was: ) > MBeanStatsMonitor child classes should use atomics instead of volatiles to > avoid data race > -- > > Key: GEODE-5314 > URL: https://issues.apache.org/jira/browse/GEODE-5314 > Project: Geode > Issue Type: Bug > Components: statistics >Reporter: Galen O'Sullivan >Assignee: Juan José Ramos Cassella >Priority: Major > Labels: pull-request-available > > {{GcStatsMonitor}} has the following: > {code} > private volatile long collections = 0; > private volatile long collectionTime = 0; > {code} > > {code} > collections -= > statsMap.getOrDefault(StatsKey.VM_GC_STATS_COLLECTIONS,0).intValue(); > collectionTime -= > statsMap.getOrDefault(StatsKey.VM_GC_STATS_COLLECTION_TIME,0).intValue(); > {code} > Because these are volatile and not atomic fields, there will be a race > condition. Other subclasses of {{MBeanStatsMonitor}} also use volatiles: > AggregateRegionStatsMonitor, GatewaySenderOverflowMonitor, > MemberLevelDiskMonitor, VMStatsMonitor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
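[Editorial note] The data race above comes from the fact that `collections -= delta` on a volatile long is a separate read and write, so two threads can interleave and lose an update; AtomicLong performs the same read-modify-write as one atomic operation. A self-contained sketch of the suggested fix (field name borrowed from GcStatsMonitor, the rest illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicCounterDemo {
    // Instead of: private volatile long collections = 0;
    private static final AtomicLong collections = new AtomicLong(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                collections.addAndGet(1);  // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // With a volatile long and `collections += 1`, interleaved updates
        // could be lost; AtomicLong always totals correctly.
        System.out.println(collections.get());
    }
}
```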