[ 
https://issues.apache.org/jira/browse/GEODE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Murmann updated GEODE-10323:
--------------------------------------
    Labels: needsTriage  (was: )

> OffHeapStorageJUnitTest testCreateOffHeapStorage fails with AssertionError: 
> expected:<100> but was:<1048576>
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-10323
>                 URL: https://issues.apache.org/jira/browse/GEODE-10323
>             Project: Geode
>          Issue Type: Bug
>          Components: offheap
>            Reporter: Kirk Lund
>            Priority: Major
>              Labels: needsTriage
>
> {{OffHeapStorageJUnitTest testCreateOffHeapStorage}} started failing 
> intermittently due to commit a350ed2 which went in as 
> "[GEODE-10087|https://issues.apache.org/jira/browse/GEODE-10087]: Enhance 
> off-heap fragmentation visibility. 
> ([#7407|https://github.com/apache/geode/pull/7407/commits/1640effc5c760afa8d9ec4c2743950bb1de0ef8f])".
> The failure stack looks like:
> {noformat}
> OffHeapStorageJUnitTest > testCreateOffHeapStorage FAILED
>     java.lang.AssertionError: expected:<100> but was:<1048576>
>         at org.junit.Assert.fail(Assert.java:89)
>         at org.junit.Assert.failNotEquals(Assert.java:835)
>         at org.junit.Assert.assertEquals(Assert.java:647)
>         at org.junit.Assert.assertEquals(Assert.java:633)
>         at 
> org.apache.geode.internal.offheap.OffHeapStorageJUnitTest.testCreateOffHeapStorage(OffHeapStorageJUnitTest.java:220)
> {noformat}
> The cause is in {{MemoryAllocatorImpl}}. A new scheduled executor 
> {{updateNonRealTimeStatsExecutor}} was added near the bottom of the 
> constructor which immediately gets scheduled to invoke 
> {{freeList::updateNonRealTimeStats}} repeatedly at the specified frequency of 
> {{updateOffHeapStatsFrequencyMs}} whenever an instance of 
> {{MemoryAllocatorImpl}} is created.
> {{freeList::updateNonRealTimeStats}} then updates {{setLargestFragment}} and 
> {{setFreedChunks}}.
> Scheduling or starting any sort of threads from a constructor is considered a 
> bad practice and should only be done via some sort of activation method such 
> as {{start()}} which can be invoked from the static factory method for 
> {{MemoryAllocatorImpl}}. The unit tests would then continue to construct 
> instances via a constructor that does NOT invoke {{start()}}.
> {{OffHeapStorageJUnitTest testCreateOffHeapStorage}}, and possibly other 
> tests, is written with the assumption that no other thread or component can 
> or will be updating the {{OffHeapStats}}. Hence it is then safe for the test 
> to setLargestFragment and then assert that it has the value passed in:
> {noformat}
> 219:      stats.setLargestFragment(100);
> 220:      assertEquals(100, stats.getLargestFragment());
> {noformat}
> Line 220 is the source of the assertion failure because 
> {{updateNonRealTimeStatsExecutor}} is racing with the JUnit thread and 
> changes the value of {{setLargestFragment}} immediately after the test sets 
> the value to 100, but before the test invokes the assertion.
> Looking back at the [PR test 
> results|https://cio.hdb.gemfire-ci.info/commits/pr/7407/junit?days=90] we can 
> see that stress-new-test failed 5 times with this exact failure before being 
> merged to develop:
> * 
> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646140652/
> * 
> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646154134/
>  (x2)
> * 
> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646156997/
> * 
> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-7407/test-results/repeatTest/1646170346/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to