[ https://issues.apache.org/jira/browse/GEODE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill updated GEODE-5568:
------------------------
    Summary: Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve performance  (was: Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve worst-case performance)

> Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate
> notify/wait bugs and improve performance
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-5568
>                 URL: https://issues.apache.org/jira/browse/GEODE-5568
>             Project: Geode
>          Issue Type: Bug
>          Components: tests
>            Reporter: Michael Oleske
>            Assignee: Bill
>            Priority: Major
>              Labels: pull-request-available, swat
>          Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> h2. Original Description
> (original title of this ticket was *testCacheOpAfterQueryCancel in
> QueryMonitorDUnitTest fails intermittently*)
> *We* should remove flakiness from test
> *Before* we add more features around Query Monitor
> *Because* flaky tests do not inspire confidence
> *Notes*
> When running [PR 2311|https://github.com/apache/geode/pull/2311], a test
> failed that passed on a rerun (see
> [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-results/distributedTest/1534096152/classes/org.apache.geode.cache.query.dunit.QueryMonitorDUnitTest.html#testCacheOpAfterQueryCancel])
> The build artifacts are available
> [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-artifacts/1534096152/distributedtestfiles-geode-pr-2311.tgz]
> Since it is a timing issue, it could be a few things so not sure on best path
> forward.
> h2. Update October, 2018
> After seeing more failures on October 2, we decided to fix this bug. We
> discovered an actual product bug in the {{QueryMonitor}}.
> We also discovered that the "hot path" of monitoring and then un-monitoring
> a query scaled poorly: time complexity was O(N), where N is the number of
> queries currently being monitored. As a result, our fix entailed mostly
> rewriting the {{QueryMonitor}} class to use a more appropriate
> (off-the-shelf) data structure.
> See [GEODE-5568 Analysis and
> Results|https://docs.google.com/document/d/1FFKH2XJMshCE-dBDXtMCYSylr73H54w4TODj4BRW9Y0/edit#]
> for details.
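
The ticket text does not include the rewritten class, so the following is only a rough sketch of the general approach the summary names: scheduling one timeout task per query on a ScheduledThreadPoolExecutor and cancelling that task when the query finishes. The names QueryMonitorSketch, SketchQuery and pendingTimeouts are hypothetical and are not Geode types, and the use of setRemoveOnCancelPolicy(true) is an assumption rather than something the ticket confirms.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; this is not the Geode QueryMonitor.
final class QueryMonitorSketch {

  /** Hypothetical stand-in for a running query. */
  public static final class SketchQuery {
    private final AtomicBoolean canceled = new AtomicBoolean(false);
    // Kept on the query so unmonitor() can find the timeout task
    // without searching a collection of monitored queries.
    private volatile ScheduledFuture<?> timeoutTask;

    public boolean isCanceled() {
      return canceled.get();
    }
  }

  private final ScheduledThreadPoolExecutor executor;
  private final long maxQueryMillis;

  public QueryMonitorSketch(long maxQueryMillis) {
    this.maxQueryMillis = maxQueryMillis;
    this.executor = new ScheduledThreadPoolExecutor(1);
    // Assumption: drop cancelled tasks from the work queue right away,
    // so finished queries do not linger until their timeout elapses.
    this.executor.setRemoveOnCancelPolicy(true);
  }

  /** Starts monitoring: the query is flagged as canceled if it outlives the timeout. */
  public void monitor(SketchQuery query) {
    query.timeoutTask = executor.schedule(
        () -> query.canceled.set(true), maxQueryMillis, TimeUnit.MILLISECONDS);
  }

  /** Stops monitoring: cancels the pending timeout task, if any. */
  public void unmonitor(SketchQuery query) {
    ScheduledFuture<?> task = query.timeoutTask;
    if (task != null) {
      task.cancel(false);
    }
  }

  /** Number of timeout tasks still waiting in the executor's queue. */
  public int pendingTimeouts() {
    return executor.getQueue().size();
  }

  public void shutdown() {
    executor.shutdownNow();
  }
}
```

In this sketch the executor owns all timing, so no hand-written wait/notify loop is needed, and neither monitor() nor unmonitor() scans the set of monitored queries.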