[ 
https://issues.apache.org/jira/browse/IMPALA-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18049048#comment-18049048
 ] 

ASF subversion and git services commented on IMPALA-14651:
----------------------------------------------------------

Commit 52403541f2e11b6eeaaac849b2a3c739e80a6c2d in impala's branch 
refs/heads/master from Arnab Karmakar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=52403541f ]

IMPALA-14651: Fix flaky test_loaded_tables_metric due to report delay

test_loaded_tables_metric(), added in IMPALA-13863, was failing
intermittently because it didn't account for the random delay in
ImpaladTableUsageTracker's table usage reporting. The tracker sleeps
for [0.5, 1.5) * REPORT_INTERVAL_MS (5-15s, since REPORT_INTERVAL_MS
is 10s) before sending usage reports to catalogd, and the TTL
countdown begins only after the report is sent.

The test waited for timeout * 2, but the actual maximum wait is the sum of:
- Invalidation TTL: timeout
- Report delay: up to 15s (1.5 * 10s REPORT_INTERVAL_MS)
- Metric update + RPC/serde buffer: ~2s

Changed the wait timeout from (timeout * 2) to (timeout + 17) to properly
account for the maximum report delay plus TTL and metric update.
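
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration, not code from the commit: the constant and function
names are hypothetical, and the 10s TTL is inferred from the "20s" in the
reported failure, where the test waited for timeout * 2.

```python
# Illustrative sketch only; these names are hypothetical, not Impala's.
REPORT_INTERVAL_S = 10      # REPORT_INTERVAL_MS from the commit message, in seconds
MAX_DELAY_FACTOR = 1.5      # upper bound of the [0.5, 1.5) random factor
METRIC_UPDATE_BUFFER_S = 2  # approximate metric update + RPC/serde buffer


def max_wait_seconds(ttl_s):
    """Upper bound on the time before the loaded-tables metric should reach 0."""
    return ttl_s + MAX_DELAY_FACTOR * REPORT_INTERVAL_S + METRIC_UPDATE_BUFFER_S


ttl_s = 10                  # assumption: inferred from the 20s wait in the failure
old_wait = ttl_s * 2        # what the test used before the fix
new_wait = max_wait_seconds(ttl_s)
print(old_wait, new_wait)
```

Under these assumptions the old wait is shorter than the worst case, which
matches the intermittent failures; the bound reduces to timeout + 17.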

Change-Id: I7a0a1df5a398a0c0d74c561a0a4b4a0defbac7a7
Reviewed-on: http://gerrit.cloudera.org:8080/23822
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> TestAutomaticCatalogInvalidation.test_loaded_tables_metric() seems to be flaky
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-14651
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14651
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Fang-Yu Rao
>            Assignee: Arnab Karmakar
>            Priority: Major
>              Labels: broken-build
>
> We found at 
> [https://jenkins.impala.io/job/ubuntu-20.04-from-scratch/7562/testReport/junit/custom_cluster.test_automatic_invalidation/TestAutomaticCatalogInvalidation/test_loaded_tables_metric/]
>  that {{test_loaded_tables_metric()}} could fail with the following error.
> {code}
> custom_cluster/test_automatic_invalidation.py:206: in test_loaded_tables_metric
>     catalogd.wait_for_metric_value(metric_name, 0, timeout=self.timeout * 2)
> common/impala_service.py:164: in wait_for_metric_value
>     self.__metric_timeout_assert(metric_name, expected_value, timeout, value)
> common/impala_service.py:251: in __metric_timeout_assert
>     assert 0, assert_string
> E   AssertionError: Metric catalog.num-loaded-tables did not reach value 0 in 20s. Actual value was '1'.
> E   Dumping debug webpages in JSON format...
> E   Dumped memz JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/memz.json
> E   Dumped metrics JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/metrics.json
> E   Dumped queries JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/queries.json
> E   Dumped sessions JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/sessions.json
> E   Dumped threadz JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/threadz.json
> E   Dumped rpcz JSON to $IMPALA_HOME/logs/metric_timeout_diags_20260101_03:08:33/json/rpcz.json
> E   Dumping minidumps for impalads/catalogds...
> E   Dumped minidump for Impalad PID 2593780
> E   Dumped minidump for Impalad PID 2593782
> E   Dumped minidump for Impalad PID 2593784
> E   Dumped minidump for Catalogd PID 2593752
> {code}



