[ 
https://issues.apache.org/jira/browse/IMPALA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18049049#comment-18049049
 ] 

ASF subversion and git services commented on IMPALA-13863:
----------------------------------------------------------

Commit 52403541f2e11b6eeaaac849b2a3c739e80a6c2d in impala's branch 
refs/heads/master from Arnab Karmakar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=52403541f ]

IMPALA-14651: Fix flaky test_loaded_tables_metric due to report delay

test_loaded_tables_metric(), added in IMPALA-13863, was failing
intermittently because it didn't account for the random delay in
ImpaladTableUsageTracker's table usage reporting.
The tracker sleeps for [0.5, 1.5) * REPORT_INTERVAL_MS (5-15s with the
10s interval) before sending usage reports to catalogd, after which the
TTL countdown begins.

The test was waiting for timeout * 2, but the actual worst-case time is
the sum of:
- Invalidation TTL: timeout
- Report delay: up to 15s (1.5 * 10s REPORT_INTERVAL_MS)
- Metric update + RPC/serde buffer: ~2s

Changed the wait timeout from (timeout * 2) to (timeout + 17) to properly
account for the maximum report delay plus TTL and metric update.
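
As a rough illustration of the arithmetic above, here is a minimal Python
sketch. The function and constant names are hypothetical and only mirror the
numbers quoted in this commit message; they are not taken from the actual
test code.

```python
# Illustrative only: names and constants mirror the commit message,
# not the actual Impala test or tracker code.
REPORT_INTERVAL_S = 10          # REPORT_INTERVAL_MS expressed in seconds
MAX_REPORT_DELAY_FACTOR = 1.5   # tracker sleeps for [0.5, 1.5) * interval
METRIC_UPDATE_BUFFER_S = 2      # metric update + RPC/serde slack (~2s)


def max_wait_s(invalidation_timeout_s):
    """Worst-case seconds: report delay + invalidation TTL + metric buffer."""
    max_report_delay_s = MAX_REPORT_DELAY_FACTOR * REPORT_INTERVAL_S
    return invalidation_timeout_s + max_report_delay_s + METRIC_UPDATE_BUFFER_S


def old_wait_s(invalidation_timeout_s):
    """The wait the test used before this change (timeout * 2)."""
    return invalidation_timeout_s * 2
```

With these numbers, timeout * 2 is shorter than timeout + 17 whenever the
invalidation timeout is below 17s, which is consistent with the test failing
only when the random report delay happened to be long.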

Change-Id: I7a0a1df5a398a0c0d74c561a0a4b4a0defbac7a7
Reviewed-on: http://gerrit.cloudera.org:8080/23822
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Show number of loaded tables in metrics
> ---------------------------------------
>
>                 Key: IMPALA-13863
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13863
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Arnab Karmakar
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>         Attachments: Screenshot 2025-12-23 at 1.40.26 AM.png
>
>
> It'd be helpful to show the number of loaded tables (i.e. not 
> IncompleteTable) in catalogd since there are some mechanisms that will 
> implicitly invalidate tables, e.g. invalidate_tables_on_memory_pressure, 
> invalidate_tables_timeout_s, invalidate_metadata_on_event_processing_failure.
> If few tables are actually loaded, query performance suffers because many 
> queries will sit in the CREATED state waiting for catalogd to load the 
> metadata of their tables. In that case we should tune catalogd, e.g. by 
> bumping the JVM heap size.
> There are two places where we can track the total number of loaded tables:
>  # While catalogd is collecting catalog updates in getCatalogDelta(), it 
> iterates through all the tables and can count the loaded ones. However, this 
> takes time and some tables might change state during the iteration.
>  # When a table is loaded and replaces an IncompleteTable, we bump the 
> count, and we decrease the count when a loaded table is invalidated.
> The 2nd option can show the real-time count in metrics. The 1st option can be 
> used to improve logging, e.g. add a log saying "saw N loaded tables".
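
The 2nd option in the description above can be sketched roughly as follows.
This is a hypothetical Python illustration of the bump/decrease idea only; the
class and method names are assumptions, and catalogd's actual implementation
may look quite different.

```python
import threading


class LoadedTableCounter:
    """Hypothetical counter for loaded tables; not Impala's actual code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._loaded = 0

    def on_table_replaced(self, old_is_loaded, new_is_loaded):
        """Adjust the count when one table object replaces another."""
        with self._lock:
            if new_is_loaded and not old_is_loaded:
                self._loaded += 1   # IncompleteTable replaced by a loaded table
            elif old_is_loaded and not new_is_loaded:
                self._loaded -= 1   # loaded table invalidated
            # A loaded table replaced by another loaded table (e.g. a reload)
            # leaves the count unchanged.

    def value(self):
        """Return the current count, suitable for exposing as a metric."""
        with self._lock:
            return self._loaded
```

Updating the count only on state transitions keeps the metric current without
iterating over all tables, which is the trade-off the description draws
against counting in getCatalogDelta().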



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
