[ 
https://issues.apache.org/jira/browse/IMPALA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arnab Karmakar resolved IMPALA-13863.
-------------------------------------
    Resolution: Fixed

> Show number of loaded tables in metrics
> ---------------------------------------
>
>                 Key: IMPALA-13863
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13863
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Arnab Karmakar
>            Priority: Major
>         Attachments: Screenshot 2025-12-23 at 1.40.26 AM.png
>
>
> It'd be helpful to show the number of loaded tables (i.e. not 
> IncompleteTable) in catalogd since there are some mechanisms that will 
> implicitly invalidate tables, e.g. invalidate_tables_on_memory_pressure, 
> invalidate_tables_timeout_s, invalidate_metadata_on_event_processing_failure.
> If few tables are actually loaded, query performance suffers because many 
> queries stay in the CREATED state, waiting for catalogd to load the 
> metadata of their tables. In that case we should tune catalogd, e.g. by 
> increasing the JVM heap size.
> There are two places where we can track the total number of loaded tables:
>  # While catalogd is collecting catalog updates in getCatalogDelta(), it 
> iterates through all the tables and can count the loaded ones. However, this 
> takes time, and some tables might change state during the iteration.
>  # When a table is loaded and replaces an IncompleteTable, we bump the 
> count, and we decrease the count when a loaded table is invalidated.
> The 2nd option can show the real-time count in metrics. The 1st option can be 
> used to improve logging, e.g. by adding a log line saying "saw N tables are loaded".
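
The two options in the description can be sketched as follows. This is only an
illustration under stated assumptions, not the change that was committed for
this issue: the class LoadedTableCounter, its methods, and the boolean "loaded"
flag (standing in for "not an IncompleteTable") are all hypothetical names and
do not come from the Impala code base.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of tracking the number of loaded tables.
 * A value of false models an IncompleteTable placeholder, true a loaded table.
 */
final class LoadedTableCounter {
    // Table name -> whether the cached entry is fully loaded.
    private final Map<String, Boolean> loadedByName = new ConcurrentHashMap<>();
    // Option 2: counter adjusted on every state transition.
    private final AtomicLong numLoaded = new AtomicLong();

    /** Adds or replaces a table entry and adjusts the counter on a state change. */
    void put(String name, boolean loaded) {
        Boolean prev = loadedByName.put(name, loaded);
        boolean wasLoaded = prev != null && prev;
        // Incomplete (or absent) -> loaded: bump the count.
        if (loaded && !wasLoaded) numLoaded.incrementAndGet();
        // Loaded -> incomplete (invalidation): decrease the count.
        if (!loaded && wasLoaded) numLoaded.decrementAndGet();
    }

    /** Removes a table entry, e.g. when the table is dropped. */
    void remove(String name) {
        Boolean prev = loadedByName.remove(name);
        if (prev != null && prev) numLoaded.decrementAndGet();
    }

    /** Option 2: cheap read suitable for a metrics gauge. */
    long getNumLoaded() {
        return numLoaded.get();
    }

    /**
     * Option 1: count by scanning all entries, as a full iteration would.
     * The result may be stale if entries change state during the scan.
     */
    long countByScan() {
        return loadedByName.values().stream().filter(b -> b).count();
    }
}
```

The map update and the counter update are two separate steps, so under
concurrent changes the counter can briefly lag the map. Each transition is
still counted exactly once, because the adjustment is derived from the previous
value that the atomic put() or remove() call returns.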



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
