[
https://issues.apache.org/jira/browse/IMPALA-12426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888070#comment-17888070
]
ASF subversion and git services commented on IMPALA-12426:
----------------------------------------------------------
Commit 653e5388dd34252c3c6357172a4b81f030b4651f in impala's branch
refs/heads/master from Andrew Sherman
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=653e5388d ]
IMPALA-13408: use a specific flag for the topic prefix cluster identifier.

The cluster_id flag was introduced in IMPALA-12426 to identify Impala
clusters in systems where a single query_log table could be shared. In
IMPALA-13208 the cluster_id flag was reused as a prefix to topic names
for backend membership, to allow sub-clusters of backends within a
Statestore service.

The two usages have interacted badly. An important difference is that
the query_log cluster_id needs to be set only on coordinators, whereas
the topic prefix cluster_id must be set simultaneously on coordinators,
executors, and admission daemons (if present). If a system is started
with cluster_id set only on coordinators, there are split-brain
problems where coordinators and executors are tracked in different
topics. In addition, the query_log cluster_id is more likely to be
user-settable, as it is used for data in query_log which will be read
by humans, who may want to write queries selecting data from their
‘production’ or ‘dev’ clusters.

Avoid these problems by using a separate flag,
‘cluster_membership_topic_id’, for the topic prefix.

Change-Id: Icd3f7e1c73c00a7aaeee79ecb461209e3939c422
Reviewed-on: http://gerrit.cloudera.org:8080/21867
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
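
As a rough illustration of the split-brain case described in the commit
message, the sketch below groups daemons by the membership topic they
would subscribe to. The base topic name, the way the prefix is joined to
it, and the daemon names are all assumptions made for this example, not
Impala's actual naming; only the idea of a per-daemon topic prefix comes
from the commit message.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical base topic name, used only for illustration.
BASE_TOPIC = "membership"


def topic_name(prefix: str) -> str:
    """Return the topic a daemon would use for a given prefix (assumed scheme)."""
    return f"{prefix}-{BASE_TOPIC}" if prefix else BASE_TOPIC


def group_by_topic(daemon_prefixes: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each topic name to the sorted list of daemons that would join it."""
    topics = defaultdict(list)
    for daemon, prefix in daemon_prefixes.items():
        topics[topic_name(prefix)].append(daemon)
    return {topic: sorted(members) for topic, members in topics.items()}


# Prefix set only on the coordinator: two disjoint topics (split brain).
split = group_by_topic(
    {"coordinator-1": "prod", "executor-1": "", "executor-2": ""}
)

# Prefix set on every daemon: a single shared topic.
consistent = group_by_topic(
    {"coordinator-1": "prod", "executor-1": "prod", "executor-2": "prod"}
)
```

In the first configuration the coordinator ends up alone in its topic and
never sees the executors, which is the situation a dedicated flag set on
all daemons is meant to avoid.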
> SQL Interface to Completed Queries/DDLs/DMLs
> --------------------------------------------
>
> Key: IMPALA-12426
> URL: https://issues.apache.org/jira/browse/IMPALA-12426
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Reporter: Jason Fehr
> Assignee: Jason Fehr
> Priority: Major
> Labels: impala, workload-management
> Fix For: Impala 4.4.0
>
>
> Implement a way of querying (via SQL) information about completed
> queries/DDLs/DMLs. Add coordinator startup flags that let users specify that
> Impala will track completed queries in an internal table.
> Impala will create and maintain an internal Iceberg table named
> "impala_query_log" in the "system database" that contains all completed
> queries. This table is automatically created at startup by each coordinator
> if it does not exist. Then, each completed query is queued in memory and
> flushed to the query history table either at a set interval (user-specified
> number of minutes) or when a user-specified number of completed queries are
> queued in memory. Partition this table by the hour of the query end time.
> Data in this table must match the corresponding data in the query profile.
> Develop automated testing that asserts this requirement is true.
> Don't write USE, SHOW, and SET queries to this table.
> Add the following metrics to the "impala-server" metrics group:
> * Number of completed queries queued in memory waiting to be written to the
> table.
> * Number of completed queries successfully written to the table.
> * Number of attempts that failed to write completed queries to the table.
> * Number of times completed queries were written at the regularly scheduled
> time.
> * Number of times completed queries were written before the scheduled time
> because the max number of queued records was reached.
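
The queue-and-flush behaviour described in the issue could be sketched
roughly as follows. This is not Impala's implementation: the class name,
the metric keys, the record layout, and the choice to keep records queued
after a failed write are all assumptions made for illustration. The two
flush triggers (interval elapsed, maximum queued records reached), the
skipped statement types, and the kinds of counters follow the issue text.

```python
import time
from typing import Callable, Dict, List

# Statement types the issue says must not be written to the table.
SKIPPED_STATEMENTS = ("use", "show", "set")


class CompletedQueryBuffer:
    """Queues completed-query records; flushes on a size or time trigger."""

    def __init__(self, write: Callable[[List[Dict]], None], max_queued: int,
                 interval_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._write = write            # hypothetical table writer
        self._max_queued = max_queued
        self._interval_s = interval_s
        self._clock = clock
        self._queue: List[Dict] = []
        self._last_flush = clock()
        self.metrics = {"written": 0, "write_failures": 0,
                        "scheduled_flushes": 0, "max_queued_flushes": 0}

    @property
    def queued(self) -> int:
        """Number of records waiting in memory."""
        return len(self._queue)

    def add(self, record: Dict) -> None:
        """Queue a record, flushing early if the size limit is reached."""
        words = record["sql"].split()
        if words and words[0].lower() in SKIPPED_STATEMENTS:
            return
        self._queue.append(record)
        if len(self._queue) >= self._max_queued:
            self._flush("max_queued_flushes")

    def tick(self) -> None:
        """Call periodically; flushes once the interval has elapsed."""
        if self._queue and self._clock() - self._last_flush >= self._interval_s:
            self._flush("scheduled_flushes")

    def _flush(self, reason: str) -> None:
        batch, self._queue = self._queue, []
        self._last_flush = self._clock()
        try:
            self._write(batch)
        except Exception:
            self.metrics["write_failures"] += 1
            self._queue = batch        # assumption: keep records for a retry
            return
        self.metrics["written"] += len(batch)
        self.metrics[reason] += 1
```

A caller would invoke add() as each query completes and tick() from a
timer; the clock parameter exists only so the time trigger can be
exercised in tests without sleeping.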