[
https://issues.apache.org/jira/browse/IMPALA-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Sherman updated IMPALA-13408:
------------------------------------
Description:
The cluster_id flag was introduced in IMPALA-12426 to identify Impala clusters
in systems where a single query_log table could be shared.
In IMPALA-13208 the cluster_id flag was reused as a prefix to topic names for
backend membership, to allow sub-clusters of backends within a Statestore
service.
The two usages interact badly. An important difference is that the query_log
cluster_id must be set only on coordinators, whereas the topic prefix
cluster_id must be set simultaneously on coordinators, executors, and
admission daemons (if present). If a system is started with cluster_id set
only on coordinators, there are split-brain problems where coordinators and
executors are tracked in different topics.
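The split-brain case can be sketched in a few lines. This is only an illustrative model, not Impala's code: the base topic name, the "<cluster_id>-<topic>" prefix format, and the helper names membership_topic and group_by_topic are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed base name of the backend membership topic (illustrative only).
BASE_TOPIC = "impala-membership"


def membership_topic(cluster_id: str) -> str:
    """Hypothetical helper: prefix the topic name with cluster_id when it is set."""
    return f"{cluster_id}-{BASE_TOPIC}" if cluster_id else BASE_TOPIC


def group_by_topic(daemons: dict[str, str]) -> dict[str, list[str]]:
    """Group daemon names by the membership topic each one would use."""
    topics = defaultdict(list)
    for name, cluster_id in daemons.items():
        topics[membership_topic(cluster_id)].append(name)
    return dict(topics)


# cluster_id is set only on the coordinator, as one might do for query_log.
daemons = {"coordinator-1": "prod", "executor-1": "", "executor-2": ""}
topics = group_by_topic(daemons)

# More than one topic means coordinators and executors cannot see each other.
split_brain = len(topics) > 1
print(topics, split_brain)
```

In this model the coordinator ends up alone in its own prefixed topic while the executors remain in the unprefixed one, which is the situation described above.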
In addition, the query_log cluster_id is more likely to be user-settable, as it
is used for data in query_log that will be read by humans, who may want to
write queries selecting data from their ‘production’ or ‘dev’ clusters.
Avoid these problems by using a separate flag for the topic prefix cluster_id,
perhaps ‘cluster_membership_topic_id’.
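A rough sketch of the proposed separation follows. It assumes the tentative flag name suggested above, an assumed base topic name, and an assumed "<prefix>-<topic>" format; parse_flags and membership_topic are hypothetical helpers, and Python's argparse stands in for the real flag handling.

```python
import argparse

# Assumed base name of the backend membership topic (illustrative only).
BASE_TOPIC = "impala-membership"


def parse_flags(argv: list[str]) -> argparse.Namespace:
    """Parse the two independent flags for one daemon (hypothetical model)."""
    parser = argparse.ArgumentParser()
    # Used only to label rows in query_log; set on coordinators.
    parser.add_argument("--cluster_id", default="")
    # Tentative name from this issue; used only as the topic prefix.
    parser.add_argument("--cluster_membership_topic_id", default="")
    return parser.parse_args(argv)


def membership_topic(flags: argparse.Namespace) -> str:
    """Derive the topic from the dedicated flag, ignoring cluster_id."""
    prefix = flags.cluster_membership_topic_id
    return f"{prefix}-{BASE_TOPIC}" if prefix else BASE_TOPIC


# cluster_id is set only on the coordinator; the topic flag is left unset.
coordinator = parse_flags(["--cluster_id=prod"])
executor = parse_flags([])

# Both daemons now resolve to the same membership topic.
same_topic = membership_topic(coordinator) == membership_topic(executor)
print(membership_topic(coordinator), same_topic)
```

With the two concerns split, a user can change cluster_id on coordinators for query_log purposes without affecting which topic the backends register in.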
was:
The cluster_id flag was introduced in IMPALA-12426 to identify Impala clusters
in systems where a single query_log table could be shared.
In IMPALA-13208 the cluster_id flag was reused as a prefix to topic names for
backend membership, to allow sub-clusters of backends within a Statstore
service.
There have been some problems with the interaction of these two usages. An
important difference is that the query_log cluster_id must be set only on
coordinators, whereas the topic prefix cluster_id must be set simultaneously on
coordinators, executors, and admission daemons (if present). If a system is
started with cluster_id set only on coordinators then there are split-brain
problems where coordinators and executors are tracked in different topics.
In addition, the query_log cluster_id is more likely to be user-settable as it
is used for data in query_log which will be read by humans, who may want to
write queries selecting data from their ‘production’ or ‘dev’ clusters.
Avoid these problems by using a separate flag for the topic prefix cluster_id,
perhaps ‘cluster_membership_topic_id’
> use a specific flag for the topic prefix cluster identifier.
> -------------------------------------------------------------
>
> Key: IMPALA-13408
> URL: https://issues.apache.org/jira/browse/IMPALA-13408
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.5.0
> Reporter: Andrew Sherman
> Assignee: Andrew Sherman
> Priority: Critical