[ 
https://issues.apache.org/jira/browse/IMPALA-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated IMPALA-13408:
------------------------------------
    Description: 
The cluster_id flag was introduced in IMPALA-12426 to identify Impala clusters 
in systems where a single query_log table could be shared.

In IMPALA-13208 the cluster_id flag was reused as a prefix to topic names for 
backend membership, to allow sub-clusters of backends within a Statestore 
service.

There have been some problems with the interaction of these two usages. An 
important difference is that the query_log cluster_id must be set only on 
coordinators, whereas the topic prefix cluster_id must be set simultaneously on 
coordinators, executors, and admission daemons (if present). If a system is 
started with cluster_id set only on coordinators then there are split-brain 
problems where coordinators and executors are tracked in different topics.
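
The split-brain condition can be sketched in a few lines of Python (Impala itself is C++). This is only an illustration under stated assumptions: the base topic name, the "-" separator, and the helpers membership_topic and topics_in_use are hypothetical and are not taken from the Impala source.

```python
# Hypothetical model: each daemon derives its membership topic name from
# its own cluster_id flag value. Names and separator are illustrative only.
BASE_TOPIC = "impala-membership"  # assumed base topic name


def membership_topic(cluster_id: str) -> str:
    """Return the topic a daemon would use for a given cluster_id value."""
    return f"{cluster_id}-{BASE_TOPIC}" if cluster_id else BASE_TOPIC


def topics_in_use(daemon_flags: dict[str, str]) -> dict[str, list[str]]:
    """Group daemon names by the membership topic each one would use."""
    groups: dict[str, list[str]] = {}
    for daemon, cluster_id in daemon_flags.items():
        groups.setdefault(membership_topic(cluster_id), []).append(daemon)
    return groups


# cluster_id set only on the coordinator, as one might do for query_log.
flags = {"coordinator-1": "production", "executor-1": "", "executor-2": ""}
groups = topics_in_use(flags)

# More than one topic means the daemons are tracked separately.
split_brain = len(groups) > 1
```

With these inputs the coordinator lands in a prefixed topic while both executors stay in the unprefixed one, which is the mismatch described above.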

In addition, the query_log cluster_id is more likely to be user-settable as it 
is used for data in query_log which will be read by humans, who may want to 
write queries selecting data from their ‘production’ or ‘dev’ clusters.

Avoid these problems by using a separate flag for the topic prefix cluster_id, 
perhaps ‘cluster_membership_topic_id’.
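
As a rough sketch of what separating the two settings could allow, the Python below models per-daemon flags and checks that the topic prefix agrees across all daemons while cluster_id remains free to differ. The flag name cluster_membership_topic_id is the tentative one suggested above; the DaemonFlags class and the inconsistent_topic_ids helper are hypothetical and do not describe any existing Impala code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DaemonFlags:
    """Hypothetical per-daemon flag values, for illustration only."""
    role: str
    cluster_id: str = ""                   # query_log label, coordinators only
    cluster_membership_topic_id: str = ""  # tentative flag name from this issue


def inconsistent_topic_ids(daemons: list[DaemonFlags]) -> list[str]:
    """Return the distinct topic ids if daemons disagree, else an empty list."""
    ids = sorted({d.cluster_membership_topic_id for d in daemons})
    return ids if len(ids) > 1 else []


# cluster_id is set only on the coordinator; the topic id is left at its
# default everywhere, so membership tracking is unaffected.
daemons = [
    DaemonFlags(role="coordinator", cluster_id="production"),
    DaemonFlags(role="executor"),
    DaemonFlags(role="admissiond"),
]
mismatch = inconsistent_topic_ids(daemons)
```

Here mismatch is empty because the user-facing cluster_id no longer influences the membership topic.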

  was:
The cluster_id flag was introduced in IMPALA-12426 to identify Impala clusters 
in systems where a single query_log table could be shared.

In IMPALA-13208 the cluster_id flag was reused as a prefix to topic names for 
backend membership, to allow sub-clusters of backends within a Statstore 
service.

There have been some problems with the interaction of these two usages. An 
important difference is that the query_log cluster_id must be set only on 
coordinators, whereas the topic prefix cluster_id must be set simultaneously on 
coordinators, executors, and admission daemons (if present). If a system is 
started with cluster_id set only on coordinators then there are split-brain 
problems where coordinators and executors are tracked in different topics.

In addition, the query_log cluster_id is more likely to be user-settable as it 
is used for data in query_log which will be read by humans, who may want to 
write queries selecting data from their ‘production’ or ‘dev’ clusters.

Avoid these problems by using a separate flag for the  topic prefix cluster_id, 
perhaps ‘cluster_membership_topic_id’


>  use a specific flag for the topic prefix cluster identifier.
> -------------------------------------------------------------
>
>                 Key: IMPALA-13408
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13408
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.5.0
>            Reporter: Andrew Sherman
>            Assignee: Andrew Sherman
>            Priority: Critical
>
> The cluster_id flag was introduced in IMPALA-12426 to identify Impala 
> clusters in systems where a single query_log table could be shared.
> In IMPALA-13208 the cluster_id flag was reused as a prefix to topic names for 
> backend membership, to allow sub-clusters of backends within a Statestore 
> service.
> There have been some problems with the interaction of these two usages. An 
> important difference is that the query_log cluster_id must be set only on 
> coordinators, whereas the topic prefix cluster_id must be set simultaneously 
> on coordinators, executors, and admission daemons (if present). If a system 
> is started with cluster_id set only on coordinators then there are 
> split-brain problems where coordinators and executors are tracked in 
> different topics.
> In addition, the query_log cluster_id is more likely to be user-settable as 
> it is used for data in query_log which will be read by humans, who may want 
> to write queries selecting data from their ‘production’ or ‘dev’ clusters.
> Avoid these problems by using a separate flag for the topic prefix 
> cluster_id, perhaps ‘cluster_membership_topic_id’.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)