Re:Impala's single point of failure

Quanlong Huang Wed, 06 Dec 2017 03:37:47 -0800

I have questions about this as well.

If the catalogd restart, its catalog version will start from 0. However, the
catalog version in Impala daemon still not change. This cause the Impala
daemons not accepting catalog updates until the catalog version match. The
cluster is in an abnormal state during that time.

Thus, there's still a single point of failure in Impala. Please correct me if
I'm wrong.

Thanks,
Quanlong

At 2017-12-05 18:50:34, "Aleksei Maželis" <[email protected]> wrote:

Hi,

Referring to the chapter of impala_faq about single point of failure at
https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_faq.html#faq_ha__faq_spof:

<quote>
There is not a single point of failure in Impala. All Impala daemons are fully
able to handle incoming queries. If a machine fails however, all queries with
fragments running on that machine will fail. Because queries are expected to
return quickly, you can just rerun the query if there is a failure. See Impala
Concepts and Architecture for details about the Impala architecture.
The longer answer: Impala must be able to connect to the Hive metastore. Impala
aggressively caches metadata so the metastore host should have minimal load.
Impala relies on the HDFS NameNode, and, in CDH4, you can configure HA for
HDFS. Impala also has centralized services, known as the statestore and catalog
services, that run on one host only. Impala continues to execute queries if the
statestore host is down, but it will not get state updates. For example, if a
host is added to the cluster while the statestore host is down, the existing
instances of impalad running on the other hosts will not find out about this
new host. Once the statestore process is restarted, all the information it
serves is automatically reconstructed from all running Impala daemons.
</quote>

It appears that (despite the first sentence in the quote) the centralized
services (statestore and catalog) do represent a single point of failure. Is it
so, or am I missing something? If so, what is a workaround in case high
availability is a requirement?

Regards,
Aleksei Maželis

Re:Impala's single point of failure

Reply via email to