Hi, Referring to the chapter of impala_faq about single point of failure at https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_faq.html#faq_ha__faq_spof :
<quote> > There is not a single point of failure in Impala. All Impala daemons are > fully able to handle incoming queries. If a machine fails however, all > queries with fragments running on that machine will fail. Because queries > are expected to return quickly, you can just rerun the query if there is a > failure. See Impala Concepts and Architecture for details about the Impala > architecture. > The longer answer: Impala must be able to connect to the Hive metastore. > Impala aggressively caches metadata so the metastore host should have > minimal load. Impala relies on the HDFS NameNode, and, in CDH4, you can > configure HA for HDFS. Impala also has centralized services, known as the > statestore and catalog services, that run on one host only. Impala > continues to execute queries if the statestore host is down, but it will > not get state updates. For example, if a host is added to the cluster while > the statestore host is down, the existing instances of impalad running on > the other hosts will not find out about this new host. Once the statestore > process is restarted, all the information it serves is automatically > reconstructed from all running Impala daemons. </quote> It appears that (despite the first sentence in the quote) the centralized services (statestore and catalog) do represent a single point of failure. Is it so, or am I missing something? If so, what is a workaround in case high availability is a requirement? Regards, Aleksei Maželis
