[jira] [Created] (IMPALA-10788) Statestore Scalability document should mention statestore_subscriber_timeout_secs and statestore_heartbeat_tcp_timeout_seconds

Quanlong Huang (Jira) Mon, 12 Jul 2021 01:00:36 -0700

Quanlong Huang created IMPALA-10788:
---------------------------------------


             Summary: Statestore Scalability document should mention 
statestore_subscriber_timeout_secs and statestore_heartbeat_tcp_timeout_seconds
                 Key: IMPALA-10788
                 URL: https://issues.apache.org/jira/browse/IMPALA-10788
             Project: IMPALA
          Issue Type: Documentation
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


The current document about statestore scalability is 
[http://impala.apache.org/docs/build/html/topics/impala_scalability.html#ariaid-title3]

We should add statestore_heartbeat_tcp_timeout_seconds and 
statestore_subscriber_timeout_secs to the doc. They also impact the heartbeat 
timeout detection. Incorrect settings may result in false positive liveness 
dection and queries failed by "Cancelled due to unreachable impalad(s)" error.

Current document:
 ---------------------------------------------------
h3. Scalability Considerations for the Impala Statestore

Before Impala 2.1, the statestore sent only one kind of message to its 
subscribers. This message contained all updates for any topics that a 
subscriber had subscribed to. It also served to let subscribers know that the 
statestore had not failed, and conversely the statestore used the success of 
sending a heartbeat to a subscriber to decide whether or not the subscriber had 
failed.

Combining topic updates and failure detection in a single message led to 
bottlenecks in clusters with large numbers of tables, partitions, and HDFS data 
blocks. When the statestore was overloaded with metadata updates to transmit, 
heartbeat messages were sent less frequently, sometimes causing subscribers to 
time out their connection with the statestore. Increasing the subscriber 
timeout and decreasing the frequency of statestore heartbeats worked around the 
problem, but reduced responsiveness when the statestore failed or restarted.

As of Impala 2.1, the statestore now sends topic updates and heartbeats in 
separate messages. This allows the statestore to send and receive a steady 
stream of lightweight heartbeats, and removes the requirement to send topic 
updates according to a fixed schedule, reducing statestore network overhead.

The statestore now has the following relevant configuration flags for the 
statestored daemon:

{{-statestore_num_update_threads}} 
 The number of threads inside the statestore dedicated to sending topic 
updates. You should not typically need to change this value.
 *Default:* 10

{{-statestore_update_frequency_ms}} 
 The frequency, in milliseconds, with which the statestore tries to send topic 
updates to each subscriber. This is a best-effort value; if the statestore is 
unable to meet this frequency, it sends topic updates as fast as it can. You 
should not typically need to change this value.
 *Default:* 2000

{{-statestore_num_heartbeat_threads}} 
 The number of threads inside the statestore dedicated to sending heartbeats. 
You should not typically need to change this value.
 *Default:* 10

{{-statestore_heartbeat_frequency_ms}} 
 The frequency, in milliseconds, with which the statestore tries to send 
heartbeats to each subscriber. This value should be good for large catalogs and 
clusters up to approximately 150 nodes. Beyond that, you might need to increase 
this value to make the interval longer between heartbeat messages.
 *Default:* 1000 (one heartbeat message every second)

If it takes a very long time for a cluster to start up, and impala-shell 
consistently displays {{This Impala daemon is not ready to accept user 
requests}}, the statestore might be taking too long to send the entire catalog 
topic to the cluster. In this case, consider adding 
{{--load_catalog_in_background=false}} to your catalog service configuration. 
This setting stops the statestore from loading the entire catalog into memory 
at cluster startup. Instead, metadata for each table is loaded when the table 
is accessed for the first time.

-----------------------------------------------------
 *We need to add these:*
 -----------------------------------------------------
 {{-statestore_heartbeat_tcp_timeout_seconds}}
 The time after which a heartbeat RPC to a subscriber will timeout. This 
setting protects against badly hung machines that are not able to respond to 
the heartbeat RPC in short order. Increase this if there are intermittent 
heartbeat RPC timeouts shown in statestore's log. You can reference the max 
value of "statestore.priority-topic-update-durations" metric on statestore to 
get a reasonable value. Note that priority topic updates are assumed to be 
small amounts of data that take a small amount of time to process (similar to 
the heartbeat complexity).
 *Default:* 3

{{-statestore_max_missed_heartbeats}}
 Maximum number of consecutive heartbeat messages an impalad can miss before 
being declared failed by the statestore. You should not typically need to 
change this value.
 *Default:* 10

{{-statestore_subscriber_timeout_secs}}
 The amount of time (in seconds) that may elapse before the connection with the 
statestore is considered lost. This should be comparable to 
{{(statestore_heartbeat_frequency_ms + 
statestore_heartbeat_tcp_timeout_seconds) * statestore_max_missed_heartbeats}}, 
so subscribers won't reregister themselves too early and allow statestore to 
resend heartbeats. You can also reference the max value of 
"statestore-subscriber.heartbeat-interval-time" metrics on impalads to get a 
reasonable value.
 *Default:* 30
 -----------------------------------------------------



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (IMPALA-10788) Statestore Scalability document should mention statestore_subscriber_timeout_secs and statestore_heartbeat_tcp_timeout_seconds

Reply via email to