[
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767728#comment-16767728
]
Tim Armstrong commented on IMPALA-7665:
---------------------------------------
The part I wasn't sure about is whether emulating the statestore behaviour on
the Impalad is a good thing. I understand we want a grace period to allow other
impalads to connect back to the statestore and for the topic update to go out,
but it's less clear how much additional delay we want and if it needs to be the
same value. I guess the idea is if RPCs are failing or delayed, we want to be a
little slow in considering the impalad to be dead, same as the statestore?
We also won't have the statestore flags set on the impalad, so if we were going
to embark on that path we'd need to propagate them somehow.
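
To make the grace-period idea concrete, here is a minimal sketch of what the coordinator-side check might look like. It is written in Python for brevity and is not Impala's actual code: the class name, method names and the grace_period_s parameter are all hypothetical, and the value would still have to come from somewhere, since the statestore flags are not set on the impalad.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class MembershipGracePeriod:
    """Hypothetical helper: delays declaring backends dead after a statestore restart."""

    grace_period_s: float  # assumed to be configured on the impalad itself
    clock: Callable[[], float] = time.monotonic
    _reconnected_at: Optional[float] = field(init=False, default=None)
    _missing_since: Dict[str, float] = field(init=False, default_factory=dict)

    def on_statestore_reconnect(self) -> None:
        # A restarted statestore starts with empty membership, so restart the timers.
        self._reconnected_at = self.clock()
        self._missing_since.clear()

    def backends_to_fail(self, expected: Set[str], reported: Set[str]) -> Set[str]:
        """Return the backends that have been missing for longer than the grace period."""
        now = self.clock()
        # Forget backends that reappeared or that no running query needs any more.
        for backend in list(self._missing_since):
            if backend in reported or backend not in expected:
                del self._missing_since[backend]
        # Record when each still-missing backend was first seen as missing.
        for backend in expected - reported:
            self._missing_since.setdefault(backend, now)
        # Right after a reconnect, give the other impalads time to re-register.
        if (self._reconnected_at is not None
                and now - self._reconnected_at < self.grace_period_s):
            return set()
        return {b for b, since in self._missing_since.items()
                if now - since >= self.grace_period_s}
```

With something like this, a membership update that arrives just after the statestore comes back and lists no backends would not immediately fail running queries; only backends still missing once the grace period has elapsed would be treated as unreachable.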
> Bringing up stopped statestore causes queries to fail
> -----------------------------------------------------
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 3.1.0
> Reporter: Tim Armstrong
> Priority: Critical
> Labels: query-lifecycle, statestore
>
> I can reproduce this by running a long-running query then cycling the
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator:
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
> -log_filename=statestored
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001,
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g.
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)