Yongjun Zhang created IMPALA-7912:
-------------------------------------
Summary: Inconsistent and intermittent behavior of queries
Key: IMPALA-7912
URL: https://issues.apache.org/jira/browse/IMPALA-7912
Project: IMPALA
Issue Type: Improvement
Reporter: Yongjun Zhang
When investigating IMPALA-5474, which reported the different log messages of
two queries, I found inconsistent and intermittent behavior of the queries.
I added a line to log state change in client-request-state.cc:
{code:java}
void ClientRequestState::UpdateOperationState(
TOperationState::type operation_state) {
operation_state_ = operation_state;
summary_profile_->AddInfoString("Query State",
PrintThriftEnum(BeeswaxQueryState()));
VLOG_QUERY << "YJDebug UpdateOperationState: " <<
PrintThriftEnum(BeeswaxQueryState()) <<
endl;
}{code}
and a line to log value got by ImpalaServer::get_state
{code:java}
beeswax::QueryState::type ImpalaServer::get_state(const QueryHandle& handle) {
......
// Take the lock to ensure that if the client sees a query_state == EXCEPTION,
it is
// guaranteed to see the error query_status.
lock_guard<mutex> l(*request_state->lock());
beeswax::QueryState::type query_state = request_state->BeeswaxQueryState();
DCHECK_EQ(query_state == beeswax::QueryState::EXCEPTION,
!request_state->query_status().ok());
VLOG_QUERY << "YJDebug ImpalaServer::get_state: " << query_state << endl;
return query_state;
}{code}
Query1. select id from bad_column_metadata s;
A: most of the time:
{code:java}
I1129 12:09:39.639384 17555 client-request-state.cc:1232] YJDebug
UpdateOperationState: COMPILED
I1129 12:09:39.639884 17555 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 2
I1129 12:09:39.641791 17585 client-request-state.cc:1232] YJDebug
UpdateOperationState: RUNNING
I1129 12:09:39.668946 17586 client-request-state.cc:1232] YJDebug
UpdateOperationState: FINISHED
I1129 12:09:39.740308 17555 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 4
I1129 12:09:39.741384 17555 client-request-state.cc:1232] YJDebug
UpdateOperationState: EXCEPTION{code}
We can see that the query_state transitioned from COMILED to RUNNING to
FINISHED then to EXCEPTION.
B: sometimes, I saw it in the beginning after started impala shell (possibly
right after restarting impala cluster):(
{code:java}
I1129 12:15:25.937026 18234 client-request-state.cc:1232] YJDebug
UpdateOperationState: COMPILED
I1129 12:15:25.937563 18234 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 2
I1129 12:15:25.952119 18264 client-request-state.cc:1232] YJDebug
UpdateOperationState: RUNNING
I1129 12:15:26.037926 18234 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 3
I1129 12:15:26.079480 18265 client-request-state.cc:1232] YJDebug
UpdateOperationState: EXCEPTION
I1129 12:15:26.138288 18234 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 5
yzhang@yzhang-pa:~/apache/Impala/logs/cluster${code}
We can see that the query state transitioned from COMPILED to RUNNING then to
EXCEPTION.
But most of the time, it got into FINISHED state before getting into EXCEPTION
state, that's why ERROR was reported.
Query2. select id, cnt from bad_column_metadata t, (select 1 cnt) u;
{code:java}
1129 12:11:42.715994 17679 client-request-state.cc:1232] YJDebug
UpdateOperationState: COMPILED
I1129 12:11:42.716413 17679 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 2
I1129 12:11:42.717469 17721 client-request-state.cc:1232] YJDebug
UpdateOperationState: RUNNING
I1129 12:11:42.745760 17722 client-request-state.cc:1232] YJDebug
UpdateOperationState: EXCEPTION
I1129 12:11:42.816792 17679 impala-beeswax-server.cc:265] YJDebug
ImpalaServer::get_state: 5{code}
We can see that the query state transitioned from COMPILED to RUNNING then to
EXCEPTION. It persistently shows this state transition, and reports WARNING in
the end.
There are two issues here:
# Inconsistent behavior of query1 comparing with query2. Why it reached
FINISHED before getting into EXCEPTION?
# Intermittent behavior of query1: it sometimes get into FINISHED state,
sometimes get into EXCEPTION state.
The root cause of these two issues might be the same. Creating this Jira to log
the issues.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)