Mostafa Mokhtar created IMPALA-7172:
---------------------------------------
Summary: Statestore should verify that all subscribers are running
the same version of Impala
Key: IMPALA-7172
URL: https://issues.apache.org/jira/browse/IMPALA-7172
Project: IMPALA
Issue Type: New Feature
Components: Distributed Exec
Affects Versions: Impala 2.13.0
Reporter: Mostafa Mokhtar
While running a metadata test that uses sync_ddl=1, the tests appeared to hang
indefinitely.
It turned out that one of the Impala daemons was running an older build, which
caused statestore topic updates to fail continuously.
Ideally the statestore (SS) should track the version of every subscriber and
blacklist those whose version does not match that of the SS and the catalog
server (CS).
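The proposed check could look roughly like the sketch below. It is only an illustration of the idea, not Impala code: the class VersionRegistry, its methods, the subscriber IDs, and the version string "2.12.0" are all hypothetical, and it assumes each subscriber reports a build version string when it registers.

```python
from dataclasses import dataclass, field


@dataclass
class VersionRegistry:
    """Hypothetical tracker for the build version each subscriber reports."""
    expected_version: str
    versions: dict = field(default_factory=dict)
    blacklisted: set = field(default_factory=set)

    def register(self, subscriber_id: str, version: str) -> bool:
        """Record the reported version; blacklist the subscriber on mismatch."""
        self.versions[subscriber_id] = version
        if version != self.expected_version:
            self.blacklisted.add(subscriber_id)
            return False
        # A subscriber that re-registers with the right build is allowed back.
        self.blacklisted.discard(subscriber_id)
        return True

    def update_targets(self) -> list:
        """Subscribers that should still receive topic updates."""
        return sorted(s for s in self.versions if s not in self.blacklisted)


# Assumed example values: one daemon on the expected build, one on an older one.
registry = VersionRegistry(expected_version="2.13.0")
registry.register("impalad@host-a:22000", "2.13.0")
registry.register("impalad@host-b:22000", "2.12.0")
print(registry.update_targets())
```

With a check like this, the mismatched daemon would be rejected once at registration with a clear reason, instead of failing every topic update with a Thrift protocol error.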
Logs from SS
{code}
I0614 11:11:04.410529 57312 statestore.cc:259] Preparing initial
impala-membership topic update for [email protected]:22000.
Size = 2.06 KB
I0614 11:11:04.411222 57312 client-cache.cc:82] ReopenClient(): re-creating
client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411821 57312 client-cache.h:304] RPC Error: Client for
vb0204.halxg.cloudera.com:23000 hit an unexpected exception: No more data to
read., type: N6apache6thrift9transport19TTransportExceptionE, rpc:
N6impala20TUpdateStateResponseE, send: done
I0614 11:11:04.411831 57312 client-cache.cc:174] Broken Connection, destroy
client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411861 57312 statestore.cc:891] Unable to send priority topic
update message to subscriber [email protected]:22000, received
error: RPC Error: Client for vb0204.halxg.cloudera.com:23000 hit an unexpected
exception: No more data to read., type:
N6apache6thrift9transport19TTransportExceptionE, rpc:
N6impala20TUpdateStateResponseE, send: done
{code}
Log from Impalad
{code}
I0614 11:03:19.479164 41915 thrift-util.cc:123] TAcceptQueueServer exception:
N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680028 41916 thrift-util.cc:123] TAcceptQueueServer exception:
N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680776 41917 thrift-util.cc:123] TAcceptQueueServer exception:
N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.881295 41918 thrift-util.cc:123] TAcceptQueueServer exception:
N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
{code}