[
https://issues.apache.org/jira/browse/IMPALA-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-9788:
----------------------------------
Attachment: statestore.log
> Weird things happen when impalad restarts with different hostname but same IP
> -----------------------------------------------------------------------------
>
> Key: IMPALA-9788
> URL: https://issues.apache.org/jira/browse/IMPALA-9788
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.4.0
> Reporter: Tim Armstrong
> Assignee: Sahil Takiar
> Priority: Critical
> Attachments: Screenshot from 2020-05-28 10-53-16.png,
> get-root-sink-resolved.txt, statestore.log
>
>
> I was messing around with running Impala in a single-node dockerized
> configuration and ran into a bunch of weirdness after I restarted the
> impalad. It got into a state where there was a new and an old statestore
> registration with the same IP/port and different hostnames (since Docker
> generates new hostnames for each incarnation of the container).
> I saw a crash in Coordinator::GetRootSink(). The cause of that is the
> coordinator treating the same impalad as two distinct backends, and sending
> two execute RPCs to the backend (this is a single node cluster).
> {noformat}
> I0528 17:32:41.760128 573 coordinator.cc:143]
> f84b158b036445ad:3a9defdf00000000] Exec()
> query_id=f84b158b036445ad:3a9defdf00000000 stmt=SELECT COUNT(*) FROM
> tpcds_kudu.call_center
> I0528 17:32:41.760670 573 coordinator.cc:463]
> f84b158b036445ad:3a9defdf00000000] starting execution on 2 backends for
> query_id=f84b158b036445ad:3a9defdf00000000
> ..
> I0528 17:32:41.762449 78 control-service.cc:153]
> f84b158b036445ad:3a9defdf00000000] ExecQueryFInstances():
> query_id=f84b158b036445ad:3a9defdf00000000 coord=a16ac03fc53b:22000
> #instances=1
> I0528 17:32:41.761706 79 control-service.cc:153]
> f84b158b036445ad:3a9defdf00000000] ExecQueryFInstances():
> query_id=f84b158b036445ad:3a9defdf00000000 coord=a16ac03fc53b:22000
> #instances=4
> ..
> Wrote minidump to
> /opt/impala/logs/minidumps/impalad/15727084-c931-49e1-62d37e86-75cfe0f6.dmp
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00000000011a0d50, pid=1, tid=0x00007f92b5e8c700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build
> 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08)
> # Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64
> compressed oops)
> # Problematic frame:
> Wrote minidump to
> /opt/impala/logs/minidumps/impalad/15727084-c931-49e1-62d37e86-75cfe0f6.dmp
> # C [impalad+0xda0d50] impala::FragmentInstanceState::GetRootSink()
> const+0x0
> #
> # Core dump written. Default location: /opt/impala/core or core.1
> #
> # An error report file with more information is saved as:
> # /opt/impala/hs_err_pid1.log
> #
> # If you would like to submit a bug report, please visit:
> # http://bugreport.java.com/bugreport/crash.jsp
> #
> {noformat}
> CC [~twm378]
> At a separate time I saw it trip the "Tried to add existing backend to
> executor group" case in ExecutorGroup::AddExecutor().
> {noformat}
> void ExecutorGroup::AddExecutor(const BackendDescriptorPB& be_desc) {
>   // be_desc.is_executor can be false for the local backend when scheduling
>   // queries to run on the coordinator host.
>   DCHECK(!be_desc.ip_address().empty());
>   Executors& be_descs = executor_map_[be_desc.ip_address()];
>   auto eq = [&be_desc](const BackendDescriptorPB& existing) {
>     // The IP addresses must already match, so it is sufficient to check
>     // the port.
>     DCHECK_EQ(existing.ip_address(), be_desc.ip_address());
>     return existing.address().port() == be_desc.address().port();
>   };
>   if (find_if(be_descs.begin(), be_descs.end(), eq) != be_descs.end()) {
>     LOG(DFATAL) << "Tried to add existing backend to executor group: "
>                 << be_desc.krpc_address();
>     return;
>   }
>   if (!CheckConsistencyOrWarn(be_desc)) {
>     LOG(WARNING) << "Ignoring inconsistent backend for executor group: "
>                  << be_desc.krpc_address();
>     return;
>   }
>   if (be_descs.empty()) {
>     executor_ip_hash_ring_.AddNode(be_desc.ip_address());
>   }
>   be_descs.push_back(be_desc);
>   executor_ip_map_[be_desc.address().hostname()] = be_desc.ip_address();
> }
> {noformat}
> I'm not sure whether using the hostname to identify impalads is even useful
> at this point; we could probably simplify this by using the IP address only.
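
To make the suggestion above concrete, here is a minimal sketch contrasting a membership map keyed by hostname and port with one keyed by IP address and port. This is not Impala code: `Registration`, `HostnameKeyedMembership`, and `AddressKeyedMembership` are hypothetical names invented for illustration, and the sketch only assumes that a restarted container re-registers with a new hostname but the same IP/port, as described in the report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a backend registration.
struct Registration {
  std::string hostname;
  std::string ip_address;
  uint16_t port;
};

using Key = std::pair<std::string, uint16_t>;

// Keyed by (hostname, port): a restart under a new hostname adds a second
// entry, so one process looks like two distinct backends.
class HostnameKeyedMembership {
 public:
  void Register(const Registration& r) { backends_[Key{r.hostname, r.port}] = r; }
  std::size_t size() const { return backends_.size(); }

 private:
  std::map<Key, Registration> backends_;
};

// Keyed by (IP address, port): a later registration for the same address
// replaces the stale one instead of coexisting with it.
class AddressKeyedMembership {
 public:
  void Register(const Registration& r) { backends_[Key{r.ip_address, r.port}] = r; }
  std::size_t size() const { return backends_.size(); }

  // Returns the current registration for the address, or nullptr if absent.
  const Registration* Find(const std::string& ip, uint16_t port) const {
    auto it = backends_.find(Key{ip, port});
    return it == backends_.end() ? nullptr : &it->second;
  }

 private:
  std::map<Key, Registration> backends_;
};
```

Registering two entries that differ only in hostname leaves the hostname-keyed map with two backends and the address-keyed map with one, holding the newer registration. Whether silently replacing the older entry is the right policy for Impala is a separate design question that the sketch does not settle.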