[ https://issues.apache.org/jira/browse/IMPALA-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744426#comment-17744426 ]
ASF subversion and git services commented on IMPALA-12295:
----------------------------------------------------------
Commit 97e44c11923f3d28e08aba1b5dd66b8a35465deb in impala's branch
refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97e44c119 ]
IMPALA-12295: Statestore crashed when restarting catalogd
Statestore hit a DCHECK when re-registering catalogd while catalogd
HA was not enabled. Re-registering a catalogd should not increase the
number of registered catalogds.
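
The intended behavior can be illustrated with a minimal sketch. This is
not the code from the commit: the class CatalogdRegistry, its methods,
and the address string are hypothetical names chosen for illustration,
and the sketch assumes the non-HA case where at most one catalogd is
tracked. Only the invariant (registered count < 2) comes from the DCHECK
in the report.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the statestore's catalogd bookkeeping when
// catalogd HA is disabled. Not the actual Impala class.
class CatalogdRegistry {
 public:
  // Records a registration request from a catalogd at `address`.
  // Returns true for a first-time registration and false for a
  // re-registration (for example, after catalogd was restarted).
  bool Register(const std::string& address) {
    // Mirrors the invariant from the failed check in the report.
    assert(num_registered_ < 2);
    const bool first_time = (num_registered_ == 0);
    if (first_time) {
      // Only a first-time registration increases the count.
      ++num_registered_;
    }
    // A restarted catalogd may come back at the same or a new address;
    // either way it replaces the previous entry instead of adding one.
    address_ = address;
    return first_time;
  }

  int num_registered() const { return num_registered_; }
  const std::string& address() const { return address_; }

 private:
  std::string address_;
  int num_registered_ = 0;
};
```

With this shape, calling Register() any number of times leaves
num_registered() at 1, so the invariant holds across catalogd restarts.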
The issue can be reproduced with the test case
test_restart_services.py::TestRestart::test_restart_catalog by
increasing the value of statestore_heartbeat_frequency_ms.
Testing:
- Verified that the issue no longer occurs in the test case
test_restart_services.py::TestRestart::test_restart_catalog.
- Passed core tests.
Change-Id: I031f0c6d895601e7ea8b15005a3ad52bd3254e7c
Reviewed-on: http://gerrit.cloudera.org:8080/20217
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>
> Statestore crashed when restarting catalogd
> -------------------------------------------
>
> Key: IMPALA-12295
> URL: https://issues.apache.org/jira/browse/IMPALA-12295
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Quanlong Huang
> Assignee: Wenzhe Zhou
> Priority: Critical
>
> I restarted catalogd in my dev env and saw statestore crash.
> {noformat}
> $ bin/start-impala-cluster.py --restart_catalogd_only
> 19:39:44 MainThread: Found 3 impalad/1 statestored/0 catalogd process(es)
> 19:39:44 MainThread: Starting Catalog Service logging to /home/quanlong/workspace/Impala/logs/cluster/catalogd.INFO
> 19:39:47 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:48 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:49 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:50 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:51 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:53 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:54 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:55 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:56 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:57 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:58 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
> 19:39:58 MainThread: Error starting cluster
> Traceback (most recent call last):
>   File "bin/start-impala-cluster.py", line 930, in <module>
>     expected_cluster_size - expected_catalog_delays)
>   File "/home/quanlong/workspace/Impala/tests/common/impala_cluster.py", line 185, in wait_until_ready
>     self.wait_for_num_impalads(expected_num_impalads)
>   File "/home/quanlong/workspace/Impala/tests/common/impala_cluster.py", line 231, in wait_for_num_impalads
>     raise RuntimeError(msg)
> RuntimeError: statestored failed to start.
> {noformat}
> statestored.ERROR shows the DCHECK failure:
> {noformat}
> F0718 19:39:47.096524 11460 statestore-catalogd-mgr.cc:58] Check failed: num_registered_catalogd_ < 2
> *** Check failure stack trace: ***
> @ 0x38371ed google::LogMessage::Fail()
> @ 0x3839124 google::LogMessage::SendToLog()
> @ 0x3836bcc google::LogMessage::Flush()
> @ 0x3839649 google::LogMessageFatal::~LogMessageFatal()
> @ 0x17c3540 impala::StatestoreCatalogdMgr::RegisterCatalogd()
> @ 0x17a32fe impala::Statestore::RegisterSubscriber()
> @ 0x17c27ed StatestoreThriftIf::RegisterSubscriber()
> @ 0x17c0a92 impala::StatestoreServiceProcessorT<>::process_RegisterSubscriber()
> @ 0x17c30b3 impala::StatestoreServiceProcessorT<>::dispatchCall()
> @ 0xee68df apache::thrift::TDispatchProcessor::process()
> @ 0x131e224 apache::thrift::server::TAcceptQueueServer::Task::run()
> @ 0x130ab89 impala::ThriftThread::RunRunnable()
> @ 0x130c7b1 boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x1930e30 impala::Thread::SuperviseThread()
> @ 0x1931c39 boost::detail::thread_data<>::run()
> @ 0x2359067 thread_proxy
> @ 0x7f5cbef5a6db start_thread
> @ 0x7f5cbbcc061f clone
> {noformat}
> The cluster runs without catalogd HA.