[ https://issues.apache.org/jira/browse/IMPALA-12617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795963#comment-17795963 ]

ASF subversion and git services commented on IMPALA-12617:
----------------------------------------------------------

Commit 93d4f2236ca5afb33e3770ba3814f36fe159288c in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=93d4f2236 ]

IMPALA-12617: Fix DCHECK failure for Statestore

Statestore uses thread pools to periodically send catalog topic updates
and cluster membership. It adds sending tasks to the thread pool queues
when it receives registration requests from subscribers, so the thread
pools must be initialized before the Statestore's Thrift server starts
accepting registration requests.
The current code calls ThreadPool::Init() after the Thrift server is
started. This can cause Statestore to hit a DCHECK failure when calling
ThreadPool::Offer(). The failure is more likely when Statestore HA is
enabled, since Statestore then takes longer to initialize.
This patch reorders initialization so that ThreadPool::Init() is called
before the Statestore's Thrift server is started.
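
The ordering problem described above can be illustrated with a small,
single-threaded sketch. This is not Impala's code: SketchPool,
SketchService and both Register* functions are hypothetical names
introduced for illustration, and Offer() returns false where the real
ThreadPool::Offer() would hit the DCHECK on initialized_.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a work-queue thread pool; not Impala's ThreadPool.
class SketchPool {
 public:
  // Marks the pool ready. A real pool would also start its worker threads here.
  void Init() { initialized_ = true; }

  // Queues a task. Returns false if Init() has not run yet; the real code
  // instead fails a DCHECK on its initialized_ flag in debug builds.
  bool Offer(std::function<void()> task) {
    if (!initialized_) return false;
    queue_.push(std::move(task));
    return true;
  }

  std::size_t queued() const { return queue_.size(); }

 private:
  bool initialized_ = false;
  std::queue<std::function<void()>> queue_;
};

// Hypothetical stand-in for the service: once Start() has run, registration
// requests may arrive at any time, and each one offers work to the pool.
class SketchService {
 public:
  explicit SketchService(SketchPool* pool) : pool_(pool) {
    assert(pool != nullptr);
  }

  void Start() { accepting_ = true; }

  // Models a subscriber registration; returns whether the update was queued.
  bool RegisterSubscriber() {
    if (!accepting_) return false;
    return pool_->Offer([] { /* placeholder for sending a topic update */ });
  }

 private:
  SketchPool* pool_;
  bool accepting_ = false;
};

// Ordering before the patch: the server accepts requests before the pool is ready.
bool RegisterWithServerStartedFirst() {
  SketchPool pool;
  SketchService service(&pool);
  service.Start();
  bool queued = service.RegisterSubscriber();  // arrives before Init()
  pool.Init();
  return queued;
}

// Ordering after the patch: the pool is ready before any request can arrive.
bool RegisterWithPoolInitializedFirst() {
  SketchPool pool;
  SketchService service(&pool);
  pool.Init();
  service.Start();
  return service.RegisterSubscriber();
}
```

In this sketch, RegisterWithServerStartedFirst() fails to queue the
update, while RegisterWithPoolInitializedFirst() succeeds. In the real
server the early request arrives on a Thrift worker thread, so the
failure only occurs when a subscriber registers during the window
between server start and pool initialization.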

Testing:
 - Repeatedly ran custom_cluster/test_statestored_ha.py on a local machine
   and on Jenkins overnight without failure.
 - Passed core tests.

Change-Id: I91423f3de2d64cb617a06ea7adbe5ee2937bde66
Reviewed-on: http://gerrit.cloudera.org:8080/20775
Reviewed-by: Riza Suminto <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Statestore hit DCHECK failure in ThreadPool::Offer()
> ----------------------------------------------------
>
>                 Key: IMPALA-12617
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12617
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Wenzhe Zhou
>            Assignee: Wenzhe Zhou
>            Priority: Major
>             Fix For: Impala 4.4.0
>
>
> Statestore hit a DCHECK failure in ThreadPool::Offer() when running 
> custom_cluster/test_statestored_ha.py::TestStatestoredHA::test_statestored_manual_failover
> in GVO, with the following stack:
> {code:java}
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> E1210 22:03:20.429834 2103817 logging.cc:256] stderr will be logged to this file.
> F1210 22:03:20.438897 2103854 thread-pool.h:103] Check failed: initialized_
> *** Check failure stack trace: ***
>     @          0x38e9c5d  google::LogMessage::Fail()
>     @          0x38ebb94  google::LogMessage::SendToLog()
>     @          0x38e963c  google::LogMessage::Flush()
>     @          0x38ec0b9  google::LogMessageFatal::~LogMessageFatal()
>     @          0x18137f9  impala::Statestore::OfferUpdate()
>     @          0x1814c91  impala::Statestore::RegisterSubscriber()
>     @          0x1844606  StatestoreThriftIf::RegisterSubscriber()
>     @          0x184278e  impala::StatestoreServiceProcessorT<>::process_RegisterSubscriber()
>     @          0x18457fd  impala::StatestoreServiceProcessorT<>::dispatchCall()
>     @           0xf1cd25  apache::thrift::TDispatchProcessor::process()
>     @          0x13723a2  apache::thrift::server::TAcceptQueueServer::Task::run()
>     @          0x135ee3d  impala::ThriftThread::RunRunnable()
>     @          0x1360a65  boost::detail::function::void_function_obj_invoker0<>::invoke()
>     @          0x19bbe58  impala::Thread::SuperviseThread()
>     @          0x19bcc61  boost::detail::thread_data<>::run()
>     @          0x240bb67  thread_proxy
>     @     0x7f24994b1609  start_thread
>     @     0x7f24974cc133  clone
> {code}
> Statestore uses thread pools to periodically send catalog topic updates and 
> cluster membership. It adds sending tasks to the thread pool queues when it 
> receives registration requests from subscribers, so the thread pools must be 
> initialized before the Statestore's Thrift server starts accepting 
> registration requests.
> The current code calls ThreadPool::Init() after the Thrift server is started. 
> This can cause Statestore to hit a DCHECK failure in ThreadPool::Offer().


