[ 
https://issues.apache.org/jira/browse/IMPALA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672259#comment-16672259
 ] 

ASF subversion and git services commented on IMPALA-7241:
---------------------------------------------------------

Commit 5391100c7eeb33193de7861e761c3920f1d1eecc in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5391100 ]

IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC

This change converts the ReportExecStatus() RPC from a Thrift-based
RPC to KRPC. It is part of the preparation for fixing IMPALA-2990:
KRPC multiplexes RPCs over a TCP connection, which reduces the number
of TCP connections to the coordinator to one per executor and so
avoids overwhelming the coordinator with too many connections.

This patch also introduces a new service pool that will host all query
execution control RPCs in the future, so that control commands from
coordinators aren't blocked by long-running RPCs of the DataStream
service. To avoid unnecessary delays caused by sharing network
connections between the DataStream service and the Control service,
this change adds the service name to the user credentials of the
ConnectionId so that each service uses a separate connection.
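
As a rough illustration of why putting the service name into the connection
identity gives each service its own connection, here is a minimal sketch. It
is not KRPC's actual ConnectionId or connection cache: ConnectionKey,
ConnectionCache, and the integer "connection ids" are hypothetical names made
up for this example, and the only assumption is that connections are looked
up by a key that now includes the service name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>

// Hypothetical connection identity: remote address plus service name.
struct ConnectionKey {
  std::string remote_address;
  std::string service_name;

  // Order by (address, service) so the key can be used in a std::map.
  bool operator<(const ConnectionKey& other) const {
    return std::tie(remote_address, service_name) <
           std::tie(other.remote_address, other.service_name);
  }
};

// Hypothetical cache that hands out one connection per distinct key.
class ConnectionCache {
 public:
  // Returns the id of the connection for 'key', creating it on first use.
  int GetOrCreate(const ConnectionKey& key) {
    auto it = connections_.find(key);
    if (it != connections_.end()) return it->second;
    int id = next_id_++;
    connections_.emplace(key, id);
    return id;
  }

  std::size_t size() const { return connections_.size(); }

 private:
  std::map<ConnectionKey, int> connections_;
  int next_id_ = 0;
};
```

With a key like this, two services talking to the same remote host map to
two different entries, so a large transfer queued on one connection does not
sit in front of a control RPC on the other.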

The majority of this patch is a mechanical conversion of some Thrift
structures used in the ReportExecStatus() RPC to Protobuf. Note that
the runtime profile is still retained as a Thrift structure, as Impala
clients will still fetch query profiles using Thrift RPCs. This also
avoids duplicating the serialization implementation in both Thrift
and Protobuf for the runtime profile. The Thrift runtime profiles
are serialized and sent as a sidecar in the ReportExecStatus() RPC.

This patch also fixes IMPALA-7241, which may lead to duplicated
DML stats being applied. The fix adds a monotonically increasing
version number to each fragment instance's reports. The coordinator
ignores any report whose version is smaller than or equal to the
version of the last report.
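
The version check described above can be sketched roughly as follows. This
is an illustration only, not the code from the patch: InstanceReport,
InstanceState, and their fields are hypothetical names, and the sketch
assumes that each report carries cumulative counters, so a stale or
duplicated report (for example from an RPC retry) must be dropped rather
than applied.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one fragment instance's status report.
struct InstanceReport {
  int64_t version;          // increases monotonically per instance, starts at 1
  int64_t ranges_complete;  // cumulative number of completed scan ranges
};

// Hypothetical coordinator-side state for one fragment instance.
class InstanceState {
 public:
  // Applies 'report' and returns the progress delta. Returns 0 and leaves
  // the state untouched if the report is a duplicate or older than the
  // last one applied.
  int64_t Apply(const InstanceReport& report) {
    if (report.version <= last_version_) return 0;
    int64_t delta = report.ranges_complete - ranges_complete_;
    // Analogous in spirit to the "delta >= 0" check in progress-updater.cc.
    assert(delta >= 0);
    last_version_ = report.version;
    ranges_complete_ = report.ranges_complete;
    return delta;
  }

  int64_t ranges_complete() const { return ranges_complete_; }

 private:
  int64_t last_version_ = 0;
  int64_t ranges_complete_ = 0;
};
```

In this sketch, a report with version 2 that arrives after version 3 has
already been applied is simply ignored, so it can neither be counted twice
nor move the cumulative counter backwards.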

Testing done:
1. Exhaustive build.
2. Added some targeted test cases for profile serialization failure
   and RPC retries/timeout.

Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Reviewed-on: http://gerrit.cloudera.org:8080/10855
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 0)
> -----------------------------------------------------------
>
>                 Key: IMPALA-7241
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7241
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.1.0
>            Reporter: Michael Brown
>            Assignee: Michael Ho
>            Priority: Blocker
>              Labels: crash, stress
>
> During a stress test with 8 nodes running an insecure debug build based off 
> master, an impalad hit a DCHECK. The concurrency level at the time was 
> between 150 and 180 queries.
> The DCHECK was {{progress-updater.cc:43] Check failed: delta >= 0 (-3 vs. 
> 0)}}.
> The stack was:
> {noformat}
> #0  0x00007f6a73e811f7 in raise () from /lib64/libc.so.6
> #1  0x00007f6a73e828e8 in abort () from /lib64/libc.so.6
> #2  0x0000000004300e34 in google::DumpStackTraceAndExit() ()
> #3  0x00000000042f78ad in google::LogMessage::Fail() ()
> #4  0x00000000042f9152 in google::LogMessage::SendToLog() ()
> #5  0x00000000042f7287 in google::LogMessage::Flush() ()
> #6  0x00000000042fa84e in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x0000000001f7fedd in impala::ProgressUpdater::Update(long) ()
> #8  0x000000000313912b in 
> impala::Coordinator::BackendState::InstanceStats::Update(impala::TFragmentInstanceExecStatus
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #9  0x00000000031369e6 in 
> impala::Coordinator::BackendState::ApplyExecStatusReport(impala::TReportExecStatusParams
>  const&, impala::Coordinator::ExecSummary*, impala::ProgressUpdater*) ()
> #10 0x00000000031250b4 in 
> impala::Coordinator::UpdateBackendExecStatus(impala::TReportExecStatusParams 
> const&) ()
> #11 0x0000000001e86395 in 
> impala::ClientRequestState::UpdateBackendExecStatus(impala::TReportExecStatusParams
>  const&) ()
> #12 0x0000000001e27594 in 
> impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&, 
> impala::TReportExecStatusParams const&) ()
> #13 0x0000000001ebb8a0 in 
> impala::ImpalaInternalService::ReportExecStatus(impala::TReportExecStatusResult&,
>  impala::TReportExecStatusParams const&) ()
> #14 0x0000000002fa6f62 in 
> impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int, 
> apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, 
> void*) ()
> #15 0x0000000002fa6540 in 
> impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,
>  apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()
> #16 0x00000000018892b0 in 
> apache::thrift::TDispatchProcessor::process(boost::shared_ptr<apache::thrift::protocol::TProtocol>,
>  boost::shared_ptr<apache::thrift::protocol::TProtocol>, void*) ()
> #17 0x0000000001c80d1b in 
> apache::thrift::server::TAcceptQueueServer::Task::run() ()
> #18 0x0000000001c78ffb in 
> impala::ThriftThread::RunRunnable(boost::shared_ptr<apache::thrift::concurrency::Runnable>,
>  impala::Promise<unsigned long, (impala::PromiseMode)0>*) ()
> #19 0x0000000001c7a721 in boost::_mfi::mf2<void, impala::ThriftThread, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, 
> (impala::PromiseMode)0>*>::operator()(impala::ThriftThread*, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, (impala::PromiseMode)0>*) const ()
> #20 0x0000000001c7a5b7 in void 
> boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, 
> boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, 
> boost::_bi::value<impala::Promise<unsigned long, (impala::PromiseMode)0>*> 
> >::operator()<boost::_mfi::mf2<void, impala::ThriftThread, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, (impala::PromiseMode)0>*>, 
> boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf2<void, 
> impala::ThriftThread, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, (impala::PromiseMode)0>*>&, 
> boost::_bi::list0&, int) ()
> #21 0x0000000001c7a303 in boost::_bi::bind_t<void, boost::_mfi::mf2<void, 
> impala::ThriftThread, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, (impala::PromiseMode)0>*>, 
> boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, 
> boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, 
> boost::_bi::value<impala::Promise<unsigned long, (impala::PromiseMode)0>*> > 
> >::operator()() ()
> #22 0x0000000001c7a216 in 
> boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, 
> boost::_mfi::mf2<void, impala::ThriftThread, 
> boost::shared_ptr<apache::thrift::concurrency::Runnable>, 
> impala::Promise<unsigned long, (impala::PromiseMode)0>*>, 
> boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, 
> boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, 
> boost::_bi::value<impala::Promise<unsigned long, (impala::PromiseMode)0>*> > 
> >, void>::invoke(boost::detail::function::function_buffer&) ()
> #23 0x0000000001bbb81c in boost::function0<void>::operator()() const ()
> #24 0x0000000001fb6eaf in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
> impala::Promise<long, (impala::PromiseMode)0>*) ()
> #25 0x0000000001fbef87 in void 
> boost::_bi::list5<boost::_bi::value<std::string>, 
> boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
> boost::_bi::value<impala::ThreadDebugInfo*>, 
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> 
> >::operator()<void (*)(std::string const&, std::string const&, 
> boost::function<void ()>, impala::ThreadDebugInfo const*, 
> impala::Promise<long, (impala::PromiseMode)0>*), 
> boost::_bi::list0>(boost::_bi::type<void>, void (*&)(std::string const&, 
> std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
> impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list0&, int) ()
> #26 0x0000000001fbeeab in boost::_bi::bind_t<void, void (*)(std::string 
> const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo 
> const*, impala::Promise<long, (impala::PromiseMode)0>*), 
> boost::_bi::list5<boost::_bi::value<std::string>, 
> boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
> boost::_bi::value<impala::ThreadDebugInfo*>, 
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > 
> >::operator()() ()
> #27 0x0000000001fbee6e in boost::detail::thread_data<boost::_bi::bind_t<void, 
> void (*)(std::string const&, std::string const&, boost::function<void ()>, 
> impala::ThreadDebugInfo const*, impala::Promise<long, 
> (impala::PromiseMode)0>*), boost::_bi::list5<boost::_bi::value<std::string>, 
> boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
> boost::_bi::value<impala::ThreadDebugInfo*>, 
> boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > > 
> >::run() ()
> #28 0x0000000003222b0a in thread_proxy ()
> #29 0x00007f6a74216e25 in start_thread () from /lib64/libpthread.so.0
> #30 0x00007f6a73f4434d in clone () from /lib64/libc.so.6
> {noformat}



