[ 
https://issues.apache.org/jira/browse/MESOS-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843632#comment-15843632
 ] 

James Peach commented on MESOS-7017:
------------------------------------

Here's the partial stack trace:

{noformat}
#2  0x00007fb830734696 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x00007fb83072c08d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x00007fb83072de1d in google::LogMessage::SendToLog (this=Unhandled dwarf 
expression opcode 0xf3
) at src/logging.cc:1412
#5  0x00007fb83072bc12 in google::LogMessage::Flush (this=0x7fb8227f3890) at 
src/logging.cc:1281
#6  0x00007fb83072e7f9 in google::LogMessageFatal::~LogMessageFatal 
(this=Unhandled dwarf expression opcode 0xf3
) at src/logging.cc:1984
#7  0x00007fb82fb35113 in evolve<mesos::v1::master::Response> (response=...) at 
../../src/internal/evolve.cpp:63
#8  mesos::internal::evolve (response=...) at ../../src/internal/evolve.cpp:218
#9  0x00007fb82fba8dd6 in mesos::internal::master::Master::Http::<lambda(const 
std::tuple<process::Owned<mesos::ObjectApprover>, 
process::Owned<mesos::ObjectApprover> >&)>::operator()(const 
std::tuple<process::Owned<mesos::ObjectApprover>, 
process::Owned<mesos::ObjectApprover> > &) const (__closure=0x7fb720c7a940, 
approvers=Unhandled dwarf expression opcode 0xf3
)
    at ../../src/master/http.cpp:3772
#10 0x00007fb82fba9068 in operator() (__functor=Unhandled dwarf expression 
opcode 0xf3
) at ../../3rdparty/libprocess/include/process/deferred.hpp:225
#11 std::_Function_handler<process::Future<process::http::Response>(), 
process::_Deferred<G>::operator std::function<R(P0)>() const::<lambda(P0)> 
[with R = process::Future<process::http::Response>; P0 = const 
std::tuple<process::Owned<mesos::ObjectApprover>, 
process::Owned<mesos::ObjectApprover> >&; F = 
mesos::internal::master::Master::Http::getTasks(const mesos::master::Call&, 
const Option<std::basic_string<char> >&, mesos::ContentType) 
const::<lambda(const std::tuple<process::Owned<mesos::ObjectApprover>, 
process::Owned<mesos::ObjectApprover> >&)>]::<lambda()> >::_M_invoke(const 
std::_Any_data &) (__functor=Unhandled dwarf expression opcode 0xf3
) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2025
#12 0x00007fb82faf73b3 in operator() (__functor=Unhandled dwarf expression 
opcode 0xf3
) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
#13 operator() (__functor=Unhandled dwarf expression opcode 0xf3
) at ../../3rdparty/libprocess/include/process/dispatch.hpp:112
#14 std::_Function_handler<void(process::ProcessBase*), 
process::internal::Dispatch<process::Future<T> >::operator()(const 
process::UPID&, F&&) [with F = 
std::function<process::Future<process::http::Response>()>&; R = 
process::http::Response]::<lambda(process::ProcessBase*)> >::_M_invoke(const 
std::_Any_data &, process::ProcessBase *) (__functor=Unhandled dwarf expression 
opcode 0xf3
)
    at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
{noformat}

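So the proximate cause of the crash is that {{evolve}} round-trips the response 
through serialization: it serializes the internal message to a string and then 
parses that string back into the v1 type. Once the serialized message exceeds 
the protobuf size limit the parse fails, which trips the {{CHECK}} at 
evolve.cpp:63. For large messages this causes two unnecessary large 
allocations even if it doesn't trigger the {{CHECK}}.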
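
As a rough illustration of the failure mode (not the actual Mesos code), the 
round trip can be modelled with plain strings. The names {{InternalResponse}}, 
{{V1Response}}, {{serialize}}, {{parse}} and {{evolveSketch}} are hypothetical 
stand-ins; only the 67108864 byte limit comes from the log in the issue 
description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Limit quoted in the libprotobuf error in the issue description: 64 MiB.
const std::size_t kTotalBytesLimit = 67108864;

// Hypothetical stand-ins for mesos::master::Response and
// mesos::v1::master::Response; only the payload size matters here.
struct InternalResponse { std::string payload; };
struct V1Response { std::string payload; };

// Stand-in for serializing the internal message: first full copy.
std::string serialize(const InternalResponse& response) {
  return response.payload;
}

// Stand-in for ParsePartialFromString(): rejects input over the limit,
// otherwise makes the second full copy.
bool parse(const std::string& data, std::size_t limit, V1Response* out) {
  if (data.size() > limit) {
    return false;
  }
  out->payload = data;
  return true;
}

// Models the round trip. The real evolve() wraps the parse in a CHECK,
// so a `false` result here corresponds to the master aborting.
bool evolveSketch(const InternalResponse& response, V1Response* out,
                  std::size_t limit = kTotalBytesLimit) {
  const std::string data = serialize(response);
  return parse(data, limit, out);
}
```

Even when {{evolveSketch}} succeeds, the payload is copied twice (once into 
{{data}}, once into {{out}}), which is the allocation cost noted above.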

> HTTP API responses can crash the master.
> ----------------------------------------
>
>                 Key: MESOS-7017
>                 URL: https://issues.apache.org/jira/browse/MESOS-7017
>             Project: Mesos
>          Issue Type: Bug
>          Components: HTTP API
>            Reporter: James Peach
>            Priority: Critical
>
> The master can crash when generating large responses to small API requests. 
> One manifestation of this is querying the tasks.
> {noformat}
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:180] A protocol message 
> was rejected because it was too big (more than 67108864 bytes).  To increase 
> the limit (or to disable these warnings), see 
> CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
> F0126 18:34:18.790386 26230 evolve.cpp:63] Check failed: 
> t.ParsePartialFromString(data) Failed to parse mesos.v1.master.Response while 
> evolving from mesos.master.Response
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
