> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > >
Great review, thanks for the suggestions Andrei! > On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > > src/logging/logging.cpp > > Lines 115 (patched) > > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line115> > > > > Do I understand correctly that all the exception filter code tries to > > avoid dynamic allocation at all costs? > > > > Maybe leave a NOTE? Or is it obvious? Yes I tried to for the same reason as on POSIX (it's not async-signal-safe). I imagined some windows exceptions could arrive when a memory allocation lock is held (e.g. interrupt or user sent signal). I'll write this down. > On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > > src/logging/logging.cpp > > Lines 117 (patched) > > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line117> > > > > Wrapped link (here and below) can be hard to copy into browser. > > > > Not sure what our styleguide says; checked other parts of Mesos > > codebase - links like this are typically kept on a single line with `// > > NOLINT` at the end. Yeah, I've personally found it cleaner to wrap them when I've added links, and I don't think it's a big burden to open them (especially since github and many IDEs don't seem to make them clickable), but I can use the NOLINT approach. > On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > > src/logging/logging.cpp > > Lines 234 (patched) > > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line234> > > > > Do I understand correctly that each thread has its own unhandled > > exception filter (which we reset for all threads with one call of > > `SetUnhandledExceptionFilter`) > > > > > > https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter > > > > but the functions of `dbghelp.h` that we use are not thread-safe? > > > > https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetlinefromaddr64 > > > > Doesn't this mean that we need to synchronize access to `dbghelp` or, > > at least, to make sure that concurrent run is dropped? > > > > Given the nature of Mesos, I would not be surprised if we will > > eventually run into a pair of simultaneous exceptions (probably triggered > > by a common cause) in two threads. > > > > Having only one stack of two is probably tolerable, but we would want > > to avoid getting garbadge/triggering UB at all costs. Good find! > On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > > src/logging/logging.cpp > > Lines 308 (patched) > > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line308> > > > > In the not very likely case of a very deep stack, can we somehow > > explicitly inform the reader that the trace was clipped due to this > > limitation? Good suggestion, thanks! > On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote: > > src/logging/logging.cpp > > Lines 449 (patched) > > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line449> > > > > This function returns the previously installed exception filter, if > > there was one; do you have any idea whether we need to care of this or not, > > and why? > > > > I would naively expect that there should be no handler initially. > > Probably, if the returned value is not NULL, this means that we don't > > know what is going on (i.e. what 3rd-party library, after we updated it, > > started to install its own filter before us) and should just crash here? Since we don't have a notion of disabling the failure handling, I didn't add any preserve / restore code. Tried adding a check, turns out it does not return NULL, not sure what is getting set up here (maybe a libc thing for mapping to signals?) - Benjamin ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72239/#review219968 ----------------------------------------------------------- On March 17, 2020, 12:09 a.m., Benjamin Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72239/ > ----------------------------------------------------------- > > (Updated March 17, 2020, 12:09 a.m.) > > > Review request for mesos, Andrei Sekretenko, Greg Mann, and Joseph Wu. > > > Bugs: MESOS-7245 > https://issues.apache.org/jira/browse/MESOS-7245 > > > Repository: mesos > > > Description > ------- > > For now, this starts with adding the necessary code to logging.cpp up > in mesos. This gives us stack traces for any mesos binaries or > libraries that initialize logging with `installFailureSignalHandler` > set to true. > > Ideally, we would either (1) move this into stout, or (2) upstream > some changes to glog. Both approaches would allow us to also get > stack traces from stout tests and libprocess tests. > > For now however, this patch is just aiming to fix the urgent issue > of a lack of stack traces for mesos crashes. > > > Diffs > ----- > > src/logging/logging.cpp baca51b3956dbaa282c03e73a2c261ff103f4821 > > > Diff: https://reviews.apache.org/r/72239/diff/3/ > > > Testing > ------- > > Ran through CI and locally, for the recent reservation update crash this > produces: > > ``` > E0000 00:00:00.000000 18216 logging.cpp:295] RAW: Received fatal exception > 0xc0000005 EXCEPTION_ACCESS_VIOLATION > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > google::protobuf::internal::RepeatedPtrIterator<mesos::Resource_ReservationInfo > const >::operator* [00007FF75279306E+14] > (c:\users\bmahler\git\mesos\build\3rdparty\protobuf-3.5.0\src\protobuf-3.5.0\src\google\protobuf\repeated_field.h:2266) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::authorization::ActionObject::reserve [00007FF753C94880+720] > (c:\users\bmahler\git\mesos\src\master\authorization.cpp:236) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::_reserve [00007FF753D007CF+1423] > (c:\users\bmahler\git\mesos\src\master\http.cpp:1960) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::reserveResources > [00007FF753D07FFC+380] (c:\users\bmahler\git\mesos\src\master\http.cpp:1984) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > mesos::internal::master::Master::Http::api [00007FF753CED99B+6571] > (c:\users\bmahler\git\mesos\src\master\http.cpp:352) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_339a9fa0c2c868bcf197f8c97b63146d>::operator() [00007FF7527AA2E4+100] > (c:\users\bmahler\git\mesos\src\master\master.cpp:909) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF75262D61A+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF7526E55AA+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_ret<process::Future<process::http::Response>,0>::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d> > &,process::http::Request const > &,Option<process::http::authentication::Principal> const &> > [00007FF75262D55A+122] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Func_impl_no_alloc<<lambda_339a9fa0c2c868bcf197f8c97b63146d>,process::Future<process::http::Response>,process::http::Request > const &,Option<process::http::authentication::Principal> const &>::_Do_call > [00007FF7527EDF2F+111] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Func_class<process::Future<process::http::Response>,process::http::Request > const &,Option<process::http::authentication::Principal> const > &>::operator() [00007FF75472BAAD+157] (c:\program files (x86)\microsoft > visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_23c733dc7debde2255779e987bee5763>::operator() [00007FF75471911F+335] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3874) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_23c733dc7debde2255779e987bee5763>,bool> > [00007FF754638F1F+95] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_23c733dc7debde2255779e987bee5763>,bool> > [00007FF75469AB2E+94] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::invoke_expand<<lambda_23c733dc7debde2255779e987bee5763>,std::tuple<bool>,std::tuple<>,0> > [00007FF7546A2B9A+138] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::operator()<> > [00007FF754620DCF+127] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75463D082+66] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75469EBF2+66] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Invoke<process::Future<process::http::Response> > >::operator()<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > > [00007FF75462A637+71] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:387) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>::CallableFn<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool> > >::operator() [00007FF7547261CF+95] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:463) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>::operator() [00007FF752521D5B+267] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:443) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_1161592621caf1e03e3f58e967ac7c4e>::operator() [00007FF75251C5CD+77] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\dispatch.hpp:119) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,process::ProcessBase *> [00007FF7524DA748+152] (c:\program > files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,process::ProcessBase *> [00007FF7524F2AC4+132] (c:\program > files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> > >::invoke_expand<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::tuple<std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,std::tuple<process::ProcessBase * &&>,0,1,2> > [00007FF7524F5EE9+249] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >::operator()<process::ProcessBase *> > [00007FF7524D413A+122] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524DC40C+76] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524F46BC+76] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::internal::Invoke<void>::operator()<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524D6031+81] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:399) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<void __cdecl(process::ProcessBase > *)>::CallableFn<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response> > > >,lambda::CallableOnce<process::Future<process::http::Response> > __cdecl(void)>,std::_Ph<1> > >::operator() [00007FF752520296+102] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:464) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > lambda::CallableOnce<void __cdecl(process::ProcessBase *)>::operator() > [00007FF75472B83D+285] > (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:444) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessBase::consume [00007FF7545D8439+73] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3666) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::DispatchEvent::consume [00007FF7547890EA+74] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\event.hpp:200) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessBase::serve [00007FF75064AB77+71] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\process.hpp:88) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > process::ProcessManager::resume [00007FF7545E6260+1696] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3055) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > <lambda_124422ac022fa041208b80c1460630d7>::operator() [00007FF754718541+145] > (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:2545) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_Invoker_functor::_Call<<lambda_124422ac022fa041208b80c1460630d7> > > [00007FF754638A20+48] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::invoke<<lambda_124422ac022fa041208b80c1460630d7> > [00007FF75469A650+48] > (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Execute<0> [00007FF75464CE2C+60] (c:\program files (x86)\microsoft > visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:239) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Run [00007FF75476E6AA+106] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:245) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > > > > >::_Go [00007FF754759698+40] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:231) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: std::_Pad::_Call_func > [00007FF75473EA3D+45] (c:\program files (x86)\microsoft visual > studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:209) > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > register_onexit_function [00007FFA6B384FB8+1192] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: > register_onexit_function [00007FFA6B384BF1+225] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: BaseThreadInitThunk > [00007FFA7F687974+20] > E0000 00:00:00.000000 18216 logging.cpp:320] RAW: RtlUserThreadStart > [00007FFA81F3A261+33] > ``` > > > Thanks, > > Benjamin Mahler > >
