> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> >

Great review, thanks for the suggestions Andrei!


> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> > src/logging/logging.cpp
> > Lines 115 (patched)
> > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line115>
> >
> >     Do I understand correctly that all the exception filter code tries to 
> > avoid dynamic allocation at all costs? 
> >     
> >     Maybe  leave a NOTE? Or is it obvious?

Yes I tried to for the same reason as on POSIX (it's not async-signal-safe). I 
imagined some windows exceptions could arrive when a memory allocation lock is 
held (e.g. interrupt or user sent signal). I'll write this down.


> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> > src/logging/logging.cpp
> > Lines 117 (patched)
> > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line117>
> >
> >     Wrapped link (here and below) can be hard to copy into browser.
> >     
> >     Not sure what our styleguide says; checked other parts of Mesos 
> > codebase - links like this are typically kept on a single line with `// 
> > NOLINT` at the end.

Yeah, I've personally found it cleaner to wrap them when I've added links, and 
I don't think it's a big burden to open them (especially since github and many 
IDEs don't seem to make them clickable), but I can use the NOLINT approach.


> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> > src/logging/logging.cpp
> > Lines 234 (patched)
> > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line234>
> >
> >     Do I understand correctly that each thread has its own unhandled 
> > exception filter (which we reset for all threads with one call of 
> > `SetUnhandledExceptionFilter`)
> >     
> >     
> > https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
> >     
> >     but the functions of `dbghelp.h` that we use are not thread-safe?
> >     
> > https://docs.microsoft.com/en-us/windows/win32/api/dbghelp/nf-dbghelp-symgetlinefromaddr64
> >     
> >     Doesn't this mean that we need to synchronize access to `dbghelp` or, 
> > at least, to make sure that concurrent run is dropped?
> >     
> >     Given the nature of Mesos, I would not be surprised if we will 
> > eventually run into a pair of simultaneous exceptions (probably triggered 
> > by a common cause) in two threads. 
> >     
> >     Having only one stack of two is probably tolerable, but we would want 
> > to avoid getting garbadge/triggering UB at all costs.

Good find!


> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> > src/logging/logging.cpp
> > Lines 308 (patched)
> > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line308>
> >
> >     In the not very likely case of a very deep stack, can we somehow 
> > explicitly inform the reader that the trace was clipped due to this 
> > limitation?

Good suggestion, thanks!


> On March 20, 2020, 12:54 p.m., Andrei Sekretenko wrote:
> > src/logging/logging.cpp
> > Lines 449 (patched)
> > <https://reviews.apache.org/r/72239/diff/3/?file=2214099#file2214099line449>
> >
> >     This function returns the previously installed exception filter, if 
> > there was one; do you have any idea whether we need to care of this or not, 
> > and why?
> >     
> >     I would naively expect that there should be no handler initially. 
> >     Probably, if the returned value is not NULL, this means that we don't 
> > know what is going on (i.e. what 3rd-party library, after we updated it, 
> > started to install its own filter before us) and should just crash here?

Since we don't have a notion of disabling the failure handling, I didn't add 
any preserve / restore code.

Tried adding a check, turns out it does not return NULL, not sure what is 
getting set up here (maybe a libc thing for mapping to signals?)


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72239/#review219968
-----------------------------------------------------------


On March 17, 2020, 12:09 a.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72239/
> -----------------------------------------------------------
> 
> (Updated March 17, 2020, 12:09 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Joseph Wu.
> 
> 
> Bugs: MESOS-7245
>     https://issues.apache.org/jira/browse/MESOS-7245
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> For now, this starts with adding the necessary code to logging.cpp up
> in mesos. This gives us stack traces for any mesos binaries or
> libraries that initialize logging with `installFailureSignalHandler`
> set to true.
> 
> Ideally, we would either (1) move this into stout, or (2) upstream
> some changes to glog. Both approaches would allow us to also get
> stack traces from stout tests and libprocess tests.
> 
> For now however, this patch is just aiming to fix the urgent issue
> of a lack of stack traces for mesos crashes.
> 
> 
> Diffs
> -----
> 
>   src/logging/logging.cpp baca51b3956dbaa282c03e73a2c261ff103f4821 
> 
> 
> Diff: https://reviews.apache.org/r/72239/diff/3/
> 
> 
> Testing
> -------
> 
> Ran through CI and locally, for the recent reservation update crash this 
> produces:
> 
> ```
> E0000 00:00:00.000000 18216 logging.cpp:295] RAW: Received fatal exception 
> 0xc0000005 EXCEPTION_ACCESS_VIOLATION
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> google::protobuf::internal::RepeatedPtrIterator<mesos::Resource_ReservationInfo
>  const >::operator* [00007FF75279306E+14] 
> (c:\users\bmahler\git\mesos\build\3rdparty\protobuf-3.5.0\src\protobuf-3.5.0\src\google\protobuf\repeated_field.h:2266)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> mesos::authorization::ActionObject::reserve [00007FF753C94880+720] 
> (c:\users\bmahler\git\mesos\src\master\authorization.cpp:236)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> mesos::internal::master::Master::Http::_reserve [00007FF753D007CF+1423] 
> (c:\users\bmahler\git\mesos\src\master\http.cpp:1960)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> mesos::internal::master::Master::Http::reserveResources 
> [00007FF753D07FFC+380] (c:\users\bmahler\git\mesos\src\master\http.cpp:1984)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> mesos::internal::master::Master::Http::api [00007FF753CED99B+6571] 
> (c:\users\bmahler\git\mesos\src\master\http.cpp:352)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> <lambda_339a9fa0c2c868bcf197f8c97b63146d>::operator() [00007FF7527AA2E4+100] 
> (c:\users\bmahler\git\mesos\src\master\master.cpp:909)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d> 
> &,process::http::Request const 
> &,Option<process::http::authentication::Principal> const &> 
> [00007FF75262D61A+122] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<<lambda_339a9fa0c2c868bcf197f8c97b63146d> 
> &,process::http::Request const 
> &,Option<process::http::authentication::Principal> const &> 
> [00007FF7526E55AA+122] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_ret<process::Future<process::http::Response>,0>::_Call<<lambda_339a9fa0c2c868bcf197f8c97b63146d>
>  &,process::http::Request const 
> &,Option<process::http::authentication::Principal> const &> 
> [00007FF75262D55A+122] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Func_impl_no_alloc<<lambda_339a9fa0c2c868bcf197f8c97b63146d>,process::Future<process::http::Response>,process::http::Request
>  const &,Option<process::http::authentication::Principal> const &>::_Do_call 
> [00007FF7527EDF2F+111] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Func_class<process::Future<process::http::Response>,process::http::Request
>  const &,Option<process::http::authentication::Principal> const 
> &>::operator() [00007FF75472BAAD+157] (c:\program files (x86)\microsoft 
> visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\functional:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> <lambda_23c733dc7debde2255779e987bee5763>::operator() [00007FF75471911F+335] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3874)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<<lambda_23c733dc7debde2255779e987bee5763>,bool> 
> [00007FF754638F1F+95] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<<lambda_23c733dc7debde2255779e987bee5763>,bool> 
> [00007FF75469AB2E+94] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::invoke_expand<<lambda_23c733dc7debde2255779e987bee5763>,std::tuple<bool>,std::tuple<>,0>
>  [00007FF7546A2B9A+138] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>::operator()<>
>  [00007FF754620DCF+127] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>
>  > [00007FF75463D082+66] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>
>  > [00007FF75469EBF2+66] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Invoke<process::Future<process::http::Response> 
> >::operator()<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>
>  > [00007FF75462A637+71] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:387)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>::CallableFn<lambda::internal::Partial<<lambda_23c733dc7debde2255779e987bee5763>,bool>
>  >::operator() [00007FF7547261CF+95] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:463)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>::operator() [00007FF752521D5B+267] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:443)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> <lambda_1161592621caf1e03e3f58e967ac7c4e>::operator() [00007FF75251C5CD+77] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\dispatch.hpp:119)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,process::ProcessBase *> [00007FF7524DA748+152] (c:\program 
> files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,process::ProcessBase *> [00007FF7524F2AC4+132] (c:\program 
> files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> 
> >::invoke_expand<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::tuple<std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> >,std::tuple<process::ProcessBase * &&>,0,1,2> 
> [00007FF7524F5EE9+249] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:292)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> >::operator()<process::ProcessBase *> 
> [00007FF7524D413A+122] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:331)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524DC40C+76] 
> (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524F46BC+76] 
> (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::internal::Invoke<void>::operator()<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> >,process::ProcessBase *> [00007FF7524D6031+81] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:399)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::CallableOnce<void __cdecl(process::ProcessBase 
> *)>::CallableFn<lambda::internal::Partial<<lambda_1161592621caf1e03e3f58e967ac7c4e>,std::unique_ptr<process::Promise<process::http::Response>,std::default_delete<process::Promise<process::http::Response>
>  > >,lambda::CallableOnce<process::Future<process::http::Response> 
> __cdecl(void)>,std::_Ph<1> > >::operator() [00007FF752520296+102] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:464)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> lambda::CallableOnce<void __cdecl(process::ProcessBase *)>::operator() 
> [00007FF75472B83D+285] 
> (c:\users\bmahler\git\mesos\3rdparty\stout\include\stout\lambda.hpp:444)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> process::ProcessBase::consume [00007FF7545D8439+73] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3666)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> process::DispatchEvent::consume [00007FF7547890EA+74] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\event.hpp:200)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> process::ProcessBase::serve [00007FF75064AB77+71] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\include\process\process.hpp:88)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> process::ProcessManager::resume [00007FF7545E6260+1696] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:3055)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> <lambda_124422ac022fa041208b80c1460630d7>::operator() [00007FF754718541+145] 
> (c:\users\bmahler\git\mesos\3rdparty\libprocess\src\process.cpp:2545)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_Invoker_functor::_Call<<lambda_124422ac022fa041208b80c1460630d7> > 
> [00007FF754638A20+48] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::invoke<<lambda_124422ac022fa041208b80c1460630d7> > [00007FF75469A650+48] 
> (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\type_traits:16707566)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7>
>  >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > 
> > > >::_Execute<0> [00007FF75464CE2C+60] (c:\program files (x86)\microsoft 
> visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:239)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7>
>  >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > 
> > > >::_Run [00007FF75476E6AA+106] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:245)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_124422ac022fa041208b80c1460630d7>
>  >,std::default_delete<std::tuple<<lambda_124422ac022fa041208b80c1460630d7> > 
> > > >::_Go [00007FF754759698+40] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:231)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       std::_Pad::_Call_func 
> [00007FF75473EA3D+45] (c:\program files (x86)\microsoft visual 
> studio\2017\community\vc\tools\msvc\14.16.27023\include\thr\xthread:209)
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> register_onexit_function [00007FFA6B384FB8+1192]
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       
> register_onexit_function [00007FFA6B384BF1+225]
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       BaseThreadInitThunk 
> [00007FFA7F687974+20]
> E0000 00:00:00.000000 18216 logging.cpp:320] RAW:       RtlUserThreadStart 
> [00007FFA81F3A261+33]
> ```
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>

Reply via email to