cf-natali opened a new pull request #377:
URL: https://github.com/apache/mesos/pull/377


   This fixes https://issues.apache.org/jira/browse/MESOS-10203
   
   There are actually 3 problems:
   1. The agent shouldn't crash when the kernel supports capabilities that the 
code doesn't know about; it should report the error cleanly instead. This is 
fixed by 
https://github.com/apache/mesos/commit/8f0ce240e65de0583ecadd60c206a000aa556a68
   
   Before:
   ```
   Reached unreachable statement at ../../src/linux/capabilities.cpp:497
   *** Aborted at 1611359313 (unix time) try "date -d @1611359313" if you are using GNU date ***
   PC: @     0x7f108cd057bb gsignal
   *** SIGABRT (@0x4332) received by PID 17202 (TID 0x7f108b587000) from PID 17202; stack trace: ***
       @     0x7f108cea1730 (unknown)
       @     0x7f108cd057bb gsignal
       @     0x7f108ccf0535 abort
       @     0x55ba1667870b Unreachable()
       @     0x7f109976e49a mesos::internal::capabilities::operator<<()
       @     0x7f109976f1ab stringify<>()
       @     0x7f109976c7b2 mesos::internal::capabilities::Capabilities::create()
       @     0x7f1099932ab8 mesos::internal::slave::LinuxCapabilitiesIsolatorProcess::create()
       @     0x7f1098e68caa std::_Function_handler<>::_M_invoke()
       @     0x7f1098e59d32 std::function<>::operator()()
       @     0x7f1098e26c13 mesos::internal::slave::MesosContainerizer::create()
       @     0x7f1098cf9494 mesos::internal::slave::Containerizer::create()
       @     0x55ba166743ee main
       @     0x7f108ccf209b __libc_start_main
       @     0x55ba16670eda _start
   ```
   
   After:
   ```
   E0122 23:54:23.660285 21366 main.cpp:610] EXIT with status 1: Failed to create a containerizer: Could not create MesosContainerizer: Failed to create isolator 'linux/capabilities': Failed to initialize capabilities: System last capability value '37' is greater than maximum supported number of capabilities '37'
   ```
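
One way to get the "After" behavior is to check the kernel's last capability value before anything tries to print it, and to return an error rather than reach an unreachable branch. A minimal sketch of that idea follows. The helper `validateLastCapability` and the constant `kMaxSupportedCapability` are hypothetical names chosen for illustration and are not taken from the Mesos source, which uses its own error types.

```cpp
#include <optional>
#include <string>

// Hypothetical: the highest capability value this code can name.
// The real limit lives in the Mesos capability definitions.
constexpr int kMaxSupportedCapability = 37;

// Returns an error message when the kernel reports a capability value
// that is newer than anything this code knows; std::nullopt otherwise.
std::optional<std::string> validateLastCapability(int systemLastCap) {
  if (systemLastCap > kMaxSupportedCapability) {
    return "System last capability value '" + std::to_string(systemLastCap) +
           "' is greater than the maximum supported capability '" +
           std::to_string(kMaxSupportedCapability) + "'";
  }
  return std::nullopt;
}
```

A caller would run this check first and propagate the message up to the place that creates the isolator, so the process can exit with a readable error.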
   
   2. Add support for the new `CAP_PERFMON`, `CAP_BPF` and 
`CAP_CHECKPOINT_RESTORE` capabilities: 
https://github.com/apache/mesos/commit/95c5f217bf8281ce7f1360e7f46a9f9e66b862e0
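
For reference, the three capabilities have the values 38, 39 and 40 in the Linux UAPI header `<linux/capability.h>`. The sketch below shows a name lookup for them. The enum `NewCapability` and the function `newCapabilityName` are hypothetical and only illustrate the mapping; the commit above extends the existing Mesos definitions instead.

```cpp
#include <optional>
#include <string>

// Numeric values as defined in <linux/capability.h>.
// The enum itself is a hypothetical stand-in, not the Mesos type.
enum NewCapability : int {
  PERFMON = 38,
  BPF = 39,
  CHECKPOINT_RESTORE = 40
};

// Maps a raw capability value to its name, or std::nullopt when the
// value is not one of the three capabilities listed above.
std::optional<std::string> newCapabilityName(int value) {
  switch (value) {
    case PERFMON:            return std::string("CAP_PERFMON");
    case BPF:                return std::string("CAP_BPF");
    case CHECKPOINT_RESTORE: return std::string("CAP_CHECKPOINT_RESTORE");
    default:                 return std::nullopt;
  }
}
```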
   
   3. It would probably be better if the agent didn't error out just because 
the kernel supports some new capabilities. From a cursory look at the code, I 
think this would be safe to change; however, I'm not familiar enough with this 
code to make that call, so I have left it as is for now.
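
As a rough illustration of what point 3 could look like, the sketch below ignores capability values the code cannot name instead of failing. This is not what the PR does, and whether dropping unknown values is safe for the isolator is exactly the open question. `usableCapabilities` and `kMaxKnownCapability` are hypothetical names.

```cpp
#include <set>

// Hypothetical: highest capability value the code can name
// (40 is CAP_CHECKPOINT_RESTORE in <linux/capability.h>).
constexpr int kMaxKnownCapability = 40;

// Returns the capability values in [0, systemLastCap] that the code
// knows about, silently skipping any newer ones reported by the kernel.
std::set<int> usableCapabilities(int systemLastCap) {
  std::set<int> result;
  for (int cap = 0; cap <= systemLastCap; ++cap) {
    if (cap <= kMaxKnownCapability) {
      result.insert(cap);
    }
  }
  return result;
}
```

With this approach a newer kernel would only lose access to capabilities the agent cannot express, at the cost of hiding the mismatch unless it is also logged.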
   
   @jpeach @bmahler 



