[ 
https://issues.apache.org/jira/browse/MESOS-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863104#comment-15863104
 ] 

Till Toenshoff commented on MESOS-2842:
---------------------------------------

This is what the master log looks like when we run into this issue:

{noformat}
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 
01:38:28.419044  2809 master.cpp:2783] Subscribing framework integration_test 
with checkpointing enabled and capabilities [ PARTITION_AWARE ]
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 
01:38:28.419072  2809 master.cpp:2861] Updating info for framework 
6aec32bf-cd60-4fa1-9992-f35af104f423-0009
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: W0213 
01:38:28.419083  2809 master.hpp:2486] Cannot update FrameworkInfo.role to '*' 
for framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009. Check MESOS-703
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: W0213 
01:38:28.419091  2809 master.hpp:2497] Cannot update FrameworkInfo.principal to 
'alice' for framework 6aec32bf-cd60-4fa1-9992-f35af104f423-0009. Check MESOS-703
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 
01:38:28.419111  2809 master.cpp:2874] Framework 
6aec32bf-cd60-4fa1-9992-f35af104f423-0009 (integration_test) at 
scheduler-188c0a58-9b44-4e2b-b133-a7c15b37fc55@127.0.0.1:41805 failed over
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 
01:38:28.419245  2809 hierarchical.cpp:358] Activated framework 
6aec32bf-cd60-4fa1-9992-f35af104f423-0009
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: I0213 
01:38:28.419543  2809 master.cpp:6664] Sending 1 offers to framework 
6aec32bf-cd60-4fa1-9992-f35af104f423-0009 (integration_test) at 
scheduler-7fff5d25-a121-48bf-8849-1948b161d729@127.0.0.1:46530
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: F0213 
01:38:28.426944  2809 master.cpp:1446] Check failed: 
metrics->frameworks.contains(principal.get())
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: *** 
Check failure stack trace: ***
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678b831ad  google::LogMessage::Fail()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678b84fdd  google::LogMessage::SendToLog()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678b82d9c  google::LogMessage::Flush()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678b858d9  google::LogMessageFatal::~LogMessageFatal()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb6780453dd  mesos::internal::master::Master::visit()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678af7ca1  process::ProcessManager::resume()
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb678b00ba7  
_ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb676f90230  (unknown)
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb6767aedc5  start_thread
Feb 13 01:38:28 test-277bcd0b-fe0e-468a-a9b5-ee624538ac4b mesos-master: @     
0x7fb6764dd73d  __clone
{noformat}

> Update FrameworkInfo.principal on framework re-registration
> -----------------------------------------------------------
>
>                 Key: MESOS-2842
>                 URL: https://issues.apache.org/jira/browse/MESOS-2842
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>              Labels: security
>
> From the design doc:
> This is a bit involved because ‘principal’ is used for authentication and 
> rate limiting.
> The authentication part is straightforward because a framework with updated 
> ‘principal’ should authenticate with the new ‘principal’ before being allowed 
> to re-register. The ‘authenticated’ map already gets updated when the 
> framework disconnects and reconnects, so it is fine.
> For rate limiting, Master::failoverFramework() needs to be changed to update 
> the principal in the ‘frameworks.principals’ map and also remove the metrics 
> for the old principal if there are no other frameworks with this principal 
> (similar to what we do in Master::removeFramework()).
> The Master::visit() and Master::_visit() should work with the current 
> semantics.
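
The rate-limiting bookkeeping described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration, not the Mesos implementation: `PrincipalBookkeeping`, `addFramework`, `updatePrincipal` and `hasMetricsFor` are hypothetical names, and the two maps only stand in for the roles of ‘frameworks.principals’ and the per-principal metrics. `hasMetricsFor` mirrors the kind of invariant the failed check in the log above seems to guard, namely that the principal of a known framework has metrics.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the master's per-principal bookkeeping.
// None of these names are the real Mesos types or members.
struct PrincipalBookkeeping {
  // Framework ID -> principal, playing the role of 'frameworks.principals'.
  std::map<std::string, std::string> principals;

  // Principal -> message counter, standing in for per-principal metrics.
  std::map<std::string, int> metrics;

  void addFramework(const std::string& frameworkId,
                    const std::string& principal) {
    principals[frameworkId] = principal;
    metrics.emplace(principal, 0);  // No-op if metrics already exist.
  }

  // Assumed failover behaviour: move the framework to the new principal
  // and drop the old principal's metrics once no framework uses it.
  void updatePrincipal(const std::string& frameworkId,
                       const std::string& newPrincipal) {
    auto it = principals.find(frameworkId);
    if (it == principals.end() || it->second == newPrincipal) {
      return;  // Unknown framework or unchanged principal: nothing to do.
    }

    const std::string oldPrincipal = it->second;
    it->second = newPrincipal;
    metrics.emplace(newPrincipal, 0);

    const bool stillUsed = std::any_of(
        principals.begin(), principals.end(),
        [&oldPrincipal](const std::pair<const std::string, std::string>& e) {
          return e.second == oldPrincipal;
        });

    if (!stillUsed) {
      metrics.erase(oldPrincipal);
    }

    // Postcondition: the new principal always has metrics.
    assert(metrics.count(newPrincipal) > 0);
  }

  // True if the framework is known and its principal has metrics.
  bool hasMetricsFor(const std::string& frameworkId) const {
    auto it = principals.find(frameworkId);
    return it != principals.end() && metrics.count(it->second) > 0;
  }
};
```

With two frameworks registered under one principal, moving the first to a new principal keeps the old metrics alive. They are dropped only once the second framework moves as well.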



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
