[ 
https://issues.apache.org/jira/browse/MESOS-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252014#comment-16252014
 ] 

Yan Xu commented on MESOS-8223:
-------------------------------

The problem is that this 
[code|https://github.com/apache/mesos/blob/bb2deb3baafffb9a35d1dfbc35b0d43677b0b842/src/master/allocator/mesos/hierarchical.cpp#L447-L460]
 treats frameworks moving off a role and frameworks suppressing a role the same 
way. The former should untrack the framework under that role and the latter 
shouldn't.
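
To illustrate the distinction, here is a minimal, self-contained sketch. It is not the Mesos allocator code: ToySorter, ToyAllocator, updateFramework and revive are hypothetical names invented for this example, and the real sorter interface is only loosely mirrored. The idea is that suppressing a role merely deactivates the framework in that role's sorter, so a later REVIVE still finds it, while moving off a role removes the framework from the sorter.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a per-role sorter (hypothetical, not the Mesos Sorter).
// A client is either untracked, tracked and active, or tracked and inactive.
class ToySorter {
public:
  void add(const std::string& client) { active_[client] = false; }
  void remove(const std::string& client) { active_.erase(client); }

  bool contains(const std::string& client) const {
    return active_.count(client) > 0;
  }

  bool isActive(const std::string& client) const {
    auto it = active_.find(client);
    return it != active_.end() && it->second;
  }

  // Both calls require the client to be tracked, loosely mirroring the
  // "Must be non NULL" check that fails in the log.
  void activate(const std::string& client) {
    assert(contains(client));
    active_[client] = true;
  }

  void deactivate(const std::string& client) {
    assert(contains(client));
    active_[client] = false;
  }

private:
  std::map<std::string, bool> active_;
};

// Toy allocator (hypothetical) that keeps one sorter per role.
class ToyAllocator {
public:
  // `roles` is the framework's full role set; `suppressed` is the subset of
  // those roles for which it currently wants no offers.
  void updateFramework(const std::string& framework,
                       const std::set<std::string>& roles,
                       const std::set<std::string>& suppressed) {
    std::set<std::string>& oldRoles = roles_[framework];

    // Moving off a role: untrack the framework under that role.
    for (const std::string& role : oldRoles) {
      if (roles.count(role) == 0) {
        sorters_[role].remove(framework);
      }
    }

    // Current roles stay tracked; suppression only toggles activation.
    for (const std::string& role : roles) {
      ToySorter& sorter = sorters_[role];
      if (!sorter.contains(framework)) {
        sorter.add(framework);
      }
      if (suppressed.count(role) > 0) {
        sorter.deactivate(framework);
      } else {
        sorter.activate(framework);
      }
    }

    oldRoles = roles;
  }

  // Revive reactivates the framework under every role it still holds.
  void revive(const std::string& framework) {
    for (const std::string& role : roles_[framework]) {
      sorters_[role].activate(framework);
    }
  }

  bool isTracked(const std::string& framework, const std::string& role) const {
    auto it = sorters_.find(role);
    return it != sorters_.end() && it->second.contains(framework);
  }

  bool isActive(const std::string& framework, const std::string& role) const {
    auto it = sorters_.find(role);
    return it != sorters_.end() && it->second.isActive(framework);
  }

private:
  std::map<std::string, ToySorter> sorters_;
  std::map<std::string, std::set<std::string>> roles_;
};
```

With this split, re-subscribing with role "*" suppressed leaves the framework tracked but inactive, so a subsequent revive succeeds. If suppression were handled by calling remove() instead, revive would hit the assertion in activate(), which is analogous to the check failure reported in this issue.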

> Master crashes when suppressed on subscribe is enabled.
> -------------------------------------------------------
>
>                 Key: MESOS-8223
>                 URL: https://issues.apache.org/jira/browse/MESOS-8223
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.4.0
>            Reporter: Yan Xu
>            Priority: Critical
>
> Introduced in MESOS-7015, this feature is not actually turned on due to 
> MESOS-8200. However, once that is addressed and the feature is enabled, the 
> master crashes with:
> {noformat:title=}
> I1113 17:17:37.240901 11285 master.cpp:3309] Disconnecting framework 
> 40f7bdc0-e54b-46da-ace1-48162171baf4-0110 (test-framework)
> I1113 17:17:37.240911 11285 master.cpp:1435] Giving framework 
> 40f7bdc0-e54b-46da-ace1-48162171baf4-0110 (test-framework) 3days to failover
> I1113 17:17:37.241953 11285 master.cpp:2612] Received subscription request 
> for HTTP framework 'test-framework'
> I1113 17:17:37.242807 11285 master.cpp:2748] Subscribing framework 
> 'test-framework' with checkpointing enabled, roles { * } suppressed and 
> capabilities [ SHARED_RESOURCES, TASK_KILLING_STATE ]
> I1113 17:17:37.242820 11285 master.cpp:6994] Updating info for framework 
> 40f7bdc0-e54b-46da-ace1-48162171baf4-0110
> I1113 17:17:37.252637 11270 hierarchical.cpp:380] Activated framework 
> 40f7bdc0-e54b-46da-ace1-48162171baf4-0110
> I1113 17:17:37.272457 11289 master.cpp:7723] Performing implicit task state 
> reconciliation for framework 40f7bdc0-e54b-46da-ace1-48162171baf4-0110 
> (test-framework)
> I1113 17:17:37.272507 11289 master.cpp:7723] Performing implicit task state 
> reconciliation for framework 40f7bdc0-e54b-46da-ace1-48162171baf4-0110 
> (test-framework)
> I1113 17:17:41.966331 11271 master.cpp:5564] Processing REVIVE call for 
> framework 40f7bdc0-e54b-46da-ace1-48162171baf4-0110 (test-framework)
> F1113 17:17:41.966380 11280 sorter.cpp:270] Check failed: 'find(clientPath)' 
> Must be non NULL
> *** Check failure stack trace: ***
>     @     0x7f3467efd0dd  (unknown)
> {noformat}
> This happens when an unsuppressed framework re-registers with suppressed 
> roles and then revives.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
