[ https://issues.apache.org/jira/browse/MESOS-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944011#comment-16944011 ]
Benjamin Mahler commented on MESOS-9889:
----------------------------------------

Targeting for backporting, since this seems to cause a serious performance issue if triggered.

> Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave
> -----------------------------------------------------------------------------------
>
>                 Key: MESOS-9889
>                 URL: https://issues.apache.org/jira/browse/MESOS-9889
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: haosdent
>            Assignee: Benjamin Mahler
>            Priority: Critical
>              Labels: foundations
>
> At https://github.com/apache/mesos/blob/9932550e9632e7fbb9a45b217793c7f508f57001/src/master/master.cpp#L7707-L7708
> {code}
> void Master::__reregisterSlave(
> ...
>   foreachkey (FrameworkID frameworkId,
>               slaves.unreachableTasks.at(slaveInfo.id())) {
> ...
>     foreach (TaskID taskId,
>              slaves.unreachableTasks.at(slaveInfo.id()).get(frameworkId)) {
> {code}
> In our case, when the network flaps and 3-4 agents reregister, the master's CPU is saturated and it cannot process any requests during that period.
> After this change:
> {code}
> -  foreachkey (FrameworkID frameworkId,
> -              slaves.unreachableTasks.at(slaveInfo.id())) {
> +  foreach (FrameworkID frameworkId,
> +           slaves.unreachableTasks.at(slaveInfo.id()).keys()) {
> {code}
> the problem is gone.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
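
One plausible reading of the patch quoted above, offered as an assumption rather than something the report states: if the per-agent container is multimap-like, iterating its entries visits a framework once per task, while iterating a deduplicated key set visits it once. The sketch below illustrates that difference with std::unordered_multimap as a stand-in. TaskMap, visitsPerEntry and visitsPerUniqueKey are hypothetical names, and none of this is the Mesos or stout code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for a per-agent framework -> task multimap.
using TaskMap = std::unordered_multimap<std::string, std::string>;

// Shape of the original loop (assumed): the outer loop walks every
// (framework, task) entry, so a framework with n tasks is visited n times,
// and each visit scans all n of its tasks again.
std::size_t visitsPerEntry(const TaskMap& tasks) {
  std::size_t visits = 0;
  for (const auto& entry : tasks) {
    auto range = tasks.equal_range(entry.first);
    for (auto it = range.first; it != range.second; ++it) {
      ++visits;
    }
  }
  return visits;
}

// Shape of the patched loop (assumed): collect the distinct keys first,
// then scan each framework's tasks exactly once.
std::size_t visitsPerUniqueKey(const TaskMap& tasks) {
  std::unordered_set<std::string> keys;
  for (const auto& entry : tasks) {
    keys.insert(entry.first);
  }

  std::size_t visits = 0;
  for (const auto& key : keys) {
    auto range = tasks.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
      ++visits;
    }
  }
  return visits;
}
```

Under this assumption, the first function does work proportional to the sum of squared per-framework task counts, while the second does work proportional to the total number of tasks. That would be consistent with the CPU saturation described in the report when several agents with many tasks reregister at once.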