[
https://issues.apache.org/jira/browse/MESOS-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317763#comment-15317763
]
Fan Du edited comment on MESOS-5545 at 6/7/16 3:35 AM:
-------------------------------------------------------
[~vinodkone] Thanks for the comments.
Rack topology information does not fall within the scope of the network
isolator, because it is not something that can or should be isolated.
Here is why I think it is fine for rack topology information to be updated:
The rack state can only transition from "no rack information" to "valid rack
information". In other words, tasks may already be using resources that carry
no rack information when agents later report their rack id to the master. The
logic could then follow one or all of these design decisions: a) notify the
corresponding frameworks of the updated rack id for previously allocated
resources, b) tag subsequent allocations with the agents' rack id, c) tag
resources freed by a framework with the rack id for the next allocation round.
This scenario is simpler and cleaner than attribute updates. Alternatively,
only activate an agent for resource allocation once it has a valid rack id.
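To make the one-way transition concrete, here is a minimal sketch in Python.
It is not Mesos code: the class AgentRackState and its methods are hypothetical
names used only for illustration, and rejecting a conflicting rack id is an
assumption about how a change of rack would be handled.

```python
from typing import Optional


class AgentRackState:
    """Hypothetical model of one agent's rack information (not a Mesos API)."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id
        # An agent may register before its rack id is known.
        self.rack_id: Optional[str] = None

    def update_rack(self, rack_id: str) -> bool:
        """Apply a rack id report; return True only if the state changed."""
        if not rack_id:
            raise ValueError("rack id must be non-empty")
        if self.rack_id is None:
            # The only allowed transition: no rack info -> valid rack info.
            self.rack_id = rack_id
            return True
        if self.rack_id != rack_id:
            # Assumption: a different rack id is treated as an error.
            raise ValueError(
                f"agent {self.agent_id} already has rack id {self.rack_id}"
            )
        return False  # Same rack id reported again: nothing to do.

    def allocatable(self) -> bool:
        """Alternative policy: offer resources only once the rack id is known."""
        return self.rack_id is not None
```

A True return from update_rack is the point where options a) to c) would be
triggered; allocatable() models the alternative of holding an agent back until
it has reported a valid rack id.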
Using attributes is one way to export the rack information, but I don't think
that's practical in production: at a scale of 10000+ servers, would we really
set attributes with rack information from third-party logic and then start the
agents?! Automatically exposing the rack information could save a lot of
deployment and maintenance effort.
Apologies, it seems I don't quite get what you mean by "first class field".
Influencing allocation decisions is not the intention of this ticket; I believe
that part of the work is out of scope, which is why I put it in the Future
section of the design doc. The allocation strategy DOES honor DRF; the current
implementation does the allocation on a per-agent basis, and we could
investigate different allocation modes.
In addition, I'd prefer arranging agents on a per-rack basis, because randomly
shuffling agents at a scale of 10000+ nodes on every allocation iteration is
not good. IIRC, this number has grown.
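As a rough illustration of the per-rack arrangement, the sketch below groups
agents by rack id once and then walks them rack by rack. It is not the Mesos
allocator: group_by_rack and rack_order are hypothetical helpers, and placing
agents without a rack id last is an assumption made only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def group_by_rack(
    agents: Iterable[Tuple[str, Optional[str]]]
) -> Dict[Optional[str], List[str]]:
    """Bucket (agent_id, rack_id) pairs by rack id, keeping arrival order."""
    racks: Dict[Optional[str], List[str]] = defaultdict(list)
    for agent_id, rack_id in agents:
        racks[rack_id].append(agent_id)
    return dict(racks)


def rack_order(racks: Dict[Optional[str], List[str]]) -> Iterator[str]:
    """Yield agents rack by rack; agents with no rack id come last."""
    for rack_id in sorted(r for r in racks if r is not None):
        yield from racks[rack_id]
    # Assumption: agents that have not reported a rack id are visited last.
    yield from racks.get(None, [])


agents = [("a1", "r2"), ("a2", "r1"), ("a3", None), ("a4", "r2")]
print(list(rack_order(group_by_rack(agents))))
```

The grouping only needs to be updated when an agent joins or reports its rack
id, so each allocation iteration can reuse it.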
All in all, IMHO, it's a good feature for Mesos; the question is how to do it
elegantly. :)
> Add rack awareness support for Mesos resources
> ----------------------------------------------
>
> Key: MESOS-5545
> URL: https://issues.apache.org/jira/browse/MESOS-5545
> Project: Mesos
> Issue Type: Story
> Components: hadoop, master
> Reporter: Fan Du
>
> Resources managed by the Mesos master have no topology information about the
> cluster, for example rack topology, while many data center applications
> have rack awareness features that provide data locality, fault tolerance and
> intelligent task placement. This ticket investigates how to add rack
> awareness to Mesos resource topology.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)