[ 
https://issues.apache.org/jira/browse/MESOS-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15318763#comment-15318763
 ] 

Avinash Sridharan commented on MESOS-5545:
------------------------------------------

[~fan.du] LLDP is a single-hop protocol (at layer 2), so I am not sure how you 
would use LLDP to get the `network topology` information. My guess is that what 
you want is just the layer 2 next-hop TOR switch to which the Agent is 
connected, which I agree LLDP can provide. If you do want the full network 
topology, you will have to parse the SNMP MIB of every TOR switch in the 
cluster, but I am not sure that is actually required here.

As for introducing an actor in the Agent to use LLDP, one problem I see is 
that, since LLDP is a layer 2 protocol, you will need a raw socket to 
communicate with other LLDP hosts (the TOR switch in this case). AFAIK 
libprocess talks HTTP and protobufs (over TCP), so infrastructure to talk over 
raw sockets might have to be added to support LLDP. 
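
To make the raw-socket point concrete, here is a rough sketch of what reading 
the TOR switch's identity over LLDP could look like outside of libprocess. It 
is only an illustration, not a proposal for the Agent: the function names are 
made up, the capture part assumes Linux `AF_PACKET` sockets with CAP_NET_RAW, 
and it assumes untagged frames. The EtherType (0x88CC) and the TLV layout 
(7-bit type, 9-bit length) come from IEEE 802.1AB.

```python
import socket
import struct

LLDP_ETHERTYPE = 0x88CC  # EtherType assigned to LLDP (IEEE 802.1AB)

# TLV type codes defined by IEEE 802.1AB
TLV_END, TLV_CHASSIS_ID, TLV_PORT_ID, TLV_TTL, TLV_SYSTEM_NAME = 0, 1, 2, 3, 5


def parse_lldp_tlvs(payload: bytes) -> dict:
    """Walk the TLVs of an LLDPDU (the frame body after the Ethernet header).

    Returns a dict mapping TLV type to raw value bytes. If a type repeats,
    the last occurrence wins, which is enough for this sketch.
    """
    tlvs = {}
    offset = 0
    while offset + 2 <= len(payload):
        (header,) = struct.unpack_from("!H", payload, offset)
        tlv_type = header >> 9      # upper 7 bits
        tlv_len = header & 0x01FF   # lower 9 bits
        offset += 2
        if tlv_type == TLV_END:
            break
        tlvs[tlv_type] = payload[offset:offset + tlv_len]
        offset += tlv_len
    return tlvs


def neighbor_system_name(payload: bytes):
    """Return the neighbor's advertised system name, or None if absent."""
    value = parse_lldp_tlvs(payload).get(TLV_SYSTEM_NAME)
    if value is None:
        return None
    return value.decode("utf-8", errors="replace")


def read_one_lldp_payload(interface: str) -> bytes:
    """Block until one LLDP frame arrives on `interface` (hypothetical helper).

    Linux-only and needs CAP_NET_RAW. Depending on the NIC, the socket may
    also have to join the LLDP multicast group (01:80:C2:00:00:0E) or the
    interface may have to be in promiscuous mode; that is omitted here.
    """
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(LLDP_ETHERTYPE)) as sock:
        sock.bind((interface, LLDP_ETHERTYPE))
        frame = sock.recv(1600)
    return frame[14:]  # strip the 14-byte Ethernet header (no VLAN tag assumed)
```

Something like `neighbor_system_name(read_one_lldp_payload("eth0"))` would then 
yield the name the directly attached switch advertises, if it sends a System 
Name TLV at all. None of this fits the HTTP/protobuf model that libprocess 
uses today, which is the concern above.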

Also, this will work only if the Mesos Agent is running on bare metal. Since 
LLDP is a 1-hop protocol, an Agent running on a VM will end up talking to the 
vSwitch in the hypervisor, which will not forward the LLDP frames to the TOR. 

I am not that familiar with InfiniBand, but I am guessing the same issues will 
exist with InfiniBand as well. 

I do agree with [[email protected]] comments that it doesn't make sense to 
make RACKID a first-class field in `SlaveInfo`, since this information is not 
going to be universally available. I am not familiar with how attributes are 
set with resources, but I would think something like attributes or labels 
(whichever can act as dynamic metadata for resources) should be used to 
indicate RACKID information to the frameworks.
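
As a rough illustration of the attributes route, the sketch below renders a 
rack identifier in the `key:value;key:value` form that the agent's 
`--attributes` flag accepts (for example `rack:2;u:1`). The helper name and the 
example rack value are hypothetical, and the validation is deliberately 
minimal.

```python
def format_agent_attributes(attrs: dict) -> str:
    """Render a dict as a value for the agent's --attributes flag.

    Hypothetical helper: keys and values are joined as "key:value" pairs
    separated by ";". Separator characters inside a key, or ";" inside a
    value, are rejected rather than escaped.
    """
    pairs = []
    for key, value in attrs.items():
        key, value = str(key), str(value)
        if not key or ":" in key or ";" in key or ";" in value:
            raise ValueError(f"cannot encode attribute {key!r}: {value!r}")
        pairs.append(f"{key}:{value}")
    return ";".join(pairs)
```

For example, `format_agent_attributes({"rack": "tor-sw-17"})` produces 
`rack:tor-sw-17`, which could be passed as `--attributes=rack:tor-sw-17` when 
starting the agent, so that frameworks see the rack as ordinary attribute 
metadata in offers.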

> Add rack awareness support for Mesos resources
> ----------------------------------------------
>
>                 Key: MESOS-5545
>                 URL: https://issues.apache.org/jira/browse/MESOS-5545
>             Project: Mesos
>          Issue Type: Story
>          Components: hadoop, master
>            Reporter: Fan Du
>         Attachments: RackAwarenessforMesos-Lite.pdf
>
>
> Resources managed by the Mesos master carry no topology information about 
> the cluster, for example rack topology, while many data center applications 
> have a rack awareness feature to provide data locality, fault tolerance, and 
> intelligent task placement. This ticket investigates how to add rack 
> awareness to the Mesos resource topology.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
