[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275983#comment-15275983 ] Sidharta Seethana commented on YARN-2140: - Hi [~djp], The sub-tasks address tagging/shaping for outbound network traffic only. There has been no work done from an inbound traffic perspective. > Add support for network IO isolation/scheduling for containers > -- > > Key: YARN-2140 > URL: https://issues.apache.org/jira/browse/YARN-2140 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Wei Yan >Assignee: Sidharta Seethana > Attachments: NetworkAsAResourceDesign.pdf > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274950#comment-15274950 ] Junping Du commented on YARN-2140: -- Hi [~sidharta-s], I noticed all sub jiras are resolved. Do we have any work left to do? If not, we should mark this umbrella as resolved.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610825#comment-14610825 ] Sidharta Seethana commented on YARN-2140: - Hi [~dheeren], the attached design doc only addresses isolation of network bandwidth as a resource, not isolation of the network stack itself. I recommend taking a look at YARN-3611 for new Docker-related functionality; please file a JIRA with the requirements that you have.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610649#comment-14610649 ] Dheeren Beborrtha commented on YARN-2140: - How do you support port-level isolation for Docker containers? For example, let's say I would like to run multiple Docker containers on the same DataNode. If each of the containers needs to be long-running and needs to advertise its ports, what is the mechanism for doing so?
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394546#comment-14394546 ] Do Hoai Nam commented on YARN-2140: --- For the case of ingress traffic, you can check our solution in YARN-2681 (Support bandwidth enforcement for containers while reading from HDFS), https://issues.apache.org/jira/browse/YARN-2681, and the related paper (http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf). Add support for network IO isolation/scheduling for containers -- Key: YARN-2140 URL: https://issues.apache.org/jira/browse/YARN-2140 Project: Hadoop YARN Issue Type: New Feature Reporter: Wei Yan Assignee: Wei Yan Attachments: NetworkAsAResourceDesign.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355395#comment-14355395 ] Bikas Saha commented on YARN-2140: -- This paper may have useful insights into the network sharing issues. http://research.microsoft.com/en-us/um/people/srikanth/data/nsdi11_seawall.pdf
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355420#comment-14355420 ] Sidharta Seethana commented on YARN-2140: - [~bikassaha] Thanks. I ran into this paper (and a couple of others) when looking at [YARN-3|https://issues.apache.org/jira/browse/YARN-3]
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355377#comment-14355377 ] Sidharta Seethana commented on YARN-2140: - You are right: there are several areas to think about here, and we definitely need to put more thought into scheduling. To schedule network resources effectively, we would need to understand a) the overall network topology in place for the cluster in question, that is, the characteristics of the ‘route’ between any two nodes in the cluster (the number of hops required and the available/max bandwidth at each point in the route), and b) application characteristics w.r.t. network utilization: internal/external traffic, latency vs. bandwidth sensitivities, etc. With regard to inbound traffic, we currently do not have a good way to manage it effectively. By the time inbound packets are being ‘examined’ on a given node, they have already consumed bandwidth along the way, and the only options we have are to drop them immediately (we cannot queue on the inbound side) or to let them through. The design document mentions these limitations. One possible approach here could be to let the application provide ‘hints’ for inbound network utilization (not all applications might be able to do this) and to use this information purely for scheduling purposes. This, of course, adds more complexity to scheduling. Needless to say, there are hard problems to solve here, and the (network) scheduling requirements (and potential approaches for implementation) will need further looking into. As a first step, though, I think it makes sense to focus on classification of outbound traffic (net_cls) and maybe basic isolation/enforcement plus collection of metrics. Once we have this in place, we could look at real utilization patterns and decide what the next steps should be.
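Point a) in the comment above can be illustrated with a small sketch. It assumes the scheduler already has a table of available bandwidth per hop; the node names, the table, and both function names are hypothetical and are not taken from the design doc. The usable bandwidth of a route is then bounded by its slowest hop.

```python
from typing import Dict, List, Tuple

# A hop is a directed link between two adjacent points in the topology.
Hop = Tuple[str, str]


def hop_count(route: List[str]) -> int:
    """Number of hops needed to traverse the route."""
    return len(route) - 1


def route_bottleneck_mbit(route: List[str], available_mbit: Dict[Hop, float]) -> float:
    """Smallest available bandwidth (Mbit/s) over all hops of the route.

    Raises KeyError if a hop is missing from the (hypothetical) table.
    """
    if len(route) < 2:
        raise ValueError("a route needs at least two nodes")
    hops = zip(route, route[1:])
    return min(available_mbit[hop] for hop in hops)


# Hypothetical two-rack cluster: node1 -> tor1 -> core -> tor2 -> node7.
available = {
    ("node1", "tor1"): 1000.0,
    ("tor1", "core"): 400.0,   # busy rack uplink
    ("core", "tor2"): 10000.0,
    ("tor2", "node7"): 1000.0,
}
route = ["node1", "tor1", "core", "tor2", "node7"]
bottleneck = route_bottleneck_mbit(route, available)
```

In this made-up example the rack uplink, not either NIC, limits the route, which is why per-node accounting alone would not be enough for scheduling.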
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14333880#comment-14333880 ] Sidharta Seethana commented on YARN-2140: - Hi [~ywskycn] , [~vvasudev] and I have been thinking about how we would approach supporting network bandwidth as a resource in YARN. We have a design doc that we'll post here shortly. Do you mind if we take over this JIRA? Thank you, -Sidharta
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14333883#comment-14333883 ] Sidharta Seethana commented on YARN-2140: - I have attached the design doc. Thanks! -Sidharta
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030284#comment-14030284 ] haosdent commented on YARN-2140: Thx [~ywskycn]. Looking forward to your work.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030643#comment-14030643 ] Robert Joseph Evans commented on YARN-2140: --- We are working on similar things for Storm. I am very interested in your design, because for any streaming system to truly have a chance on YARN, soft guarantees on network I/O are critical. There are several big problems with network I/O even if the user can effectively estimate what they will need. The first is that the resource is not limited to a single node in the cluster. The network has a topology, and a bottleneck can show up at any point in that topology. So you may think you are fine because no node in a rack is scheduled to use the full bandwidth that its network card(s) can support, but you can easily have saturated the top-of-rack switch without knowing it. To solve this problem you effectively have to know the topology of the application itself, so that you can schedule the node-to-node network connections within that application. If users don't know how much network they are going to use at a high level, they will never have any idea at a low level. But then you also have the big problem of batch being very bursty in its network usage. The only way to solve this is going to require network hardware support for prioritizing packets. But I'll wait for your design before writing too much more.
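The top-of-rack concern in the comment above can be made concrete with a rough sketch. It assumes the worst case in which all scheduled egress traffic leaves the rack, and every number, name, and the `RackCheck` type are hypothetical; nothing here reflects an actual YARN API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RackCheck:
    saturated_nodes: int       # nodes scheduled above their NIC capacity
    uplink_demand_mbit: float  # total egress scheduled in the rack
    uplink_saturated: bool     # True if the demand exceeds the uplink


def check_rack(per_node_egress_mbit: Sequence[float],
               nic_mbit: float,
               uplink_mbit: float) -> RackCheck:
    """Compare per-node and rack-level load, assuming all traffic leaves the rack."""
    saturated_nodes = sum(1 for demand in per_node_egress_mbit if demand > nic_mbit)
    total = sum(per_node_egress_mbit)
    return RackCheck(saturated_nodes, total, total > uplink_mbit)


# Hypothetical rack: 20 nodes, each scheduled at 600 Mbit/s on a 1 Gbit/s NIC,
# behind a 10 Gbit/s top-of-rack uplink.
check = check_rack([600.0] * 20, nic_mbit=1000.0, uplink_mbit=10000.0)
```

Here no single node looks overloaded, yet the combined demand exceeds the uplink, which is the situation the comment warns about.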
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030750#comment-14030750 ] Wei Yan commented on YARN-2140: --- Thanks for the comments, [~revans2].
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030175#comment-14030175 ] Beckham007 commented on YARN-2140: -- I think we could use the net_cls subsystem of cgroups to handle this. First, we would need to refactor org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler to support various resources, not only CPU.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030220#comment-14030220 ] haosdent commented on YARN-2140: net_cls just classifies packets, so cgroups alone are not enough to do network IO isolation/scheduling. I have tried tc together with net_cls, but they don't do well at network IO isolation/scheduling, and they don't have any effect on incoming packets.
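To illustrate the point that net_cls only tags traffic: the controller stores a class identifier in the cgroup's net_cls.classid file in the form 0xAAAABBBB, where AAAA is the major and BBBB the minor number of a tc handle, and tc then has to match on that identifier to do any shaping. The helper names below are hypothetical; this is only a sketch of the encoding, not code from any YARN patch.

```python
def classid_from_handle(handle: str) -> int:
    """Convert a tc handle such as "10:1" (hex major:minor) to a net_cls.classid value."""
    major_text, minor_text = handle.split(":")
    major = int(major_text, 16)
    minor = int(minor_text, 16)
    if not (0 <= major <= 0xFFFF and 0 <= minor <= 0xFFFF):
        raise ValueError(f"handle out of range: {handle}")
    return (major << 16) | minor


def handle_from_classid(classid: int) -> str:
    """Convert a net_cls.classid value back to a tc handle string."""
    if not 0 <= classid <= 0xFFFFFFFF:
        raise ValueError(f"classid out of range: {classid}")
    return f"{classid >> 16:x}:{classid & 0xFFFF:x}"


# The value that would be written to a container cgroup's net_cls.classid
# so that its packets carry the tc class 10:1.
value = classid_from_handle("10:1")
```

Writing this value into a cgroup only labels the packets sent by its tasks; a tc qdisc, class, and cgroup filter on the device are still needed before the label has any effect.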
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030264#comment-14030264 ] Beckham007 commented on YARN-2140: -- The following tc command could be used to control the min and max bandwidth of each container: tc class add dev ${net_dev} parent ${parent_classid} classid ${classid} htb rate ${guaranteed_bandwidth}kbps ceil ${max_bandwidth}kbps
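A minimal sketch of how such commands might be assembled per container is shown below. It only builds argument lists and does not run tc; the function names, the device name, and the class IDs are hypothetical. Note that in tc the suffix kbps means kilobytes per second while kbit means kilobits per second, so the sketch uses kbit to keep the unit explicit.

```python
from typing import List


def htb_class_cmd(dev: str, parent: str, classid: str,
                  rate_kbit: int, ceil_kbit: int) -> List[str]:
    """Build a 'tc class add' command for an HTB class.

    rate_kbit is the guaranteed bandwidth and ceil_kbit the maximum,
    both in kilobits per second.
    """
    if rate_kbit <= 0 or ceil_kbit < rate_kbit:
        raise ValueError("need 0 < rate <= ceil")
    return ["tc", "class", "add", "dev", dev, "parent", parent,
            "classid", classid, "htb",
            "rate", f"{rate_kbit}kbit", "ceil", f"{ceil_kbit}kbit"]


def cgroup_filter_cmd(dev: str, parent: str) -> List[str]:
    """Build the 'tc filter add' command that matches packets by net_cls classid."""
    return ["tc", "filter", "add", "dev", dev, "parent", parent,
            "protocol", "ip", "prio", "10", "handle", "1:", "cgroup"]


# Hypothetical container class 10:1 on eth0: 40 Mbit/s guaranteed, 100 Mbit/s max.
class_cmd = htb_class_cmd("eth0", "10:", "10:1", 40000, 100000)
filter_cmd = cgroup_filter_cmd("eth0", "10:")
```

The lists could be passed to subprocess.run on a host where an HTB root qdisc with handle 10: already exists; that setup step and the required privileges are outside this sketch.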
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030271#comment-14030271 ] Wei Yan commented on YARN-2140: --- [~haosd...@gmail.com], [~beckham007], net_cls can be used to limit the network bandwidth used by each task per device. One problem here is that it is not easy for users to specify an accurate network bandwidth requirement for their application. I'm still working on the design.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026672#comment-14026672 ] Bikas Saha commented on YARN-2140: -- [~ywskycn] For this and YARN-2139 my suggestion would be to first post a design sketch and discuss some alternatives. You may prototype some approach to get supporting data for that design doc. This will help get community interaction and understanding for your proposal and enable quicker progress.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026674#comment-14026674 ] Wei Yan commented on YARN-2140: --- Thanks, [~bikassaha]. Yes, I'm working on that part.
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026675#comment-14026675 ] Sandy Ryza commented on YARN-2140: -- +1 to a design sketch with alternatives
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027332#comment-14027332 ] haosdent commented on YARN-2140: Cool !
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026054#comment-14026054 ] haosdent commented on YARN-2140: How to implement this? Cgroup?