> On May 17, 2017, 2:14 p.m., Jie Yu wrote:
> > src/slave/flags.cpp
> > Lines 770-786 (patched)
> > <https://reviews.apache.org/r/59294/diff/1/?file=1719990#file1719990line770>
> >
> >     This sounds like a heuristic. Any justification for this heuristic?
> >
> >     Wondering if a label-based solution is better? For instance, the isolator
> >     will look for a special label on the task/executor. The label specifies the
> >     egress rate limit, which can override the default rate limit. Something
> >     along these lines?
> >
> >     Then the custom logic can be injected into a label decorator, rather
> >     than making it first class here?
>
> Ian Downes wrote:
>     It's not really a heuristic, it's a simple linear model with min/max. The
>     major benefit is that it enables more effective allocation of a host's egress
>     bandwidth without exposing bandwidth as a resource. A fixed egress bandwidth
>     allocates poorly for either a small number of very large containers
>     (underutilizing) or a large number of small containers (overcommitting).
>     Scaling with CPU means a large container can get a larger share of the
>     bandwidth.
>
>     I thought about a label-based solution but this doesn't work well with a
>     heterogeneous cluster. We have a mix of 1G and 10G hosts and we'd like to use
>     a different egress_rate_per_cpu depending on the link speed, e.g., 40 Mbps /
>     core for 1G and 120 Mbps / core for 10G. The scheduler doesn't (and
>     shouldn't) know the specifics of hosts beyond resources, so unless we make
>     bandwidth a first-class resource I think the logic should be at the isolator.
>     Host bandwidth could be exposed via an agent attribute but that's *really*
>     breaking the resource abstraction.

To add to the point: scaling network bandwidth with CPU is typical in the
cloud, and we are just mirroring the same feature here.

"Each core is subject to a 2 Gbits/second (Gbps) cap for peak performance.
Each additional core increases the network cap, up to a theoretical maximum
of 16 Gbps for each virtual machine; however, the actual performance you
experience can vary depending on your workload."

https://cloud.google.com/compute/docs/networks-and-firewalls


- Santhosh Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59294/#review175114
-----------------------------------------------------------


On May 26, 2017, 11:23 a.m., Ian Downes wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59294/
> -----------------------------------------------------------
>
> (Updated May 26, 2017, 11:23 a.m.)
>
>
> Review request for mesos, Dmitry Zhuk, Ilya Pronin, and Jie Yu.
>
>
> Bugs: MESOS-7508
>     https://issues.apache.org/jira/browse/MESOS-7508
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Add support to isolators/port_mapping for optionally scaling egress bandwidth
> with CPU and with minimum and maximum limits.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/mesos/isolators/network/port_mapping.hpp 9d38289c7161d5e931053b587d115684ccc44c94
>   src/slave/containerizer/mesos/isolators/network/port_mapping.cpp cd008aaebcd42554a9a81d2b059269546f59c966
>   src/slave/flags.hpp b66995630f89dfb95a6d0cf66efc5d7590e90cbc
>   src/slave/flags.cpp 0c8276e425a6a7d22ee68edc6cc25b331635ec44
>   src/tests/containerizer/port_mapping_tests.cpp d062f2f6bcf7b44dbcde951cdca23b0a2cd42115
>
>
> Diff: https://reviews.apache.org/r/59294/diff/2/
>
>
> Testing
> -------
>
> # added a new test
> $ make check
>
>
> Thanks,
>
> Ian Downes
>
>
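
For readers following the thread, the "simple linear model with min/max" discussed above can be sketched roughly as follows. This is an illustration only, not the code in the patch: the struct `EgressRateConfig`, its field names, and the function `egressRateMbps` are hypothetical, and the minimum and maximum values used in the usage note are made up. Only the 40 Mbps / core and 120 Mbps / core figures come from the discussion.

```cpp
#include <algorithm>

// Hypothetical parameter bundle for illustration; these names do not come
// from the patch or from the Mesos agent flags.
struct EgressRateConfig {
  double ratePerCpuMbps;  // e.g. 40 on 1G hosts, 120 on 10G hosts
  double minMbps;         // floor, so very small containers still get some egress
  double maxMbps;         // ceiling, so very large containers cannot take the whole link
};

// Linear in the container's CPU allocation, then clamped to [minMbps, maxMbps].
// Assumes minMbps <= maxMbps.
double egressRateMbps(double cpus, const EgressRateConfig& config) {
  const double scaled = cpus * config.ratePerCpuMbps;
  return std::min(std::max(scaled, config.minMbps), config.maxMbps);
}
```

With an assumed config of {40, 50, 1000}, a 0.5-CPU container would be raised to the 50 Mbps floor and a 4-CPU container would get 160 Mbps; with {120, 50, 1000}, a 16-CPU container would be capped at 1000 Mbps. Because the parameters are per agent, 1G and 10G hosts can use different rates without the scheduler knowing anything about link speed.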
