Hi Zhilin,

Is it possible to get this design doc added to our wiki? I created a design docs page here (https://cwiki.apache.org/confluence/display/TC/Design+Docs). I think it would be good to get the document there so it doesn't get lost over time.

Thanks!
Dave

On Wed, Mar 28, 2018 at 10:41 PM, Zhilin Huang (zhilhuan) <zhilh...@cisco.com> wrote:

> Hi Guys,
>
> Thanks a lot for the discussion. I should have put the design up for review earlier, and sorry for the delay. Here is the link for the design doc:
> https://docs.google.com/document/d/1vgq-pGNoLLYf7Y3cu5hWu67TUKpN5hucrp-ZS9nSsd4/edit?usp=sharing
>
> Short summary for the feature design:
> ---
> There is a feature request from the market to add secondary IP support on edge cache servers, and the functionality to assign a delivery service to a secondary IP of an edge cache.
>
> This feature requires a Traffic Ops implementation to support secondary IP configuration for edge caches, and delivery service assignment to a secondary IP.
>
> Traffic Monitor should also monitor connectivity of the configured secondary IPs. And Traffic Router needs support to resolve the streamer FQDN to the secondary IP assigned in a delivery service.
>
> Traffic Server should record the IP serving the client request, and should reject requests to an IP that is not assigned to the delivery service.
>
> This design has taken compatibility into consideration: if no secondary IP is configured, or some parts of the system have not been upgraded to a version that supports this feature, the traffic will be served by primary IPs as before.
> ---
>
> Replies to Robert's comments are embedded in the email thread. Any further comments are welcome and much appreciated.
>
> Thanks,
> Zhilin
>
> On 29/03/2018, 10:19 AM, "Neil Hao (nbaoping)" <nbaop...@cisco.com> wrote:
>
> Hi Robert/Nir,
>
> Thanks very much for the quick and detailed reply, and sorry that I didn’t make the whole feature clear. Actually, it’s our Secondary IP feature, which is a big feature that will bring changes to all the components in Traffic Control. I thought our teammate had reviewed the design with you guys before, but it seems not.
> And after discussion, we will start the whole feature design review with you guys soon. I think it will be better to continue the discussion after that.
>
> Thanks,
> Neil
>
> On 3/29/18, 1:16 AM, "Robert Butts" <robert.o.bu...@gmail.com> wrote:
>
> I agree with Nir, it's not as simple as changing a structure to `[]URL`, it's a bigger architectural design question.
>
> How do you plan to mark caches Unavailable if they're unhealthy on one interface, but healthy on another?
>
> Right now, Traffic Router needs a boolean for each cache, it doesn't know anything about multiple network interfaces, IPv4 vs IPv6, etc. It only knows the FQDN, which is all the clients it's giving DNS records to will know when they request the cache.
>
> Questions:
> Is a cache marked Unavailable when any interface is unreachable? Or all of them?
> ZH> Actually, we care about IP availability rather than interface availability. Please take a look at 3.1.2 of the design doc.
>
> What if an interface is reachable, but one interface reports different stats than another interface? For example, what if someone configures a different caching proxy (ATS) on each interface?
> ZH> We will only use 1 ATS to serve traffic from all configured IPs.
>
> How are stats aggregated? Should the monitor aggregate all stats from different polls and interfaces together, and consider them the same "server"? If not, how do we reconcile the different stats with what the Monitor reports on `CrStates` and `CacheStats`? If so, again, what happens if different interfaces have different ATS instances, so e.g. the byte count on one is 100, and the other is 1000, then 101, then 1001. It simply won't work. Do we handle that? Or just ignore it, and document "all interfaces must report the same stats"? Do we try to detect that and give a useful error or warning?
> ZH> The bandwidth for interfaces will be aggregated.
> We will only have 1 ATS to serve traffic from all interfaces. The connectivity check is IP based, and the stats collection will be interface based. Please take a look at 3.1.2 of the design doc for details.
>
> In Traffic Ops, servers have specific data used for polling. Traffic Monitor gets the stats URI path from Parameters, and the URI IP from the Servers table. It doesn't use the FQDN, Server Host or Server Domain. Where would these other interfaces come from? Parameters? Or another table linked to the servers table? (I'd really, really rather we didn't put more data in unsafe Parameters, which can not exist, not be properly formatted, need safety checks in all code that ever uses them, and are confusing and opaque to new users) Would these other interfaces be in addition to using the IP from the Server table? Or replace it?
>
> Do we have config options for all of these? Only some of them? In the config file, or Traffic Ops fields?
> ZH> Please take a look at 3.1.1 of the design doc. Basically, we will add new APIs, or new fields to existing APIs. So this feature implementation will not impact existing functionality.
>
> I'd like to hear the use case too, and e.g. why it isn't better to simply make each different interface a different server in Traffic Ops? How is the
> ZH> We discussed this solution too. But the main issue is that running the ort script for one server will overwrite the ATS configuration for another server. The use case is that our customer wants different clients to be served by different IPs. For example, a mobile client will be served by a different IP than a PC client.
> Traffic Router routing to them, anyway? Are you setting up the same DNS record to point to the IPs of all interfaces? How is that configured in
> ZH> For each edge, each DS will be assigned to a single IP. If no secondary IP is specified, it will work just as it does today. Please take a look at 3.1.3 of the design doc.
> Traffic Ops then? I.e. which interfaces are configured as the Server IP and IP6? Are we certain there aren't other issues in other Traffic Control components, with a Server IP and IP6 not having a one-to-one relationship with the FQDN A/AAAA record?
> ZH> Please check 3.1.1 of the design doc. There will be new pages for secondary IP configuration; the current functionality should not be impacted.
>
> Do we need to take the bigger step, of having a Traffic Ops Server have an array of IPs? That's a lot more work (especially making sure it works everywhere, e.g. Traffic Router), but it solves a lot of questions and hackery, gives us a lot more flexibility, and matches the physical reality better.
> ZH> When making this design, we tried to avoid impacting current functionality and to stay compatible with earlier versions. So we add extra tables or fields for secondary IPs.
>
> I'm not opposed to the idea, but we need to think through the architecture, we need to be sure the added complexity is worth it over existing solutions, we need to make all the options (e.g. Unavailable if any vs all) configurable, and we need to make sure the common simple case of a single Server IP and IP6 still works without additional configuration complexity.
> ZH> Yes, agree with you. We are trying not to impact the existing solution. Please take a look at the design doc for more details.
>
> On Wed, Mar 28, 2018 at 10:19 AM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi Eric/Neil,
> > Isn't the question of supporting multi interfaces per server a much wider question? Architectural wise.
> > What would be the desired behavior if the monitoring shows that only one of the interfaces is down? Will the router send traffic to the healthy interfaces? How?
> > Nir
> >
> > On Wed, Mar 28, 2018, 19:10 Eric Friedrich (efriedri) <efrie...@cisco.com> wrote:
> >
> > > The use case behind this question probably deserves a longer dev@ email.
> > >
> > > I will oversimplify: we are extending TC to support multiple IPv4 (or multiple IPv6) addresses per edge cache (across 1 or more NICs).
> > >
> > > Assume all addresses are reachable from the TM.
> > >
> > > -Eric
> > >
> > > > On Mar 28, 2018, at 11:37 AM, Robert Butts <robert.o.bu...@gmail.com> wrote:
> > > >
> > > > When you say different interfaces, do you mean IPv4 versus IPv6? Or something else?
> > > >
> > > > If you mean IPv4 vs IPv6, we have a PR for that from Dylan Volz
> > > > https://github.com/apache/incubator-trafficcontrol/pull/1627
> > > >
> > > > I'm hoping to get to it early next week, just haven't found the time to review and test it yet.
> > > >
> > > > Or did you mean something else by "interface"? Linux network interfaces? Ports?
> > > >
> > > > On Wed, Mar 28, 2018 at 12:02 AM, Neil Hao (nbaoping) <nbaop...@cisco.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Currently, we poll exactly one URL per cache server, for one interface, but now we’d like to add support for multiple interfaces. Therefore, we need multiple requests to query each interface of the cache server. I checked the code of Traffic Monitor, and it seems we don’t support this kind of polling, right?
> > > >>
> > > >> I figured out different ways to support this:
> > > >> 1) The first way: change the ‘Urls’ field in the HttpPollerConfig from ‘map[string]PollConfig’ to ‘map[string][]PollConfig’, so that we can have multiple polling configs to query the info for multiple interfaces.
> > > >>
> > > >> 2) The second way: change the ‘URL’ field in the PollConfig from ‘string’ to ‘[]string’.
> > > >>
> > > >> No matter which way, it seems it will bring a fairly big change to the current polling model. I’m not sure if I’m heading in the right direction; would you guys have suggestions for this?
> > > >>
> > > >> Thanks,
> > > >> Neil
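
For reference, the two options in Neil's original message could be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the actual Traffic Monitor code: only the names `PollConfig`, `Urls`, and `URL` and the map types come from the thread. The type names `PollerConfigOption1`, `PollerConfigOption2`, `PollConfigOption2`, and `PollTarget`, the helper functions, the cache name, and the example URLs are all hypothetical, and the real structs carry more fields than shown here.

```go
package main

import (
	"fmt"
	"sort"
)

// PollConfig stands in for the Traffic Monitor type named in the thread.
// Only the URL field is taken from the emails.
type PollConfig struct {
	URL string
}

// PollerConfigOption1 (hypothetical name) models option 1:
// the Urls map holds a slice of PollConfig per cache.
type PollerConfigOption1 struct {
	Urls map[string][]PollConfig
}

// PollConfigOption2 (hypothetical name) models option 2:
// the single URL string becomes a slice of strings.
type PollConfigOption2 struct {
	URLs []string
}

// PollerConfigOption2 (hypothetical name) keeps one config per cache.
type PollerConfigOption2 struct {
	Urls map[string]PollConfigOption2
}

// PollTarget (hypothetical) is one request the poller would issue.
type PollTarget struct {
	Cache string
	URL   string
}

// sortTargets orders targets by cache name, then URL, so the result is
// stable despite Go's randomized map iteration order.
func sortTargets(ts []PollTarget) []PollTarget {
	sort.Slice(ts, func(i, j int) bool {
		if ts[i].Cache != ts[j].Cache {
			return ts[i].Cache < ts[j].Cache
		}
		return ts[i].URL < ts[j].URL
	})
	return ts
}

// TargetsOption1 flattens the option 1 shape into individual poll targets.
func TargetsOption1(cfg PollerConfigOption1) []PollTarget {
	var out []PollTarget
	for cache, polls := range cfg.Urls {
		for _, p := range polls {
			out = append(out, PollTarget{Cache: cache, URL: p.URL})
		}
	}
	return sortTargets(out)
}

// TargetsOption2 flattens the option 2 shape into individual poll targets.
func TargetsOption2(cfg PollerConfigOption2) []PollTarget {
	var out []PollTarget
	for cache, poll := range cfg.Urls {
		for _, u := range poll.URLs {
			out = append(out, PollTarget{Cache: cache, URL: u})
		}
	}
	return sortTargets(out)
}

func main() {
	// Placeholder URLs using the 192.0.2.0/24 documentation range.
	opt1 := PollerConfigOption1{Urls: map[string][]PollConfig{
		"edge-1": {{URL: "http://192.0.2.10/stats"}, {URL: "http://192.0.2.11/stats"}},
	}}
	opt2 := PollerConfigOption2{Urls: map[string]PollConfigOption2{
		"edge-1": {URLs: []string{"http://192.0.2.10/stats", "http://192.0.2.11/stats"}},
	}}
	fmt.Println(TargetsOption1(opt1))
	fmt.Println(TargetsOption2(opt2))
}
```

Both shapes flatten to the same list of (cache, URL) requests, so the choice between them is mostly about where any per-URL settings would live: option 1 leaves room for a full `PollConfig` per address, while option 2 shares one config across all of a cache's URLs.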