Replies inline

On Wed, Mar 14, 2018 at 10:09 AM, Eric Friedrich (efriedri) <efrie...@cisco.com> wrote:
> How much does distance to origin actually impact a Live delivery service?
>
> It’s only the first client for each segment that has to go back to the
> origin; the rest of the responses are cached, so they don’t touch the origin
> at all.
Good point, maybe this has a greater benefit for HTTP_NO_CACHE DSes, but
lower latency is always a good thing.

>
> What’s the use case for HTTP_NO_CACHE with Client Steering? Is there a reason
> plain-old MultiSiteOrigin wouldn’t work for you on the NO_CACHE delivery
> services?

We can't currently use MSO for DS-types that bypass the mid tier. Even
if we did get that to work though, we wouldn't get the goodness of
CLIENT_STEERING in being able to enforce specific ordering/weighting
(or a mix of both) between targets on the fly for things like
maintenance, capacity differences, beta testing, etc.

- Rawlin

>
> - Eric
>
>> On Mar 14, 2018, at 11:45 AM, Rawlin Peters <rawlin.pet...@gmail.com> wrote:
>>
>> Yes, I'd say that's essentially the main goal - prioritizing redundant
>> HTTP_LIVE/HTTP_NO_CACHE deliveryservices to have the shortest distance
>> between the edge and the origin. HTTP-type deliveryservices that use
>> the MID tier, though, don't really make sense for this feature, so we
>> might want to limit geo-steering targets to just
>> HTTP_LIVE/HTTP_NO_CACHE.
>>
>> On Wed, Mar 14, 2018 at 8:56 AM, Eric Friedrich (efriedri)
>> <efrie...@cisco.com> wrote:
>>> I understand the goals behind Client Steering Delivery Services, but I
>>> don’t fully understand the motivation behind these changes.
>>>
>>> We want redundancy in our live delivery services. Take a national channel
>>> and make two copies of it on two origins. Clients can choose DiscoveryA or
>>> DiscoveryB based on which one is working better (or at all) for them. All
>>> CGs would need both Delivery Services assigned to account for failures
>>> where caches holding either just A or just B might go offline.
>>>
>>> OriginA and OriginB would typically be in different locations for the most
>>> redundancy.
>>>
>>> If I’m a client in Boston and I ask for Discovery channel, TR should give
>>> me redirects to DiscoveryA and DiscoveryB both in the Boston cache group.
>>>
>>> Is the goal behind this feature for TR to prioritize the DS list given to
>>> the client based on how far origin A is from Boston vs. how far B is from
>>> Boston?
>>>
>>> - Eric
>>>
>>>> On Mar 13, 2018, at 1:42 PM, Rawlin Peters <rawlin.pet...@gmail.com> wrote:
>>>>
>>>> replies inline
>>>>
>>>> On Mon, Mar 12, 2018 at 5:21 PM, Nir Sopher <n...@qwilt.com> wrote:
>>>>> Thank you Rawlin for the clarification :)
>>>>
>>>> You're welcome. Anything I can do to help :)
>>>>
>>>>>
>>>>> Still, I feel like I'm missing a piece of the puzzle here.
>>>>> Maybe I do not understand the relation between "origin" and "steering target".
>>>>>
>>>>> As I see it, the router's job is to send end users to the optimal cache. It
>>>>> has 2 tools for doing so: CZF and Geo.
>>>>> Using the CZF is preferable, as it is based on the real network topology.
>>>>> Geo is a best effort solution, used when we cannot do better. It is not
>>>>> necessarily optimal, and has GEO misses, but we must use it since we cannot
>>>>> map all IPs.
>>>>
>>>>
>>>> Yes, the client's location will be found from the CZF first, falling
>>>> back to GEO upon a CZF-miss. Then the optimal edge cachegroup is
>>>> chosen for each steering target deliveryservice. Then, the resulting
>>>> list of target deliveryservices will be sorted by total distance
>>>> following the path from client -> edge -> origin.
>>>>
>>>>>
>>>>> The cache's job is to fetch the content and serve the user.
>>>>> It can be optimized to bring the content from the optimal Origin. It can be
>>>>> configured to do so by specifying the best origin per cache group (in ops
>>>>> DB).
>>>>
>>>>
>>>> This is intentionally done as a CLIENT_STEERING deliveryservice so
>>>> that a smart client can make the decision to use a different
>>>> deliveryservice upon failure.
>>>> If this decision were made at the caching
>>>> proxy level, it would end up being like an optimized version of MSO
>>>> (multi-site origin) where the client only has a single URL to request
>>>> and the optimal origin among multiple origins is chosen by the
>>>> caching proxy. I don't think that's a bad idea; it's just not the
>>>> architecture we want for this. By doing it as client steering we can
>>>> also assign weights/ordering between colocated origins and update
>>>> those steering assignments at any time. We can form the steering
>>>> target list very flexibly this way.
>>>>
>>>>
>>>>> I might be naive here, but as the number of cache groups is reasonable, and
>>>>> their network location is much clearer than the end user location, the
>>>>> mapping and configuration would be reasonable. Therefore, using sub-optimal
>>>>> Geo as a tool for choosing the Origin can be avoided.
>>>>
>>>>
>>>> In practice, you could set the coordinates of the Origin to that of
>>>> the optimal cachegroup, rather than assigning the Origin directly
>>>> to said cachegroup. The effect would be the same, I believe.
>>>>
>>>>>
>>>>> I also did not understand if the suggestion is to use the client location
>>>>> for choosing the origin, or the cache group location for choosing the
>>>>> origin.
>>>>> Using the client location for choosing the origin practically ignores the
>>>>> accurate information provided by the CZF.
>>>>
>>>>
>>>> It's a combination of the client location, the edge location, and the
>>>> origin location (total distance from client -> edge -> origin).
>>>>
>>>>>
>>>>> What am I missing?
>>>>> 10x
>>>>> Nir
>>>>>
>>>>> On Mon, Mar 12, 2018 at 11:19 PM, Rawlin Peters <rawlin.pet...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Nir,
>>>>>>
>>>>>> I think part of the motivation for doing this in Traffic Router rather
>>>>>> than the Caching Proxy is separation of concerns.
>>>>>> TR is already
>>>>>> concerned with routing a client to the best cache based upon the
>>>>>> client's location, so TR is already well-equipped to make the decision
>>>>>> of how Delivery Services (origins) should be prioritized based upon
>>>>>> the client's location. That way the Caching Proxy (e.g. ATS) doesn't
>>>>>> need to concern itself with its own location, the client's location,
>>>>>> and the location of origins; it just needs to know how to get the
>>>>>> origin's content and cache it. All the client needs to know is that
>>>>>> they have a prioritized list of URLs to choose from; they don't need
>>>>>> to be concerned about origin/edge locations because that
>>>>>> prioritization will be made for them by TR.
>>>>>>
>>>>>> The target DSes will have different origins primarily because they
>>>>>> will be in different locations, and the origins should be
>>>>>> interchangeable in terms of the content they provide because a smart
>>>>>> client may fail over to any of the target DSes in a CLIENT_STEERING DS
>>>>>> for the same content.
>>>>>>
>>>>>> - Rawlin
>>>>>>
>>>>>> On Mon, Mar 12, 2018 at 2:37 PM, Nir Sopher <n...@qwilt.com> wrote:
>>>>>>> Hi Rawlin,
>>>>>>> Can you please add a few words on the motivation behind basing the steering
>>>>>>> target selection on the location of the client?
>>>>>>> As the content goes through the caches, isn't it the job of the cache to
>>>>>>> select the best origin for the cache? Why should the client be the one to
>>>>>>> take the origin location into consideration?
>>>>>>> Why do the target DSes have different origins in the first place? Do they
>>>>>>> have different characteristics in addition to their location?
>>>>>>> Thanks,
>>>>>>> Nir
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Rawlin Peters <rawlin.pet...@gmail.com>
>>>>>>> Date: Mon, Mar 12, 2018 at 9:46 PM
>>>>>>> Subject: Delivery Service Origin Refactor
>>>>>>> To: dev@trafficcontrol.incubator.apache.org
>>>>>>>
>>>>>>>
>>>>>>> Hey folks,
>>>>>>>
>>>>>>> As promised, this email thread will be to discuss how to best
>>>>>>> associate an Origin Latitude/Longitude with a Delivery Service,
>>>>>>> primarily so that steering targets can be ordered/sent to the client
>>>>>>> based upon the location of those targets (i.e. the Origin), a.k.a.
>>>>>>> Steering Target Geo-Ordering. This is potentially going to be a pretty
>>>>>>> large change, so all your feedback/questions/concerns are appreciated.
>>>>>>>
>>>>>>> Here were a handful of bad ideas I had in order to accomplish this DS
>>>>>>> Origin Lat/Long association (feel free to skip to PROPOSED SOLUTION
>>>>>>> below):
>>>>>>>
>>>>>>> 1. Reuse the current MSO (multisite origin) backend (i.e. add the
>>>>>>> origin into the servers table, give it a lat/long from its cachegroup,
>>>>>>> assign the origin server to the DS)
>>>>>>> Pros:
>>>>>>> - reuse of existing db schema, probably wouldn't have to add any new
>>>>>>> tables/columns
>>>>>>> Cons:
>>>>>>> - MSO configuration is already very complex
>>>>>>> - for the simple case of just wanting to give an Origin a lat/long you
>>>>>>> have to create a server (of which only a few fields make sense for an
>>>>>>> Origin), add it to a cachegroup (only name and lat/long make sense,
>>>>>>> won't use parent relationships, isn't really a "group" of origins),
>>>>>>> assign it to a server profile (have to create one first, no parameters
>>>>>>> are needed), and finally assign that Origin server to the delivery
>>>>>>> service (did I miss anything?)
>>>>>>>
>>>>>>> 2. Add Origin lat/long columns to the deliveryservice table
>>>>>>> Pros:
>>>>>>> - probably the most straightforward solution for Steering Target
>>>>>>> Geo-Ordering given that Origin FQDN is currently a DS field.
>>>>>>> Cons:
>>>>>>> - doesn't work well with MSO
>>>>>>> - could be confused with Default Miss Lat/Long
>>>>>>> - if two different delivery services use colocated origins, the same
>>>>>>> lat/long needs to be entered twice
>>>>>>> - adds yet another column to the crowded deliveryservice table
>>>>>>>
>>>>>>> 3. Add origin lat/long parameters to a Delivery Service Profile
>>>>>>> Pros:
>>>>>>> - Delivery Services using colocated origins could share the same profile
>>>>>>> - no DB schema updates needed
>>>>>>> Cons:
>>>>>>> - profile parameters lack validation
>>>>>>> - still doesn't support lat/long for multiple origins associated with a DS
>>>>>>>
>>>>>>> 4. Add the lat/long to the steering target itself (i.e. where you
>>>>>>> choose weight/order, you'd also enter lat/long)
>>>>>>> Pros:
>>>>>>> - probably the easiest/quickest solution in terms of development
>>>>>>> Cons:
>>>>>>> - only applies lat/long to a steering target
>>>>>>> - using the same target in multiple Steering DSes means having to keep
>>>>>>> the lat/long synced between them all
>>>>>>> - lat/long not easily reused by other areas that may need it in the future
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> PROPOSED SOLUTION:
>>>>>>>
>>>>>>> All of those ideas were suboptimal, which is why I think we need to:
>>>>>>> 1. Split Locations out of the cachegroup table into their own table
>>>>>>> with the following columns (cachegroup would have a foreign key to
>>>>>>> Location):
>>>>>>> - name
>>>>>>> - latitude
>>>>>>> - longitude
>>>>>>>
>>>>>>> 2. Split Origins out of the server and deliveryservice tables into
>>>>>>> their own table with the following columns:
>>>>>>> - fqdn
>>>>>>> - protocol (http or https)
>>>>>>> - port (optional, can be inferred from protocol)
>>>>>>> - location (optional FK to Location table)
>>>>>>> - deliveryservice FK (if an Origin can only be associated with a
>>>>>>> single DS. Might need step 3 below for many-to-many)
>>>>>>> - ip_address (optional, necessary to support `use_ip_address` profile
>>>>>>> parameter for using the origin's IP address rather than fqdn in
>>>>>>> parent.config)
>>>>>>> - ip6_address (optional, necessary because we'd have an ip_address
>>>>>>> column for the same reasons)
>>>>>>> - profile (optional, primarily for MSO-specific parameters - rank and
>>>>>>> weight - but I could be convinced that this is unnecessary)
>>>>>>> - cachegroup (optional, necessary to maintain primary/secondary
>>>>>>> relationship between MID_LOC and ORG_LOC cachegroups for MSO)
>>>>>>>
>>>>>>> 3. If many-to-many DSes to Origins will still be possible, create a
>>>>>>> new deliveryservice_origin table to support a many-to-many
>>>>>>> relationship between DSes and origins
>>>>>>> - the rank/weight fields for MSO could be added here possibly, maybe
>>>>>>> other things as well?
>>>>>>>
>>>>>>> 4. Consider constraints in the origin and deliveryservice_origin table
>>>>>>> - must fqdn alone be unique? fqdn, protocol, and port combined?
>>>>>>>
>>>>>>> The process for creating a Delivery Service would change in that
>>>>>>> Origins would have to be created separately and added to the delivery
>>>>>>> service. However, to aid migration to the new way of doing things, our
>>>>>>> UIs could keep the "Origin FQDN" field but the API backend would then
>>>>>>> create a new row in the Origin table and add it to the DS. More
>>>>>>> Origins could then be added (for MSO purposes) to the DS via a new API
>>>>>>> endpoint.
>>>>>>> MSO configuration would change at least in how Origins are
>>>>>>> assigned to a DS ("server assignments" would then just be for
>>>>>>> EDGE-type servers).
>>>>>>>
>>>>>>> Cachegroup creation also changes in that Locations need to be created
>>>>>>> before associating them to a Cachegroup. However, our UIs could also
>>>>>>> stay the same with the backend API updated to create a Location from
>>>>>>> the Cachegroup request and tie it to the Cachegroup.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I know there are a lot of backend and frontend implications with these
>>>>>>> changes that would still need to be worked out, but in general does
>>>>>>> this proposal sound good? Questions/concerns/feedback welcome and
>>>>>>> appreciated!
>>>>>>>
>>>>>>> - Rawlin
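
For anyone skimming the thread, here is a rough Python sketch of the
geo-ordering discussed above: steering targets sorted by total distance
along client -> edge -> origin, using Location and Origin records shaped
like the proposed tables. It is only an illustration, not Traffic Router
code. All class and function names are hypothetical, only a subset of the
proposed Origin columns is modeled, the great-circle (haversine) distance
is an assumption (the thread only says "total distance"), and sorting
targets with no origin location last is also an assumption.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Location:
    """Shaped like the proposed Location table: name, latitude, longitude."""
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Origin:
    """Subset of the proposed Origin table columns (hypothetical model)."""
    fqdn: str
    protocol: str = "http"                # "http" or "https"
    port: Optional[int] = None            # optional, inferred from protocol
    location: Optional[Location] = None   # optional FK to Location

    def effective_port(self) -> int:
        if self.port is not None:
            return self.port
        return 443 if self.protocol == "https" else 80


@dataclass(frozen=True)
class SteeringTarget:
    """Hypothetical: a target DS plus the edge location chosen for the client."""
    deliveryservice: str
    edge: Location
    origin: Origin


def haversine_km(a: Location, b: Location) -> float:
    """Great-circle distance between two locations, in kilometers."""
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def total_distance_km(client: Location, target: SteeringTarget) -> float:
    """Distance along the path client -> edge -> origin."""
    if target.origin.location is None:
        return float("inf")  # assumption: targets without an origin location sort last
    return haversine_km(client, target.edge) + haversine_km(target.edge, target.origin.location)


def geo_order(client: Location, targets: List[SteeringTarget]) -> List[SteeringTarget]:
    """Return targets sorted by total distance; sorted() is stable, so ties keep input order."""
    return sorted(targets, key=lambda t: total_distance_km(client, t))


# Example: a Boston client served from the Boston edge for both targets.
boston = Location("boston", 42.36, -71.06)
new_york = Location("new-york", 40.71, -74.01)
denver = Location("denver", 39.74, -104.99)

targets = [
    SteeringTarget("DiscoveryB", boston, Origin("b.origin.example.com", location=denver)),
    SteeringTarget("DiscoveryA", boston, Origin("a.origin.example.com", location=new_york)),
]
ordered = geo_order(boston, targets)
print([t.deliveryservice for t in ordered])  # DiscoveryA first: its origin is closer to the edge
```

Because both targets share the Boston edge in this example, the ordering
is decided entirely by the edge -> origin leg, which matches the
HTTP_LIVE/HTTP_NO_CACHE case discussed earlier in the thread.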