Replies inline

On Wed, Mar 14, 2018 at 10:09 AM, Eric Friedrich (efriedri)
<efrie...@cisco.com> wrote:
> How much does distance to origin actually impact a Live delivery service?
>
>   It’s only the first client for each segment that has to go back to the
> origin; the rest of the responses are cached, so they don’t touch the origin
> at all.

Good point, maybe this has a greater benefit for HTTP_NO_CACHE DSes,
but lower latency is always a good thing.

>
>
> What’s the use case for HTTP_NO_CACHE with Client Steering? Is there a reason
> plain-old MultiSiteOrigin wouldn’t work for you on the NO_CACHE delivery
> services?

We can't currently use MSO for DS-types that bypass the mid tier. Even
if we did get that to work though, we wouldn't get the goodness of
CLIENT_STEERING in being able to enforce specific ordering/weighting
(or a mix of both) between targets on the fly for things like
maintenance, capacity differences, beta testing, etc.

- Rawlin

>
> —Eric
>
>
>
>
>
>> On Mar 14, 2018, at 11:45 AM, Rawlin Peters <rawlin.pet...@gmail.com> wrote:
>>
>> Yes, I'd say that's essentially the main goal - prioritizing redundant
>> HTTP_LIVE/HTTP_NO_CACHE deliveryservices to have the shortest distance
>> between the edge and the origin. HTTP-type deliveryservices that use
>> the MID tier, though, don't really make sense for this feature, so we
>> might want to limit geo-steering targets to just
>> HTTP_LIVE/HTTP_NO_CACHE.
>>
>> On Wed, Mar 14, 2018 at 8:56 AM, Eric Friedrich (efriedri)
>> <efrie...@cisco.com> wrote:
>>> I understand the goals behind Client Steering Delivery Services, but I 
>>> don’t fully understand the motivation behind these changes.
>>>
>>> We want redundancy in our live delivery services.  Take a national channel 
>>> and make two copies of it on two origins. Clients can choose DiscoveryA or 
>>> DiscoveryB based on which one is working better (or at all) for them. All 
>>> CGs would need both Delivery Services assigned to account for failures 
>>> where caches holding either Just A or Just B might go offline.
>>>
>>> OriginA and OriginB would typically be in different locations for the most 
>>> redundancy.
>>>
>>> If I’m a client in Boston and I ask for Discovery channel, TR should give 
>>> me redirects to DiscoveryA and DiscoveryB  both in the Boston cache group.
>>>
>>> Is the goal behind this feature for TR to prioritize the DS list given to 
>>> the client based on how far origin A is from Boston vs. how far B is from 
>>> Boston?
>>>
>>> —Eric
>>>
>>>
>>>
>>>
>>>
>>>
>>>> On Mar 13, 2018, at 1:42 PM, Rawlin Peters <rawlin.pet...@gmail.com> wrote:
>>>>
>>>> replies inline
>>>>
>>>> On Mon, Mar 12, 2018 at 5:21 PM, Nir Sopher <n...@qwilt.com> wrote:
>>>>> Thank you Rawlin for the clarification:)
>>>>
>>>> You're welcome. Anything I can do to help :)
>>>>
>>>>>
>>>>> Still, I feel like I'm missing a piece of the puzzle here.
>>>>> Maybe I do not understand the relation between "origin" and "steering target".
>>>>>
>>>>> As I see it, the router's job is to send end users to the optimal cache. It
>>>>> has 2 tools for doing so: CZF and Geo.
>>>>> Using the CZF is preferable, as it is based on the real network topology.
>>>>> Geo is a best-effort solution, used when we cannot do better. It is not
>>>>> necessarily optimal, and has GEO misses, but we must use it since we
>>>>> cannot map all IPs.
>>>>
>>>>
>>>> Yes, the client's location will be found from the CZF first, falling
>>>> back to GEO upon a CZF-miss. Then the most optimal edge cachegroup is
>>>> chosen for each steering target deliveryservice. Then, the resulting
>>>> list of target deliveryservices will be sorted by total distance
>>>> following the path from client -> edge -> origin.
>>>>
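A minimal sketch of the ordering described above, not Traffic Router's actual code. It assumes great-circle (haversine) distance as the distance measure, which the thread does not specify, and it assumes the optimal edge cachegroup has already been chosen for each target. The names `Point`, `Target`, `haversine_km`, and `order_targets` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    lat: float  # degrees
    lon: float  # degrees


@dataclass(frozen=True)
class Target:
    name: str      # steering target deliveryservice (hypothetical name field)
    edge: Point    # edge cachegroup already selected for this target
    origin: Point  # origin lat/long associated with the target


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance in km (assumed metric, mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def order_targets(client: Point, targets: list[Target]) -> list[Target]:
    """Sort targets by total path length client -> edge -> origin, shortest first."""
    return sorted(
        targets,
        key=lambda t: haversine_km(client, t.edge) + haversine_km(t.edge, t.origin),
    )
```

With a client and both edges in Boston, a target whose origin is nearer to Boston sorts ahead of one whose origin is farther away.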
>>>>>
>>>>> The cache's job is to fetch the content and serve the user.
>>>>> It can be optimized to bring the content from the optimal Origin. It can
>>>>> be configured to do so by specifying the best origin per cache group (in
>>>>> ops DB).
>>>>
>>>>
>>>> This is intentionally done as a CLIENT_STEERING deliveryservice so
>>>> that a smart client can make the decision to use a different
>>>> deliveryservice upon failure. If this decision was made at the caching
>>>> proxy level, it would end up being like an optimized version of MSO
>>>> (multi-site origin) where the client only has a single URL to request
>>>> and the most optimal origin of multiple origins is chosen by the
>>>> caching proxy. I don't think that's a bad idea; it's just not the
>>>> architecture we want for this. By doing it as client steering we can
>>>> also assign weights/ordering between colocated origins and update
>>>> those steering assignments at any time. We can form the steering
>>>> target list very flexibly this way.
>>>>
>>>>
>>>>> I might be naive here, but as the number of cache groups is reasonable,
>>>>> and their network location is much clearer than the end user location, the
>>>>> mapping and configuration would be reasonable. Therefore, using
>>>>> sub-optimal Geo as a tool for choosing the Origin can be avoided.
>>>>
>>>>
>>>> In practice, you could set the coordinates of the Origin to that of
>>>> the most optimal cachegroup, rather than assigning the Origin directly
>>>> to said cachegroup. The effect would be the same I believe.
>>>>
>>>>>
>>>>> I also did not understand if the suggestion is to use the client location
>>>>> for choosing the origin, or the cache group location for choosing the
>>>>> origin.
>>>>> Using the client location for choosing the origin practically ignores the
>>>>> accurate information provided by the CZF.
>>>>
>>>>
>>>> It's a combination of the client location, the edge location, and the
>>>> origin location (total distance from client -> edge -> origin).
>>>>
>>>>>
>>>>> What am I missing?
>>>>> 10x
>>>>> Nir
>>>>>
>>>>> On Mon, Mar 12, 2018 at 11:19 PM, Rawlin Peters <rawlin.pet...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Nir,
>>>>>>
>>>>>> I think part of the motivation for doing this in Traffic Router rather
>>>>>> than the Caching Proxy is separation of concerns. TR is already
>>>>>> concerned with routing a client to the best cache based upon the
>>>>>> client's location, so TR is already well-equipped to make the decision
>>>>>> of how Delivery Services (origins) should be prioritized based upon
>>>>>> the client's location. That way the Caching Proxy (e.g. ATS) doesn't
>>>>>> need to concern itself with its own location, the client's location,
>>>>>> and the location of origins; it just needs to know how to get the
>>>>>> origin's content and cache it. All the client needs to know is that
>>>>>> they have a prioritized list of URLs to choose from; they don't need
>>>>>> to be concerned about origin/edge locations because that
>>>>>> prioritization will be made for them by TR.
>>>>>>
>>>>>> The target DSes will have different origins primarily because they
>>>>>> will be in different locations, and the origins should be
>>>>>> interchangeable in terms of the content they provide because a smart
>>>>>> client may fail over to any of the target DSes in a CLIENT_STEERING DS
>>>>>> for the same content.
>>>>>>
>>>>>> - Rawlin
>>>>>>
>>>>>> On Mon, Mar 12, 2018 at 2:37 PM, Nir Sopher <n...@qwilt.com> wrote:
>>>>>>> Hi Rawlin,
>>>>>>> Can you please add a few words on the motivation behind basing the
>>>>>>> steering target selection on the location of the client?
>>>>>>> As the content goes through the caches, isn't it the job of the cache to
>>>>>>> select the best origin for the cache?  Why should the client be the one
>>>>>>> to take the origin location into consideration?
>>>>>>> Why do the target DSes have different origins in the first place? Do they
>>>>>>> have different characteristics in addition to their location?
>>>>>>> Thanks,
>>>>>>> Nir
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Rawlin Peters <rawlin.pet...@gmail.com>
>>>>>>> Date: Mon, Mar 12, 2018 at 9:46 PM
>>>>>>> Subject: Delivery Service Origin Refactor
>>>>>>> To: dev@trafficcontrol.incubator.apache.org
>>>>>>>
>>>>>>>
>>>>>>> Hey folks,
>>>>>>>
>>>>>>> As promised, this email thread will be to discuss how to best
>>>>>>> associate an Origin Latitude/Longitude with a Delivery Service,
>>>>>>> primarily so that steering targets can be ordered/sent to the client
>>>>>>> based upon the location of those targets (i.e. the Origin), a.k.a.
>>>>>>> Steering Target Geo-Ordering. This is potentially going to be a pretty
>>>>>>> large change, so all your feedback/questions/concerns are appreciated.
>>>>>>>
>>>>>>> Here were a handful of bad ideas I had in order to accomplish this DS
>>>>>>> Origin Lat/Long association (feel free to skip to PROPOSED SOLUTION
>>>>>>> below):
>>>>>>>
>>>>>>> 1. Reuse the current MSO (multisite origin) backend (i.e. add the
>>>>>>> origin into the servers table, give it a lat/long from its cachegroup,
>>>>>>> assign the origin server to the DS)
>>>>>>> Pros:
>>>>>>> - reuse of existing db schema, probably wouldn't have to add any new
>>>>>>> tables/columns
>>>>>>> Cons:
>>>>>>> - MSO configuration is already very complex
>>>>>>> - for the simple case of just wanting to give an Origin a lat/long you
>>>>>>> have to create a server (of which only a few fields make sense for an
>>>>>>> Origin), add it to a cachegroup (only name and lat/long make sense,
>>>>>>> won't use parent relationships, isn't really a "group" of origins),
>>>>>>> assign it to a server profile (have to create one first, no parameters
>>>>>>> are needed), and finally assign that Origin server to the delivery
>>>>>>> service (did I miss anything?)
>>>>>>>
>>>>>>> 2. Add Origin lat/long columns to the deliveryservice table
>>>>>>> Pros:
>>>>>>> - probably the most straightforward solution for Steering Target
>>>>>>> Geo-Ordering given that Origin FQDN is currently a DS field.
>>>>>>> Cons:
>>>>>>> - doesn't work well with MSO
>>>>>>> - could be confused with Default Miss Lat/Long
>>>>>>> - if two different delivery services use colocated origins, the same
>>>>>>> lat/long needs to be entered twice
>>>>>>> - adds yet another column to the crowded deliveryservice table
>>>>>>>
>>>>>>> 3. Add origin lat/long parameters to a Delivery Service Profile
>>>>>>> Pros:
>>>>>>> - Delivery Services using colocated origins could share the same profile
>>>>>>> - no DB schema updates needed
>>>>>>> Cons:
>>>>>>> - profile parameters lack validation
>>>>>>> - still doesn't support lat/long for multiple origins associated with a
>>>>>>> DS
>>>>>>>
>>>>>>> 4. Add the lat/long to the steering target itself (i.e. where you
>>>>>>> choose weight/order, you'd also enter lat/long)
>>>>>>> Pros:
>>>>>>> - probably the easiest/quickest solution in terms of development
>>>>>>> Cons:
>>>>>>> - only applies lat/long to a steering target
>>>>>>> - using the same target in multiple Steering DSes means having to keep
>>>>>>> the lat/long synced between them all
>>>>>>> - lat/long not easily reused by other areas that may need it in the
>>>>>>> future
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> PROPOSED SOLUTION:
>>>>>>>
>>>>>>> All of those ideas were suboptimal, which is why I think we need to:
>>>>>>> 1. Split Locations out of the cachegroup table into their own table
>>>>>>> with the following columns (cachegroup would have a foreign key to
>>>>>>> Location):
>>>>>>> - name
>>>>>>> - latitude
>>>>>>> - longitude
>>>>>>>
>>>>>>> 2. Split Origins out of the server and deliveryservice tables into
>>>>>>> their own table with the following columns:
>>>>>>> - fqdn
>>>>>>> - protocol (http or https)
>>>>>>> - port (optional, can be inferred from protocol)
>>>>>>> - location (optional FK to Location table)
>>>>>>> - deliveryservice FK (if an Origin can only be associated with a
>>>>>>> single DS. Might need step 3 below for many-to-many)
>>>>>>> - ip_address (optional, necessary to support `use_ip_address` profile
>>>>>>> parameter for using the origin's IP address rather than fqdn in
>>>>>>> parent.config)
>>>>>>> - ip6_address (optional, necessary because we'd have an ip_address
>>>>>>> column for the same reasons)
>>>>>>> - profile (optional, primarily for MSO-specific parameters - rank and
>>>>>>> weight - but I could be convinced that this is unnecessary)
>>>>>>> - cachegroup (optional, necessary to maintain primary/secondary
>>>>>>> relationship between MID_LOC and ORG_LOC cachegroups for MSO)
>>>>>>>
>>>>>>> 3. If many-to-many DSes to Origins will still be possible, create a
>>>>>>> new deliveryservice_origin table to support a many-to-many
>>>>>>> relationship between DSes and origins
>>>>>>> - the rank/weight fields for MSO could be added here possibly, maybe
>>>>>>> other things as well?
>>>>>>>
>>>>>>> 4. Consider constraints in the origin and deliveryservice_origin tables
>>>>>>> - must fqdn alone be unique? fqdn, protocol, and port combined?
>>>>>>>
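A rough sketch of how the proposed Location and Origin rows might be modeled, covering only a subset of the columns listed above. It is an illustration under stated assumptions, not the actual schema. The names `Location`, `Origin`, `effective_port`, and `uniqueness_key` are hypothetical; the default ports (80 for http, 443 for https) are the standard ones; and the combined fqdn/protocol/port key is just one of the constraint options raised in step 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Origin:
    fqdn: str
    protocol: str                         # "http" or "https"
    port: Optional[int] = None            # optional, inferred from protocol
    location: Optional[Location] = None   # optional FK to Location
    ip_address: Optional[str] = None
    ip6_address: Optional[str] = None


# Standard default ports, used when no explicit port is stored.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(origin: Origin) -> int:
    """Return the explicit port, or the protocol's default if none is set."""
    if origin.port is not None:
        return origin.port
    return DEFAULT_PORTS[origin.protocol]


def uniqueness_key(origin: Origin) -> tuple[str, str, int]:
    """One candidate constraint: fqdn, protocol, and port combined."""
    return (origin.fqdn.lower(), origin.protocol, effective_port(origin))
```

Under this candidate key, an Origin with no port and protocol https collides with the same fqdn stored with an explicit port of 443, while the same fqdn over http does not.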
>>>>>>> The process for creating a Delivery Service would change in that
>>>>>>> Origins would have to be created separately and added to the delivery
>>>>>>> service. However, to aid migration to the new way of doing things, our
>>>>>>> UIs could keep the "Origin FQDN" field but the API backend would then
>>>>>>> create a new row in the Origin table and add it to the DS. More
>>>>>>> Origins could then be added (for MSO purposes) to the DS via a new API
>>>>>>> endpoint. MSO configuration would change at least in how Origins are
>>>>>>> assigned to a DS ("server assignments" would then just be for
>>>>>>> EDGE-type servers).
>>>>>>>
>>>>>>> Cachegroup creation also changes in that Locations need to be created
>>>>>>> before associating them to a Cachegroup. However, our UIs could also
>>>>>>> stay the same with the backend API updated to create a Location from
>>>>>>> the Cachegroup request and tie it to the Cachegroup.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I know there are a lot of backend and frontend implications with these
>>>>>>> changes that would still need to be worked out, but in general does
>>>>>>> this proposal sound good? Questions/concerns/feedback welcome and
>>>>>>> appreciated!
>>>>>>>
>>>>>>> - Rawlin
>>>>>>
>>>
>
