Jeff and I had a quick Slack convo, so I’ll add a followup summary here in case 
anyone else is interested.

Cache Group location (lat/long) is configured in Traffic Ops today (and is used 
for computing distance from the MaxMind geolocation result). 

You can also configure the location (lat/long) for a Cache Group in the 
CoverageZone file (example below). 

When this location is configured (and Jeff’s suggested logic fix from below is 
applied) and all caches in the mapped cache group are unavailable, TR will send 
a client request to the cache group that is closest to the original mapped 
group. 
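
To make the fallback concrete, here is a minimal Python sketch of the idea. It 
is not the Traffic Router code (which is Java); CacheGroup, haversine_km and 
closest_available_group are hypothetical names, and great-circle distance is 
my assumption for how "closest" is measured.

```python
import math
from dataclasses import dataclass


@dataclass
class CacheGroup:
    name: str
    latitude: float
    longitude: float
    available_caches: int  # caches currently usable for the delivery service


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_available_group(mapped, groups):
    """Order groups by distance from the mapped (CZF hit) group and return
    the first one with available caches, or None if there is none."""
    ordered = sorted(
        groups,
        key=lambda g: haversine_km(mapped.latitude, mapped.longitude,
                                   g.latitude, g.longitude),
    )
    for group in ordered:
        if group.available_caches > 0:
            return group
    return None
```

The mapped group sorts first (distance 0), so it is still used whenever it has 
available caches; the walk only moves outward when it does not.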

Example CZF w/ cache location
-----
"coverageZones": {
    "edge-cg-1": {
      "network6": [
        ...
      ],
      "network": [
        ...
      ],
      "coordinates": {
        "longitude": "-75.3342",
        "latitude": "42.555"
      }
    },
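
For anyone scripting against the CZF, a small Python sketch of pulling the 
per-cache-group coordinates out of a file shaped like the fragment above. The 
function name cache_group_locations is hypothetical, the sample document uses 
empty network lists as placeholders, and the string-to-float conversion 
assumes the coordinates are quoted strings as in the example.

```python
import json


def cache_group_locations(czf_text):
    """Return {cache group name: (latitude, longitude)} for every coverage
    zone that carries a "coordinates" object; zones without one are skipped."""
    czf = json.loads(czf_text)
    locations = {}
    for name, zone in czf.get("coverageZones", {}).items():
        coords = zone.get("coordinates")
        if coords is None:
            continue
        # Coordinates are quoted strings in the example, so convert them.
        locations[name] = (float(coords["latitude"]),
                           float(coords["longitude"]))
    return locations


# Placeholder document: network lists left empty, second group has no location.
sample_czf = """
{
  "coverageZones": {
    "edge-cg-1": {
      "network6": [],
      "network": [],
      "coordinates": {"longitude": "-75.3342", "latitude": "42.555"}
    },
    "edge-cg-2": {
      "network6": [],
      "network": []
    }
  }
}
"""

print(cache_group_locations(sample_czf))
```

Groups without a "coordinates" entry are simply left out of the result, so a 
caller can fall back to the location configured in Traffic Ops for those.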


—Eric


> On Jan 5, 2017, at 12:06 PM, Jeff Elsloo <[email protected]> wrote:
> 
> If we applied the proposed change, given your scenario we should fall
> through to the return statement that calls getClosestCacheLocation().
> That method will order all cache groups based on their lat/long and
> the lat/long of the cache group we hit on in the CZF. Once the list is
> ordered, we iterate through the list until we find a cache group that
> has available caches for that DS.
> 
> BTW, the stuff on line 536 is likely to produce the exact same result
> as the check that precedes it. networkNode.getLoc() will return the
> string name of the cache group, so when we find the CacheLocation, it
> will be the same as what we had just checked. We could probably get
> away with removing that part of the method as it's redundant.
> --
> Thanks,
> Jeff
> 
> 
> On Wed, Jan 4, 2017 at 11:54 AM, Eric Friedrich (efriedri)
> <[email protected]> wrote:
>> Where would TR look outside the assigned cache group to find the next 
>> closest cache group?
>> 
>>> On Jan 4, 2017, at 11:25 AM, Eric Friedrich (efriedri) <[email protected]> 
>>> wrote:
>>> 
>>> 
>>> On Jan 3, 2017, at 5:20 PM, Jeff Elsloo 
>>> <[email protected]<mailto:[email protected]>> wrote:
>>> 
>>> Hey Eric,
>>> 
>>> It sounds like the use case you're after is an RFC 1918 client
>>> associated with a cache group whose caches are all unavailable for one
>>> reason or another. Is that correct?
>>> Yes, exactly.
>>> 
>>> 
>>> I looked at the code a bit, and I think that we can make a minor
>>> change to achieve the behavior you're looking for as long as you're
>>> able to put your RFC 1918 ranges in the CZF.
>>> Yes, we would want those ranges in the CZF. I can’t think of any other 
>>> place they would go.
>>> 
>>> 
>>> There's a small logic gap in the existing algorithm around cache
>>> location selection and I think if we fix that (two line change), we
>>> should be better off all around. I think the only time we'd ever want
>>> to go to the geolocation provider is in the event of a miss on the
>>> CZF, so as long as we have a hit there, we should find the cache group
>>> closest to that hit location that has available caches. This would
>>> automatically provide the "backup" cache group concept, and has the
>>> added benefit of doing this selection dynamically based on the state
>>> of the CDN.
>>> Wow, thanks for picking up on this solution. Sounds like a strong 
>>> possibility. I like that it can extend dynamically.
>>> 
>>> 
>>> 
>>> See this to get an idea of what I mean: http://apaste.info/u3PQo
>>> https://github.com/apache/incubator-trafficcontrol/blob/249bd7504eeb7cc43402126f3719017e2475ad33/traffic_router/core/src/main/java/com/comcast/cdn/traffic_control/traffic_router/core/router/TrafficRouter.java#L536
>>> Does this line set cacheLocation to the closest cache group with active 
>>> caches on that DS?
>>> 
>>> What does networkNode.getLoc() actually return?
>>> 
>>> —Eric
>>> 
>>> 
>>> 
>>> Obviously we'd need to test this to ensure we don't break other 
>>> functionality.
>>> --
>>> Thanks,
>>> Jeff
>>> 
>>> 
>>> On Tue, Jan 3, 2017 at 10:07 AM, Eric Friedrich (efriedri)
>>> <[email protected]<mailto:[email protected]>> wrote:
>>> If all caches in the primary cache group are unavailable, our goal is to 
>>> provide a backup routing policy for RFC1918 clients.
>>> 
>>> When client IP is a public Internet IP, the current backup policy is to 
>>> assign the client to the geographically closest cache (Distance = MaxMind 
>>> Geo Lat/Long - configured CG lat/long).
>>> 
>>> When client IP is an RFC1918 IP, the client would not have a MaxMind 
>>> geo-loc, so would fall back to the DS geo-miss lat/long. We’d prefer some 
>>> more granular control over where these clients are routed to, rather than a 
>>> per-DS setting.
>>> 
>>> 
>>> So with an RFC1918 client, the lookup process would be (step 3 is the only 
>>> addition):
>>> 1) Check CZF for a subnet match (and find a match for existing cache 
>>> group). Assign client to CG
>>> 2) Check CG for available (online and associated w/ DS) servers. In this 
>>> particular case, assume CG has no servers available to route the client to
>>> 3) Walk the CZF's list of backup CGs and perform the check from #2 for each 
>>> CG. Use first server that is found
>>> 4) Assuming no server is found in #3, perform geo-location and find closest 
>>> cache group. Use a server from the closest CG if one is found
>>> 4a) If geo-location returns null, use the DS’ default geo-miss location as 
>>> the client location.
>>> 
>>> —Eric
>>> 
>>> 
>>> On Dec 26, 2016, at 10:01 AM, Jan van Doorn 
>>> <[email protected]<mailto:[email protected]>> wrote:
>>> 
>>> Hi Eric,
>>> 
>>> How does the backup list relate to the RFC1918-is-not-in-geo problem?
>>> 
>>> To get to a cachegroup you need to get a match in the coverage zone, I 
>>> would think?
>>> 
>>> Rgds,
>>> JvD
>>> 
>>> On Dec 22, 2016, at 12:28, Eric Friedrich (efriedri) 
>>> <[email protected]<mailto:[email protected]>> wrote:
>>> 
>>> The current behavior of cache group selection works as follows
>>> 1) Look for a subnet match in CZF
>>> 2) Use MaxMind/Neustar for GeoLocation based on client IP. Choose closest 
>>> cache group.
>>> 3) Use Delivery Service Geo-Miss Lat/Long. Choose closest cache group.
>>> 
>>> 
>>> For deployments where IP addressing is primarily private (say RFC-1918 
>>> addresses), client IP Geo Location (#2) is not useful.
>>> 
>>> 
>>> We are considering adding another field to the Coverage Zone File that 
>>> configures an ordered list of backup cache groups to try if the primary 
>>> cache group does not have any available caches.
>>> 
>>> Example:
>>> 
>>> "coverageZones": {
>>> "cache-group-01": {
>>> "backupList": ["cache-group-02", "cache-group-03"],
>>> "network6": [
>>>  "1234:5678::\/64",
>>>  "1234:5679::\/64"],
>>> "network": [
>>>  "192.168.8.0\/24",
>>>  "192.168.9.0\/24"]
>>> }
>>> 
>>> This configuration could also be part of the per-cache group configuration, 
>>> but that would give less control over which clients preferred which cache 
>>> groups. For example, you may have cache groups in LA, Chicago and NY. If 
>>> the Chicago Cache group fails, you may want some of the Chicago clients to 
>>> go to LA and some to go to NY. If the backup CG configuration is per-cg, we 
>>> would not be able to control where clients are allocated.
>>> 
>>> Looking for opinions and comments on the above proposal; this is still in 
>>> the idea stage.
>>> 
>>> Thanks All!
>>> Eric
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
