mitchell852 commented on a change in pull request #4537: Add blueprint for 
Flexible Topologies
URL: https://github.com/apache/trafficcontrol/pull/4537#discussion_r398202529
 
 

 ##########
 File path: blueprints/flexible-topologies.md
 ##########
 @@ -0,0 +1,395 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Flexible Topologies
+
+## Problem Description
+
+Today, a Traffic Control CDN is limited to 2 tiers -- *EDGE* and *MID* -- with 
the option to skip the *MID* tier for certain Delivery Service types (e.g. 
`HTTP_LIVE` and `HTTP_NO_CACHE`). In addition, a CDN is limited to one global 
parent hierarchy, which is defined via the `parent_cachegroup` and 
`secondary_parent_cachegroup` fields of cachegroups. Both of these problems 
limit a CDN's ability to scale with increased demand and changing usage 
patterns, and providing the ability to add more tiers to a CDN helps it keep up 
with that growth. A Topology that works well for one set of Delivery Services 
might not be ideal for a different set of Delivery Services, and a CDN needs 
the flexibility to provide the best Topology for any given Delivery Service -- 
with any number of tiers and custom caching hierarchies.
+
+## Proposed Change
+
+Traffic Control will provide the ability to define one or more Topologies, and 
a Topology can have any number of Delivery Services assigned to it. A Topology 
will be composed of Cachegroups along with their primary/secondary parent 
relationships to other Cachegroups as defined by the Topology.
+
+If a Delivery Service is assigned to a Topology, any `deliveryservice_server` 
assignments it has to `EDGE` caches will be ignored, because it will be 
assigned to all caches in the Delivery Service's CDN (filtered by server 
capabilities) that belong to the Topology's cachegroups. Ideally, this feature 
will obsolete legacy `deliveryservice_server` assignments, since Topologies 
negate the need to assign Delivery Services to individual `EDGE` caches. 
Nonetheless, legacy `deliveryservice_server` assignments will be supported 
alongside Topology-based Delivery Services for some time until all Delivery 
Services have been migrated to Topologies.
+
+### Traffic Portal Impact
+
+Traffic Portal will need new pages for creating and viewing Topologies, and 
the Delivery Service form will need to be updated to add a new Topology field 
for assigning a Delivery Service to a Topology. If a Delivery Service is 
assigned to a Topology, Traffic Portal should prohibit assigning `EDGE` servers 
to the Delivery Service (`ORIGIN` servers may still need to be assignable for 
MSO).
+
+Since Delivery Services will no longer be constrained to one global Topology 
as they are today, it would be extremely useful to be able to visualize a 
Delivery Service's Topology like a tree, where each node in the tree is a 
cachegroup, and the edges between nodes are the primary/secondary parent 
relationships between them. Clicking on a particular node would show all the 
servers in that cachegroup that could serve a request for the Delivery Service. 
This visualization will most likely be different from the Topology form for 
creating a Topology and does not necessarily need to be provided by Traffic 
Portal.
+
+### Traffic Ops Impact
+
+Traffic Ops will provide the ability to create Topologies, composed of 
cachegroups and parent relationships, which will be assignable to one or more 
Delivery Services.
+
+#### REST API Impact
+
+The following is the JSON representation of a `Topology` object:
+
+```JSON
+{
+    "name": "foo",
+    "description": "a foo topology",
+    "nodes": [
+        {
+            "cachegroup": "child-cachegroup",
+            "parents": [1, 2]
+        },
+        {
+            "cachegroup": "parent-cachegroup",
+            "parents": []
+        },
+        {
+            "cachegroup": "secondary-parent-cachegroup",
+            "parents": []
+        }
+    ]
+}
+```
+
+The following table describes the top-level `Topology` object:
+
+| field       | type                        | optionality | description                                                         |
+| ----------- | --------------------------- | ----------- | ------------------------------------------------------------------- |
+| name        | string                      | required    | a unique name for identifying this Topology                         |
+| description | string                      | required    | the description of this Topology                                    |
+| nodes       | array of `node` sub-objects | required    | the set of `nodes` in this topology, similar to an *adjacency list* |
+
+The following table describes the `node` sub-object:
+
+| field      | type              | optionality | description |
+| ---------- | ----------------- | ----------- | ----------- |
+| cachegroup | string            | required    | the `short_name` of a cachegroup this node maps to in the Topology |
+| parents    | array of integers | required    | zero-based indexes to other nodes in the Topology's `nodes` array, where the 1st element is for the *primary* parent relationship and the 2nd element is for the *secondary* parent relationship, and so on |
+
+API constraints:
+- a Topology must have at least 1 `node`; otherwise, it is useless
+- there cannot be multiple `nodes` for the same cachegroup in a Topology
+- `parents` must have 0, 1 or 2 elements, cannot contain duplicates, cannot 
contain the index of its own `node`, and cannot contain the index of `nodes` 
whose cachegroup is of type `EDGE_LOC`
+- leaf `nodes` must be cachegroups of type `EDGE_LOC`
+- all `nodes` in the Topology must be reachable -- i.e. a `node` is either a 
leaf (which would be an `EDGE_LOC`) or is a parent of at least one other node
+- a Topology cannot contain a cycle (through any combination of 
primary/secondary parent relationships)
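
The constraints above can be sketched as a single validation pass followed by a depth-first cycle check. This is a minimal sketch, not the Traffic Ops implementation: the Go type names, `validate`, `checkAcyclic`, and the `cgTypes` lookup (cachegroup name to cachegroup type) are hypothetical, `MID_LOC` is used only as an example parent type, and the out-of-range index check is an added safety assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// TopologyNode and Topology mirror the JSON above; the Go names are assumed.
type TopologyNode struct {
	Cachegroup string
	Parents    []int
}

type Topology struct {
	Name  string
	Nodes []TopologyNode
}

// validate checks the constraints listed above. cgTypes is a hypothetical
// lookup from cachegroup name to cachegroup type.
func validate(topo Topology, cgTypes map[string]string) error {
	if len(topo.Nodes) == 0 {
		return errors.New("a topology must have at least 1 node")
	}
	seen := map[string]bool{}
	isParent := make([]bool, len(topo.Nodes))
	for i, n := range topo.Nodes {
		if seen[n.Cachegroup] {
			return fmt.Errorf("cachegroup %q appears more than once", n.Cachegroup)
		}
		seen[n.Cachegroup] = true
		if len(n.Parents) > 2 {
			return fmt.Errorf("node %d has more than 2 parents", i)
		}
		for rank, p := range n.Parents {
			switch {
			case p < 0 || p >= len(topo.Nodes): // safety check, not a listed constraint
				return fmt.Errorf("node %d: parent index %d is out of range", i, p)
			case p == i:
				return fmt.Errorf("node %d lists itself as a parent", i)
			case rank == 1 && n.Parents[0] == p:
				return fmt.Errorf("node %d lists parent %d twice", i, p)
			case cgTypes[topo.Nodes[p].Cachegroup] == "EDGE_LOC":
				return fmt.Errorf("node %d: parent %d is an EDGE_LOC", i, p)
			}
			isParent[p] = true
		}
	}
	// A node that is nobody's parent is a leaf, and leaves must be EDGE_LOC.
	// This also covers the reachability rule.
	for i, n := range topo.Nodes {
		if !isParent[i] && cgTypes[n.Cachegroup] != "EDGE_LOC" {
			return fmt.Errorf("leaf node %d is not an EDGE_LOC", i)
		}
	}
	return checkAcyclic(topo)
}

// checkAcyclic walks parent links depth-first; meeting a node that is still
// on the current path means the parent relationships form a cycle.
func checkAcyclic(topo Topology) error {
	state := make([]int, len(topo.Nodes)) // 0 = unvisited, 1 = on path, 2 = done
	var visit func(i int) error
	visit = func(i int) error {
		if state[i] == 1 {
			return fmt.Errorf("cycle through node %d", i)
		}
		if state[i] == 2 {
			return nil
		}
		state[i] = 1
		for _, p := range topo.Nodes[i].Parents {
			if err := visit(p); err != nil {
				return err
			}
		}
		state[i] = 2
		return nil
	}
	for i := range topo.Nodes {
		if err := visit(i); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	topo := Topology{Name: "foo", Nodes: []TopologyNode{
		{Cachegroup: "child-cachegroup", Parents: []int{1, 2}},
		{Cachegroup: "parent-cachegroup"},
		{Cachegroup: "secondary-parent-cachegroup"},
	}}
	cgTypes := map[string]string{
		"child-cachegroup":            "EDGE_LOC",
		"parent-cachegroup":           "MID_LOC",
		"secondary-parent-cachegroup": "MID_LOC",
	}
	if err := validate(topo, cgTypes); err != nil {
		fmt.Println("invalid topology:", err)
		return
	}
	fmt.Println("topology is valid")
}
```

Because leaves must be `EDGE_LOC` and an `EDGE_LOC` cannot be a parent, the single "not a parent implies `EDGE_LOC`" check covers both the leaf rule and the reachability rule.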
+
+The following new endpoints will be required:
+
+##### `GET /topologies`
+
+response JSON:
+```JSON
+{ "response": [
+    {
+        "name": "foo",
+        "description": "a foo topology",
+        "nodes": [
+            {
+                "cachegroup": "child-cachegroup",
+                "parents": [1, 2]
+            },
+            {
+                "cachegroup": "parent-cachegroup",
+                "parents": []
+            },
+            {
+                "cachegroup": "secondary-parent-cachegroup",
+                "parents": []
+            }
+        ]
+    }
+]}
+```
+
+##### `POST /topologies`
+
+request JSON:
+```JSON
+{
+    "name": "foo",
+    "description": "a foo topology",
+    "nodes": [
+        {
+            "cachegroup": "child-cachegroup",
+            "parents": [1, 2]
+        },
+        {
+            "cachegroup": "parent-cachegroup",
+            "parents": []
+        },
+        {
+            "cachegroup": "secondary-parent-cachegroup",
+            "parents": []
+        }
+    ]
+}
+```
+
+response JSON:
+```JSON
+{
+    "alerts": [
+        {
+            "text": "topology was created successfully",
+            "level": "success"
+        }
+    ],
+    "response": {
+        "name": "foo",
+        "description": "a foo topology",
+        "nodes": [
+            {
+                "cachegroup": "child-cachegroup",
+                "parents": [1, 2]
+            },
+            {
+                "cachegroup": "parent-cachegroup",
+                "parents": []
+            },
+            {
+                "cachegroup": "secondary-parent-cachegroup",
+                "parents": []
+            }
+        ]
+    }
+}
+
+```
+
+##### `PUT /topologies?name=foo`
+
+request JSON:
+```JSON
+{
+    "name": "foo",
+    "description": "a foo topology",
+    "nodes": [
+        {
+            "cachegroup": "child-cachegroup",
+            "parents": [1, 2]
+        },
+        {
+            "cachegroup": "parent-cachegroup",
+            "parents": []
+        },
+        {
+            "cachegroup": "secondary-parent-cachegroup",
+            "parents": []
+        }
+    ]
+}
+```
+
+response JSON:
+```JSON
+{
+    "alerts": [
+        {
+            "text": "topology was updated successfully",
+            "level": "success"
+        }
+    ],
+    "response": {
+        "name": "foo",
+        "description": "a foo topology",
+        "nodes": [
+            {
+                "cachegroup": "child-cachegroup",
+                "parents": [1, 2]
+            },
+            {
+                "cachegroup": "parent-cachegroup",
+                "parents": []
+            },
+            {
+                "cachegroup": "secondary-parent-cachegroup",
+                "parents": []
+            }
+        ]
+    }
+}
+```
+
+##### `DELETE /topologies?name=foo`
+
+response JSON:
+```JSON
+{
+    "alerts": [
+        {
+            "text": "topology was deleted successfully",
+            "level": "success"
+        }
+    ]
+}
+```
+
+##### `/deliveryservices` endpoints
+
+All relevant Delivery Service APIs will have their JSON request and response 
objects modified to include a new `topology` field which references the name of 
the topology it's assigned to:
+```JSON
+{
+    ...
+    "topology": "foo"
+}
+```
+
+##### The various `/snapshot` endpoints
+
+The various `/snapshot` endpoints will need to be updated to include new 
Topologies data along with their associations to Delivery Services in the 
`CRConfig.json` snapshot. The data should only include the `EDGE_LOC` 
cachegroups of the Topologies, because those are all Traffic Router needs.
+
+##### Various endpoints that are affected by cachegroup parentage or deliveryservice-server assignment
+
+API endpoints that do things such as the following may need to be updated to 
take Topology-based Delivery Service assignment and parentage into account:
+- assign a Delivery Service to a server (or vice versa)
+- return the servers that are assigned to a Delivery Service (or vice versa)
+- perform an operation on "child" cachegroups -- like queueing updates on 
"child" caches when changing the status of a "parent" cache
+
+#### Client Impact
+
+New Go client methods will be added for the `/topologies` endpoints in order 
to write TO API tests for the new endpoints. The `/deliveryservices` client 
methods won't need to be modified, as the new `Topology` field will simply be added to 
the `DeliveryService` struct. New client methods for the Python client will 
also be added for each of the new `/topologies` endpoints.
+
+#### Data Model Impact
+
+##### Go structs
+
+New `Topology` and `TopologyNode` structs will be added for the `/topologies` 
endpoints, mapping directly to the JSON request bodies in the REST API Impact 
section. The `DeliveryService` struct will be updated with a new `Topology` 
field which is the name of the Topology the Delivery Service is assigned to.
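
A minimal sketch of what these structs might look like, assuming the JSON field names shown in the REST API Impact section. The `DeliveryService` struct here shows only the new field, the pointer type (so that an unassigned Topology can be `null`) is an assumption, and `parseTopology` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TopologyNode is one entry of a Topology's "nodes" array.
type TopologyNode struct {
	Cachegroup string `json:"cachegroup"`
	Parents    []int  `json:"parents"`
}

// Topology is the body of the /topologies requests and responses.
type Topology struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Nodes       []TopologyNode `json:"nodes"`
}

// DeliveryService shows only the new field; the real struct has many more.
// The pointer is an assumption, so that "no Topology" can be encoded as null.
type DeliveryService struct {
	Topology *string `json:"topology"`
}

// parseTopology decodes a request body such as the POST example.
func parseTopology(body []byte) (Topology, error) {
	var topo Topology
	err := json.Unmarshal(body, &topo)
	return topo, err
}

func main() {
	body := []byte(`{
		"name": "foo",
		"description": "a foo topology",
		"nodes": [
			{"cachegroup": "child-cachegroup", "parents": [1, 2]},
			{"cachegroup": "parent-cachegroup", "parents": []},
			{"cachegroup": "secondary-parent-cachegroup", "parents": []}
		]
	}`)
	topo, err := parseTopology(body)
	if err != nil {
		fmt.Println("bad request body:", err)
		return
	}
	name := topo.Name
	ds := DeliveryService{Topology: &name}
	out, err := json.Marshal(ds)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(len(topo.Nodes), string(out))
}
```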
+
+##### Traffic Ops Database
+
+A new `topology` table will be created:
+
+| column      | type | modifiers    |
+| ----------- | ---- | ------------ |
+| name        | text | not null, PK |
+| description | text |              |
+
+A new `topology_cachegroup` table will be created to model the association of 
cachegroups to topologies:
+
+| column     | type | modifiers                                       |
+| ---------- | ---- | ----------------------------------------------- |
+| id         | int  | not null, PK                                    |
+| topology   | text | not null, FK: references topology(name)         |
+| cachegroup | text | not null, FK: references cachegroup(short_name) |
+
+**Constraints**:
+- unique (topology, cachegroup) -- a cachegroup can only be in a Topology once.
+
+A new `topology_cachegroup_parents` table will be created to model the parent 
relationships of cachegroups within a topology:
+
+| column | type | modifiers                                        |
+| ------ | ---- | ------------------------------------------------ |
+| child  | int  | not null, FK: references topology_cachegroup(id) |
+| parent | int  | not null, FK: references topology_cachegroup(id) |
+| rank   | int  | not null                                         |
+
+**Constraints**:
+- unique (child, rank) -- within a Topology, a cachegroup can only have one 
primary parent, one secondary parent, and so on.
+- unique (child, parent) -- within a Topology, a cachegroup cannot relate to 
another cachegroup more than once.
+- check (rank is either 1 or 2) -- a cachegroup can only have primary and 
secondary parents currently.
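
How the API's `parents` array could map onto `rank` is sketched below: the position in the array becomes the rank, so the 1st parent is rank 1 (primary) and the 2nd is rank 2 (secondary). `ParentRow`, `parentRows`, and the `ids` slice are hypothetical names, and the sketch assumes the `topology_cachegroup` ids have already been assigned.

```go
package main

import "fmt"

// TopologyNode mirrors the JSON `node` sub-object; the Go name is assumed.
type TopologyNode struct {
	Cachegroup string
	Parents    []int
}

// ParentRow is a hypothetical in-memory form of one
// topology_cachegroup_parents row.
type ParentRow struct {
	Child, Parent, Rank int
}

// parentRows converts each node's `parents` array into rows. ids[i] is
// assumed to be the topology_cachegroup id already assigned to nodes[i].
func parentRows(nodes []TopologyNode, ids []int) []ParentRow {
	var rows []ParentRow
	for i, n := range nodes {
		for j, p := range n.Parents {
			// Array position 0 is the primary parent (rank 1), 1 the secondary (rank 2).
			rows = append(rows, ParentRow{Child: ids[i], Parent: ids[p], Rank: j + 1})
		}
	}
	return rows
}

func main() {
	nodes := []TopologyNode{
		{Cachegroup: "child-cachegroup", Parents: []int{1, 2}},
		{Cachegroup: "parent-cachegroup"},
		{Cachegroup: "secondary-parent-cachegroup"},
	}
	for _, r := range parentRows(nodes, []int{10, 11, 12}) {
		fmt.Printf("child=%d parent=%d rank=%d\n", r.Child, r.Parent, r.Rank)
	}
}
```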
+
+The `deliveryservice` table will be updated to add a new column for the 
topology it is associated with:
+
+| column   | type | modifiers                     |
+| -------- | ---- | ----------------------------- |
+| topology | text | FK: references topology(name) |
+
+### ORT Impact
+
+`atstccfg` will need to be updated to request the Topologies from Traffic Ops 
and use that data to determine the following for config generation:
+- what delivery services are assigned to a cache via Topologies -- in addition 
to legacy `deliveryservice_server` assignments used today
+- if a delivery service is assigned to a cache via a Topology, the parent and 
secondary parent cachegroups for that delivery service are determined via the 
Topology it's assigned to. Otherwise, the parents are determined by the 
server's cachegroup as they are today.
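
The parent selection described in the second bullet can be sketched as follows. This is an illustration under stated assumptions, not `atstccfg` code: `parentsFor`, the `Cachegroup` struct, and its field names are hypothetical.

```go
package main

import "fmt"

// TopologyNode and Topology mirror the JSON representation; names are assumed.
type TopologyNode struct {
	Cachegroup string
	Parents    []int
}

type Topology struct {
	Name  string
	Nodes []TopologyNode
}

// Cachegroup carries the legacy, global parent fields.
type Cachegroup struct {
	Name            string
	Parent          string // legacy parent_cachegroup
	SecondaryParent string // legacy secondary_parent_cachegroup
}

// parentsFor returns the primary and secondary parent cachegroup names to use
// for a Delivery Service on a server in cachegroup cg. topo is nil for a
// Delivery Service without a Topology, in which case the legacy parents are
// returned (legacy deliveryservice_server assignment is not checked here).
// ok is false when cg is not part of the Topology.
func parentsFor(topo *Topology, cg Cachegroup) (primary, secondary string, ok bool) {
	if topo == nil {
		return cg.Parent, cg.SecondaryParent, true
	}
	for _, n := range topo.Nodes {
		if n.Cachegroup != cg.Name {
			continue
		}
		if len(n.Parents) > 0 {
			primary = topo.Nodes[n.Parents[0]].Cachegroup
		}
		if len(n.Parents) > 1 {
			secondary = topo.Nodes[n.Parents[1]].Cachegroup
		}
		return primary, secondary, true
	}
	return "", "", false
}

func main() {
	topo := &Topology{Name: "foo", Nodes: []TopologyNode{
		{Cachegroup: "child-cachegroup", Parents: []int{1, 2}},
		{Cachegroup: "parent-cachegroup"},
		{Cachegroup: "secondary-parent-cachegroup"},
	}}
	cg := Cachegroup{Name: "child-cachegroup", Parent: "global-parent"}
	fmt.Println(parentsFor(topo, cg)) // parents taken from the Topology
	fmt.Println(parentsFor(nil, cg))  // legacy parents from the cachegroup
}
```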
+
+Since new Topologies can have more than the current 2 tiers (`EDGE` -> `MID`), `atstccfg` 
may need to break some assumptions about the current 2-tier hierarchy in order 
to work with an arbitrary number of "forward proxy tiers" -- e.g. `EDGE` -> 
`MID` -> `MID` -> `ORIGIN`. Basically, `MID` caches need to be able to forward 
requests to other `MID` caches -- they can no longer assume that their parents 
are always origins.
+
+### Traffic Monitor Impact
+
+There should be little (if any) impact to Traffic Monitor for this feature.
+
+### Traffic Router Impact
+
+Traffic Router will need to be made aware of Topologies and their associations 
to Delivery Services via additions to the CRConfig. No new TR profile 
parameters should be required to enable Topology-based routing since Topologies 
are configurable on a per-delivery-service basis.
+
+The CRConfig will need a new top-level field for `topologies`, which will be a 
map of Topology names to arrays of cachegroup names that make up the "edge 
tier" of that Topology. The `deliveryServices` section will add an optional 
`topology` field to each delivery service that is assigned to a Topology. 
Delivery services that have a Topology assigned will not be referenced 
explicitly by any `contentServer` objects. Traffic Router will use the Topology 
information to determine which "edge" cachegroups can be routed to for a 
particular delivery service. While Traffic Router *could* assume that every 
cache in a cachegroup for a Topology could be routed to, we might need to make 
Traffic Router aware of Server Capabilities so that it only routes to servers 
in a Topology that have the required Capabilities.
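
One way the proposed `topologies` map could be derived is sketched below. It relies on the API constraints (leaves must be `EDGE_LOC`, and an `EDGE_LOC` cannot be a parent), so the "edge tier" is simply the set of nodes that are nobody's parent. `edgeTier` and `crConfigTopologies` are hypothetical names, and the surrounding CRConfig layout is not modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TopologyNode and Topology mirror the JSON representation; names are assumed.
type TopologyNode struct {
	Cachegroup string
	Parents    []int
}

type Topology struct {
	Name  string
	Nodes []TopologyNode
}

// edgeTier returns the cachegroups that are not a parent of any other node.
// Under the API constraints these are exactly the EDGE_LOC cachegroups.
func edgeTier(topo Topology) []string {
	isParent := make([]bool, len(topo.Nodes))
	for _, n := range topo.Nodes {
		for _, p := range n.Parents {
			isParent[p] = true
		}
	}
	edges := []string{}
	for i, n := range topo.Nodes {
		if !isParent[i] {
			edges = append(edges, n.Cachegroup)
		}
	}
	return edges
}

// crConfigTopologies builds the proposed top-level `topologies` map of
// Topology names to edge-tier cachegroup names.
func crConfigTopologies(topos []Topology) map[string][]string {
	out := make(map[string][]string, len(topos))
	for _, topo := range topos {
		out[topo.Name] = edgeTier(topo)
	}
	return out
}

func main() {
	topos := []Topology{{Name: "foo", Nodes: []TopologyNode{
		{Cachegroup: "child-cachegroup", Parents: []int{1, 2}},
		{Cachegroup: "parent-cachegroup"},
		{Cachegroup: "secondary-parent-cachegroup"},
	}}}
	snippet := map[string]interface{}{"topologies": crConfigTopologies(topos)}
	out, err := json.MarshalIndent(snippet, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```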
+
+Since Topologies will be optional, new Traffic Routers should remain 
backwards-compatible with old CRConfigs, and old Traffic Routers should remain 
forwards-compatible with new CRConfigs because Traffic Router ignores unknown 
fields by default.
+
+### Traffic Stats Impact
+
+There should be little (if any) impact to Traffic Stats for this feature.
+
+### Traffic Vault Impact
+
+This feature should not require any changes to Traffic Vault or its related 
APIs in Traffic Ops.
+
+### Documentation Impact
+
+The Traffic Ops API reference will be updated to include the new `/topologies` 
API endpoints as well as all of the relevant `deliveryservices` endpoints. It 
may be useful to include a new "Topologies Overview/How-To" section in the docs 
describing how to create and use custom Topologies.
+
+### Testing Impact
+
+For Traffic Ops, new API tests will be written for the new `/topologies` API 
endpoints, and existing API tests for the `/deliveryservices` endpoints will be 
updated to test the Topology association.
+
+Traffic Router unit and/or integration tests will be added to test the new 
functionality of Topology-based delivery services.
+
+Unit tests for `atstccfg` will be added in order to validate ATS config 
generation for Topology-based delivery services.
+
+Automated end-to-end environment creation (such as CDN-in-a-box) should be 
updated to include the creation of arbitrary Topologies that would be assigned 
to one or more delivery services. Those delivery services should be tested by 
an end user HTTP client to verify the basic functionality of their Topologies. 
Additionally, the request/data flows should be observed to verify that they 
match up with the given Topology as expected.
+
+### Performance Impact
+
+There should be no visible impact to performance from an application 
perspective -- this feature does not introduce anything particularly CPU, 
network, or storage-intensive. However, this feature will allow the tuning of 
Topologies in the CDN, which *will* affect end-to-end performance of the CDN in 
terms of things like latency, cache hit ratio, cache efficiency, etc. CDN 
architects will be able to make certain trade-offs in their Topologies until 
their desired end-to-end performance characteristics are met.
+
+Additionally, the more Delivery Services that are migrated from legacy 
`deliveryservice_server` assignments to Topologies, the smaller the size of the 
`CRConfig` will get. Currently, `deliveryservice_server` assignments are 
responsible for most of the growth in `CRConfig` size due to the natural 
addition of both Delivery Services and Servers to a CDN over time, and 
migrating to Topology-based Delivery Services should noticeably reduce the size 
and growth of the `CRConfig` over time.
+
+### Security Impact
+
+Probably the biggest impact to security will be the custom Topologies 
themselves, specifically in terms of breaking the assumptions of the existing 
2-tier CDN architecture. There will no longer be a single, global parent 
hierarchy, so things like firewalls/ACLs that allow caches to communicate with 
each other may need to be updated to account for custom parent hierarchies.
+
+Creating or updating Topologies should be restricted to users with the 
`operations` role or higher, and Topologies are CDN-wide -- they do not belong 
to a particular tenant.
+
+### Upgrade Impact
+
+This feature will require a database migration to create new tables and add a 
new field to the `deliveryservice` table, but existing data does not need to be 
modified or migrated. Therefore, rolling back the database migration will not 
cause any data loss until Topologies are actually created and assigned. 
Additionally, since this feature will not remove any existing tables or 
columns, the new database schema should be backwards-compatible with the 
previous version of Traffic Ops. However, this blueprint cannot make any claim 
as to the backwards-compatibility of schema changes required by *other* 
features, which would affect the overall backwards-compatibility of the entire 
release.
+
+This feature will not require components to be upgraded in a specific order, 
and no special manual steps will be required before, during, or after the 
upgrade is done.
+
+### Operations Impact
+
+One of the bigger day-to-day operational impacts this feature will have is in 
the assignment of Delivery Services to Topologies instead of to individual 
`EDGE` caches. In theory, if the number of unique Topologies is kept as low as 
possible, it should be easier to choose a specific Topology for a Delivery 
Service than to assign it to individual caches (unless the default approach is 
to just assign delivery services to *all* caches). This could be made easier by 
defining a kind of *default* Topology to which Delivery Services are assigned 
until it is determined that the default Topology does not meet a Delivery 
Service's requirements.
+
+Until legacy `deliveryservice_server` assignments are fully removed in favor 
of Topology-based assignments, there will be extra operational overhead due to 
having two different ways to assign Delivery Services, but this overhead should 
be temporary as CDN operators should migrate all Delivery Services to 
Topologies as soon as possible. That one-time manual migration should also be 
considered an operational impact and should be scheduled to take place sometime 
after the upgrade is completed.
 
 Review comment:
   and then calling something like `GET /topologies/default?type=http_live` 
would return the default, if any, which could be selected in a new DS form in 
TP.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
