Re: Delivery-Service Configuration Versioning
Thanks Ryan & Eric for the feedback.
Answers inline.
Thanks again,
Nir

On Thu, May 4, 2017 at 3:59 PM, Eric Friedrich (efriedri) <efrie...@cisco.com> wrote:

> Thanks Nir-
> Comments inline
>
> > On May 1, 2017, at 1:12 PM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Dear all,
> >
> > Planning the efforts toward "self-service", we are considering "delivery-service configuration versioning" (DSCV) as one of our next steps.
> > At a very high level, by DSCV we refer to the ability to hold multiple configuration versions/revisions per delivery-service, and actively choose which version should be deployed.
> >
> > A significant portion of the value we would like to bring when working toward "self-service" can be achieved using the initial step of configuration versioning:
> >
> > 1. As the number of delivery-services handled by TC increases, preventing the "non dev-ops" user from changing a delivery-service configuration themselves, and requiring a "dev-ops" user to actually make the changes in the DB, puts an increasing load on the operations team.
> >    Via DSCV the operator may allow users to actually push configurations into the DB, as it separates the provisioning phase from the deployment. Once committed, the CDN's "dev-ops" user is able to examine the changes and choose which version should be deployed, subject to the operator's acceptance policy.
>
> EF> How do we get from DSCV to the ultimate self-service goals where the CDN operator is no longer in the critical path for deploying DS changes?

NS> Indeed Eric, at this stage the deployment itself is still in the hands of the operator: the operator has to change the deployed DS version, Queue Update and Cr-Config snapshot. Allowing the DS owner to "change the DS version to deploy" is a "process" issue, as it should be subject to each operator's change-acceptance policy.
We will need to model it and add a flexible building block, probably one supporting plugins, in the future. Allowing the DS owner to actually deploy the changes (Cr-Config snapshot and Queue-Update / future push mechanism) cannot be done as long as the configuration of the different DSs is coupled in the same files (remap.config & cr-config) and processes. Once decoupled, the deploy operations can be done at DS granularity, and therefore the operator can delegate the control to the "DS owner".

> > 2. DSCV brings improved auditing and troubleshooting capabilities, which are important for supporting TC deployment growth, and allows users to be more independent.
> >    It allows investigating issues using version-associated log records, as well as the data in the DB itself: examining the delivery-service versions and their metadata (e.g. "deployed dates"), and using tools for version comparison.
> > 3. DSCV allows a simple delivery-service configuration rollback, which provides a quick remedy for configuration errors.
> >
> > Moreover, we suggest allowing the deployment of multiple versions of the same delivery service simultaneously, on the same caches. Doing so, and allowing the operator to orchestrate the usage of the different versions (for example, via "steering"), makes the following available:
>
> EF> This feature will extend to both caches and TR, right? Lots of DS-specific policy is evaluated by the TR.

NS> There are a few ways to implement the feature, but as I currently see it there is no real need to change the data plane to support this feature. Please let me know if you think I'm missing something here. One option is to deploy the different versions of the same delivery service as if they are entirely different delivery-services, with "ids" and "host-regexes" which include the version. No changes in Cr-Config structure, remap.config, etc.
Therefore, all the changes that need to be done are in traffic-ops, simulating the different versions towards the rest of the system. Let's now improve this solution. The first improvement is giving the different components the ability to understand the concept of "DS and version", for example by adding the version field to the cr-config and adjusting the different components. This may, for example, allow traffic-stats to show the reports about different DS versions separately, as well as aggregated for the DS. More changes will probably be required, but as far as I currently see, these changes are all in the control plane, and not the data plane. Caches are effectively unaware
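As a rough illustration of the "separate delivery-services per version" option described above, the sketch below derives a version-specific id and host regex from a delivery-service id. The naming scheme (`<id>-v<N>`), the `.*\.<id>\..*` regex shape, and all function names are assumptions made for illustration; the thread does not specify them.

```python
import re
from typing import Iterable, Optional


def versioned_id(xml_id: str, version: int) -> str:
    """Hypothetical naming scheme: embed the version in the DS id."""
    return f"{xml_id}-v{version}"


def versioned_host_regex(xml_id: str, version: int) -> str:
    """Build a host regex that matches only the given DS version.

    The ".*\\.<id>\\..*" shape is an assumed convention, not taken from the thread.
    """
    return r".*\." + re.escape(versioned_id(xml_id, version)) + r"\..*"


def matching_version(host: str, xml_id: str, versions: Iterable[int]) -> Optional[int]:
    """Return the version whose host regex matches the request host, if any."""
    for version in versions:
        if re.fullmatch(versioned_host_regex(xml_id, version), host):
            return version
    return None
```

Under this assumption each version looks like an independent delivery service to the rest of the system, so only the component generating these ids and regexes needs to know that they belong to the same DS.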
Re: Delivery-Service Configuration Versioning
Thanks Nir-
Comments inline

> On May 1, 2017, at 1:12 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> Dear all,
>
> Planning the efforts toward "self-service", we are considering "delivery-service configuration versioning" (DSCV) as one of our next steps.
> At a very high level, by DSCV we refer to the ability to hold multiple configuration versions/revisions per delivery-service, and actively choose which version should be deployed.
>
> A significant portion of the value we would like to bring when working toward "self-service" can be achieved using the initial step of configuration versioning:
>
> 1. As the number of delivery-services handled by TC increases, preventing the "non dev-ops" user from changing a delivery-service configuration themselves, and requiring a "dev-ops" user to actually make the changes in the DB, puts an increasing load on the operations team.
>    Via DSCV the operator may allow users to actually push configurations into the DB, as it separates the provisioning phase from the deployment. Once committed, the CDN's "dev-ops" user is able to examine the changes and choose which version should be deployed, subject to the operator's acceptance policy.

EF> How do we get from DSCV to the ultimate self-service goals where the CDN operator is no longer in the critical path for deploying DS changes?

> 2. DSCV brings improved auditing and troubleshooting capabilities, which are important for supporting TC deployment growth, and allows users to be more independent.
>    It allows investigating issues using version-associated log records, as well as the data in the DB itself: examining the delivery-service versions and their metadata (e.g. "deployed dates"), and using tools for version comparison.
> 3. DSCV allows a simple delivery-service configuration rollback, which provides a quick remedy for configuration errors.
>
> Moreover, we suggest allowing the deployment of multiple versions of the same delivery service simultaneously, on the same caches. Doing so, and allowing the operator to orchestrate the usage of the different versions (for example, via "steering"), makes the following available:

EF> This feature will extend to both caches and TR, right? Lots of DS-specific policy is evaluated by the TR.

>
> 1. Manual testing of a new delivery-service configuration, via a dedicated URL or using request headers.
> 2. Staging / canary testing of new versions, applying them only for a specific content path, or filtering based on source IP.
> 3. Gradual transition between the different configuration versions.
> 4. A/B testing of configuration versions (assuming the reporting/stats also become "version aware").
> 5. Immediate (no CRON wait, cr-config change only) delivery-service version "switch", and specifically immediate rollback capabilities.

EF> Does #5 imply that it will be the TR choosing between the versions of DSs deployed on the caches? How will this modify the format of requests to TrafficServer? This will have impacts on log analysis, HTTPS, DNSSEC, and many other aspects of the system.

>
> Note that, engineering-wise, one may consider DSCV as a building block for other "self-service" steps. It allows the system to identify which configuration is deployed on which server, and allows the servers to identify configuration changes with DS granularity. Therefore, it can help to decouple the deployment of individual delivery services, as well as reduce the load derived from the caches update process.
>
> We would greatly appreciate community input on the subject.
>
> Many thanks,
> Nir
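To make the canary (item 2) and gradual-transition (item 3) ideas in the quoted list more concrete, here is a minimal sketch of how a component choosing between two deployed versions might decide per client. It is purely illustrative: the function names, the percentage-based split, and the hashing of the client IP are assumptions, not something the thread or Traffic Router defines.

```python
import hashlib
import ipaddress
from typing import Optional


def bucket(client_ip: str) -> int:
    """Map a client IP to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_version(client_ip: str, stable: int, candidate: int,
                   candidate_percent: int,
                   canary_net: Optional[str] = None) -> int:
    """Pick the DS version for a client (hypothetical policy).

    Clients inside canary_net always get the candidate version; everyone
    else is split deterministically according to candidate_percent.
    """
    if canary_net is not None:
        if ipaddress.ip_address(client_ip) in ipaddress.ip_network(canary_net):
            return candidate
    return candidate if bucket(client_ip) < candidate_percent else stable
```

Raising `candidate_percent` from 0 to 100 gives a gradual transition, and setting it back to 0 acts as an immediate rollback without touching the caches. Hashing the client IP keeps a given client on the same version between requests.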
Re: Delivery-Service Configuration Versioning
I am +1 on all these concepts. This matches all of our anticipated requirements for deployment of service configs, near-instant changes, logging, reporting, rollback and testing.

I think the canary testing options need to be fleshed out a bit more. There are a lot of different options suggested below and I think each has pros/cons, but the core DSCV idea gives us a lot of advantages.

Ryan Durfey
M | 303-524-5099

On 5/1/17, 11:12 AM, "Nir Sopher" <n...@qwilt.com> wrote:

> Dear all,
>
> Planning the efforts toward "self-service", we are considering "delivery-service configuration versioning" (DSCV) as one of our next steps.
> At a very high level, by DSCV we refer to the ability to hold multiple configuration versions/revisions per delivery-service, and actively choose which version should be deployed.
>
> A significant portion of the value we would like to bring when working toward "self-service" can be achieved using the initial step of configuration versioning:
>
> 1. As the number of delivery-services handled by TC increases, preventing the "non dev-ops" user from changing a delivery-service configuration themselves, and requiring a "dev-ops" user to actually make the changes in the DB, puts an increasing load on the operations team.
>    Via DSCV the operator may allow users to actually push configurations into the DB, as it separates the provisioning phase from the deployment. Once committed, the CDN's "dev-ops" user is able to examine the changes and choose which version should be deployed, subject to the operator's acceptance policy.
> 2. DSCV brings improved auditing and troubleshooting capabilities, which are important for supporting TC deployment growth, and allows users to be more independent.
>    It allows investigating issues using version-associated log records, as well as the data in the DB itself: examining the delivery-service versions and their metadata (e.g. "deployed dates"), and using tools for version comparison.
> 3. DSCV allows a simple delivery-service configuration rollback, which provides a quick remedy for configuration errors.
>
> Moreover, we suggest allowing the deployment of multiple versions of the same delivery service simultaneously, on the same caches. Doing so, and allowing the operator to orchestrate the usage of the different versions (for example, via "steering"), makes the following available:
>
> 1. Manual testing of a new delivery-service configuration, via a dedicated URL or using request headers.
> 2. Staging / canary testing of new versions, applying them only for a specific content path, or filtering based on source IP.
> 3. Gradual transition between the different configuration versions.
> 4. A/B testing of configuration versions (assuming the reporting/stats also become "version aware").
> 5. Immediate (no CRON wait, cr-config change only) delivery-service version "switch", and specifically immediate rollback capabilities.
>
> Note that, engineering-wise, one may consider DSCV as a building block for other "self-service" steps. It allows the system to identify which configuration is deployed on which server, and allows the servers to identify configuration changes with DS granularity. Therefore, it can help to decouple the deployment of individual delivery services, as well as reduce the load derived from the caches update process.
>
> We would greatly appreciate community input on the subject.
>
> Many thanks,
> Nir
Delivery-Service Configuration Versioning
Dear all,

Planning the efforts toward "self-service", we are considering "delivery-service configuration versioning" (DSCV) as one of our next steps.
At a very high level, by DSCV we refer to the ability to hold multiple configuration versions/revisions per delivery-service, and actively choose which version should be deployed.

A significant portion of the value we would like to bring when working toward "self-service" can be achieved using the initial step of configuration versioning:

1. As the number of delivery-services handled by TC increases, preventing the "non dev-ops" user from changing a delivery-service configuration themselves, and requiring a "dev-ops" user to actually make the changes in the DB, puts an increasing load on the operations team.
   Via DSCV the operator may allow users to actually push configurations into the DB, as it separates the provisioning phase from the deployment. Once committed, the CDN's "dev-ops" user is able to examine the changes and choose which version should be deployed, subject to the operator's acceptance policy.
2. DSCV brings improved auditing and troubleshooting capabilities, which are important for supporting TC deployment growth, and allows users to be more independent.
   It allows investigating issues using version-associated log records, as well as the data in the DB itself: examining the delivery-service versions and their metadata (e.g. "deployed dates"), and using tools for version comparison.
3. DSCV allows a simple delivery-service configuration rollback, which provides a quick remedy for configuration errors.

Moreover, we suggest allowing the deployment of multiple versions of the same delivery service simultaneously, on the same caches. Doing so, and allowing the operator to orchestrate the usage of the different versions (for example, via "steering"), makes the following available:

1. Manual testing of a new delivery-service configuration, via a dedicated URL or using request headers.
2. Staging / canary testing of new versions, applying them only for a specific content path, or filtering based on source IP.
3. Gradual transition between the different configuration versions.
4. A/B testing of configuration versions (assuming the reporting/stats also become "version aware").
5. Immediate (no CRON wait, cr-config change only) delivery-service version "switch", and specifically immediate rollback capabilities.

Note that, engineering-wise, one may consider DSCV as a building block for other "self-service" steps. It allows the system to identify which configuration is deployed on which server, and allows the servers to identify configuration changes with DS granularity. Therefore, it can help to decouple the deployment of individual delivery services, as well as reduce the load derived from the caches update process.

We would greatly appreciate community input on the subject.

Many thanks,
Nir
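The core idea of the proposal (hold several committed revisions per delivery service, choose which one is deployed, compare and roll back) can be sketched as a small in-memory model. This is only an illustration under assumed names: the class, its fields, and the flat key/value configuration are hypothetical and do not reflect the Traffic Ops schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DeliveryServiceVersions:
    """Hypothetical holder of every committed config revision of one DS."""
    name: str
    revisions: List[Dict[str, str]] = field(default_factory=list)
    deployed: Optional[int] = None                     # index of the live revision
    history: List[int] = field(default_factory=list)   # previously deployed indices

    def commit(self, config: Dict[str, str]) -> int:
        """Provisioning: store a new revision without deploying it."""
        self.revisions.append(dict(config))
        return len(self.revisions) - 1

    def deploy(self, version: int) -> None:
        """Deployment: the operator picks which committed revision goes live."""
        if not 0 <= version < len(self.revisions):
            raise ValueError(f"unknown version {version}")
        if self.deployed is not None:
            self.history.append(self.deployed)
        self.deployed = version

    def rollback(self) -> int:
        """Return to the previously deployed revision."""
        if not self.history:
            raise RuntimeError("nothing to roll back to")
        self.deployed = self.history.pop()
        return self.deployed

    def diff(self, a: int, b: int) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
        """Compare two revisions key by key, returning only changed keys."""
        old, new = self.revisions[a], self.revisions[b]
        return {k: (old.get(k), new.get(k))
                for k in sorted(set(old) | set(new))
                if old.get(k) != new.get(k)}
```

Keeping `commit` and `deploy` as separate operations mirrors the separation of provisioning from deployment described in item 1: a user can commit freely, while the operator alone decides what is deployed.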