Re: Delivery-Service Configuration Versioning

2017-05-04 Thread Nir Sopher
Thanks Ryan & Eric for the feedback.
Answers inline.
Thanks again,
Nir


On Thu, May 4, 2017 at 3:59 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> Thanks Nir-
> Comments inline
> > On May 1, 2017, at 1:12 PM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Dear all,
> >
> > Planning the efforts toward "self-service", we are considering
> > "delivery-service configuration versioning" (DSCV) as one of our next
> > steps.
> > In a very high level, by DSCV we refer to the ability to hold multiple
> > configuration versions/revisions per delivery-service, and actively choose
> > which version should be deployed.
> >
> > A significant portion of the value we would like to bring when working
> > toward "self-service" can be achieved using the initial step of
> > configuration versioning:
> >
> >   1. As the amount of delivery-services handled by TC is increasing,
> >   denying the "non dev-ops" user from changing delivery-services
> >   configuration by himself, and require a "dev-ops" user to actually make the
> >   changes in the DB, put an increasing load on the operations team.
> >   Via DSCV the operator may allow the users to really push configurations
> >   into the DB, as it separates the provisioning phase from the deployment.
> >   Once committed, the CDN's "dev-ops" user is able to examine the changes
> >   and choose which version should be deployed, subject to the operator's
> >   acceptance policy.
> EF> How do we get from DSCV to the ultimate self-service goals where the
> CDN operator is no longer in the critical path for deploying DS changes?
>
NS> Indeed Eric, at this stage the deployment itself is still in the hands
of the operator: the operator has to change the deployed DS version, perform
a Queue Update, and take a Cr-Config snapshot.
Allowing the DS owner to "change the DS version to deploy" is a "process"
issue, as it should be subject to each operator's change-acceptance
policy. We will need to model it and add a flexible building block,
probably with plugin support, in the future.
Allowing the DS owner to actually deploy the changes (Cr-Config snapshot
and Queue-Update / future push mechanism) cannot be done as long as the
configuration of the different DSs is coupled in the same files
(remap.config & cr-config) and processes.
Once decoupled, the deploy operations can be done at DS granularity, and
therefore the operator can delegate the control to the "DS owner".


> >   2. DSCV brings improved auditing and troubleshooting capabilities, which
> >   is important for supporting TC deployment growth, as well as allow users to
> >   be more independent.
> >   It allows to investigate issues using versions associated log records,
> >   as well as the data in the DB itself: Examining the delivery-service
> >   versions, their meta data (e.g. "deployed dates") as well as use tools for
> >   versions comparisons.
> >   3. DSCV allows a simple delivery service configuration rollback, which
> >   provides a quick remedy for configuration errors issues.
> >
> > Moreover, we suggest to allow the deployment of multiple versions of the
> > same delivery service simultaneously, on the same caches. Doing so, and
> > allowing the operator to orchestrate the usage of the different
> > versions (for example, via "steering"), the below become available:
> EF> This feature will extend to both caches and TR, right? Lots of
> DS-specific policy is evaluated by the TR.
>
NS> There are a few ways to implement the feature, but as I currently see it
there is no real need to change the data plane to support it.
Please let me know if you think I'm missing something here.
One option is to deploy the different versions of the same delivery service
as if they were entirely different delivery services, with "ids" and
"host-regexes" that include the version. No changes to the Cr-Config
structure, remap.config, etc. Therefore, all the changes that need to be
made are in traffic-ops, which simulates the different versions towards the
rest of the system.
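
A minimal sketch of the "version as a separate delivery service" option
described above, in Python. The class name, the "-v<N>" naming scheme and the
regex shape are assumptions made for illustration only; they are not taken
from Traffic Ops or from the Cr-Config format.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedDS:
    """One configuration version, exposed as if it were its own DS (hypothetical)."""
    xml_id: str   # base delivery-service id, e.g. "movies"
    version: int  # configuration version number

    @property
    def deployed_id(self) -> str:
        # Id seen by the rest of the system; the "-v<N>" suffix is an assumed scheme.
        return f"{self.xml_id}-v{self.version}"

    @property
    def host_regex(self) -> str:
        # Host pattern that embeds the versioned id between two dots.
        return rf".*\.{re.escape(self.deployed_id)}\..*"

    def matches(self, host: str) -> bool:
        # True when the request host belongs to this specific version.
        return re.fullmatch(self.host_regex, host) is not None
```

With this scheme, `VersionedDS("movies", 2)` and `VersionedDS("movies", 3)`
look like two unrelated delivery services to every component other than the
one that generated them.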
Let us now improve this solution.
The first improvement is giving the different components the ability to
understand the concept of "DS and version", for example by adding a
version field to the cr-config and adjusting the different components. This
may allow traffic-stats to show reports for the different DS versions
separately, as well as aggregated for the DS. More changes will
probably be required, but as far as I currently see, these changes are all
in the control plane, and not the data plane. Caches are effectively
unaware

Re: Delivery-Service Configuration Versioning

2017-05-04 Thread Eric Friedrich (efriedri)
Thanks Nir-
Comments inline
> On May 1, 2017, at 1:12 PM, Nir Sopher <n...@qwilt.com> wrote:
> 
> Dear all,
> 
> Planning the efforts toward "self-service", we are considering
> "delivery-service configuration versioning" (DSCV) as one of our next
> steps.
> In a very high level, by DSCV we refer to the ability to hold multiple
> configuration versions/revisions per delivery-service, and actively choose
> which version should be deployed.
> 
> A significant portion of the value we would like to bring when working
> toward "self-service" can be achieved using the initial step of
> configuration versioning:
> 
>   1. As the amount of delivery-services handled by TC is increasing,
>   denying the "non dev-ops" user from changing delivery-services
>   configuration by himself, and require a "dev-ops" user to actually make the
>   changes in the DB, put an increasing load on the operations team.
>   Via DSCV the operator may allow the users to really push configurations
>   into the DB, as it separates the provisioning phase from the deployment.
>   Once committed, the CDN's "dev-ops" user is able to examine the changes
>   and choose which version should be deployed, subject to the operator's
>   acceptance policy.
EF> How do we get from DSCV to the ultimate self-service goals where the CDN 
operator is no longer in the critical path for deploying DS changes?

>   2. DSCV brings improved auditing and troubleshooting capabilities, which
>   is important for supporting TC deployment growth, as well as allow users to
>   be more independent.
>   It allows to investigate issues using versions associated log records,
>   as well as the data in the DB itself: Examining the delivery-service
>   versions, their meta data (e.g. "deployed dates") as well as use tools for
>   versions comparisons.
>   3. DSCV allows a simple delivery service configuration rollback, which
>   provides a quick remedy for configuration errors issues.
> 
> Moreover, we suggest to allow the deployment of multiple versions of the
> same delivery service simultaneously, on the same caches. Doing so, and
> allowing the operator to orchestrate the usage of the different
> versions (for example, via "steering"), the below become available:
EF> This feature will extend to both caches and TR, right? Lots of DS-specific 
policy is evaluated by the TR.

> 
>   1. Manual testing of a new delivery-service configuration, via dedicated
>   URL or using request headers.
>   2. Staging / Canary testing of new versions, applying them only for a
>   specific content path, or filtering based on source IP.
>   3. Gradual transition between the different configuration versions.
>   4. Configuration versions A/B testing (assuming the reporting/stats also
>   becomes "version aware").
>   5. Immediate (no CRON wait, cr-config change only) delivery-service
>   version "switch", and specifically immediate rollback capabilities.
EF> Does #5 imply that it will be the TR choosing between the versions of DS’ 
deployed on the caches? How will this modify the format of requests to 
TrafficServer?
This will have impacts to log analysis, HTTPS, DNSSEC, and many other aspects 
of the system. 

> 
> Note that, engineering wise, one may consider DSCV as a building block for
> other "self-service" steps. It allows the system to identify what
> configuration is deployed on which server, as well as allows the servers to
> identify configuration changes with DS granularity. Therefore, it can help
> to decouple the individual delivery services deployment as well as reduce
> the load derived from the caches update process.
> We would greatly appreciate community input on the subject.
> 
> Many thanks,
> Nir



Re: Delivery-Service Configuration Versioning

2017-05-03 Thread Durfey, Ryan
I am +1 on all these concepts.  This matches all of our anticipated 
requirements for deployment of service configs, near instant changes, logging, 
reporting, rollback and testing.  I think the canary testing options need to be 
fleshed out a bit more.  There are a lot of different options suggested below 
and I think each has pros/cons but the core DSCV idea gives us a lot of 
advantages.

Ryan Durfey   M | 303-524-5099






Delivery-Service Configuration Versioning

2017-05-01 Thread Nir Sopher
Dear all,

As we plan the efforts toward "self-service", we are considering
"delivery-service configuration versioning" (DSCV) as one of our next
steps.
At a very high level, by DSCV we refer to the ability to hold multiple
configuration versions/revisions per delivery-service, and actively choose
which version should be deployed.

A significant portion of the value we would like to bring when working
toward "self-service" can be achieved using the initial step of
configuration versioning:

   1. As the number of delivery-services handled by TC increases,
   preventing the "non dev-ops" user from changing a delivery-service
   configuration by himself, and requiring a "dev-ops" user to actually make
   the changes in the DB, puts an increasing load on the operations team.
   Via DSCV the operator may allow the users to actually push configurations
   into the DB, as it separates the provisioning phase from the deployment.
   Once a version is committed, the CDN's "dev-ops" user is able to examine
   the changes and choose which version should be deployed, subject to the
   operator's acceptance policy.
   2. DSCV brings improved auditing and troubleshooting capabilities, which
   are important for supporting TC deployment growth, and allows users to
   be more independent.
   It allows issues to be investigated using version-associated log records,
   as well as the data in the DB itself: examining the delivery-service
   versions and their metadata (e.g. "deployed dates"), and using tools for
   version comparison.
   3. DSCV allows a simple delivery-service configuration rollback, which
   provides a quick remedy for configuration errors.
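
To make the three points above concrete, here is a minimal in-memory sketch
in Python of a per-delivery-service version store with commit, deploy,
rollback and comparison. All names (`DeliveryServiceVersions`, `commit`,
`deploy`, `rollback`, `diff`) are hypothetical and only illustrate the idea;
this is not an existing Traffic Ops API or DB schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DeliveryServiceVersions:
    """Holds every committed configuration of one delivery service (hypothetical)."""
    xml_id: str
    versions: List[Dict[str, str]] = field(default_factory=list)
    deployed: Optional[int] = None                     # 1-based version number
    history: List[int] = field(default_factory=list)   # previously deployed versions

    def commit(self, config: Dict[str, str]) -> int:
        # Provisioning: store a new version without deploying it.
        self.versions.append(dict(config))
        return len(self.versions)

    def deploy(self, version: int) -> None:
        # Deployment: a separate, explicit choice of which version is live.
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"unknown version {version}")
        if self.deployed is not None:
            self.history.append(self.deployed)
        self.deployed = version

    def rollback(self) -> int:
        # Return to the version that was deployed before the current one.
        if not self.history:
            raise RuntimeError("no earlier deployment to roll back to")
        self.deployed = self.history.pop()
        return self.deployed

    def diff(self, old: int, new: int) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
        # Version comparison: keys whose values differ, as (old, new) pairs.
        a, b = self.versions[old - 1], self.versions[new - 1]
        return {k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}
```

In this sketch a user could call `commit` freely, while `deploy` and
`rollback` would remain subject to the operator's acceptance policy.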

Moreover, we suggest allowing the deployment of multiple versions of the
same delivery service simultaneously, on the same caches. Doing so, and
allowing the operator to orchestrate the usage of the different
versions (for example, via "steering"), makes the following available:

   1. Manual testing of a new delivery-service configuration, via a dedicated
   URL or using request headers.
   2. Staging / canary testing of new versions, applying them only for a
   specific content path, or filtering based on source IP.
   3. Gradual transition between the different configuration versions.
   4. A/B testing of configuration versions (assuming the reporting/stats also
   become "version aware").
   5. Immediate (no CRON wait, cr-config change only) delivery-service
   version "switch", and specifically immediate rollback capabilities.
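
As an illustration of items 2 and 3 above, the following Python sketch picks
a configuration version per client: clients in designated canary networks
always get the candidate version, and the remaining clients are moved over
gradually by percentage. The function name, its parameters and the hashing
scheme are assumptions for this example; it does not describe how "steering"
or Traffic Router actually selects a target.

```python
import hashlib
import ipaddress
from typing import Iterable


def pick_version(client_ip: str, stable: int, candidate: int,
                 candidate_percent: int,
                 canary_networks: Iterable[str] = ()) -> int:
    """Choose which DS configuration version should serve this client (hypothetical)."""
    ip = ipaddress.ip_address(client_ip)
    # Canary testing: source-IP filter, independent of the percentage.
    if any(ip in ipaddress.ip_network(net) for net in canary_networks):
        return candidate
    # Gradual transition: hash the client into a stable bucket 0..99 so the
    # same client keeps getting the same version while the percentage is fixed.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return candidate if bucket < candidate_percent else stable
```

Raising `candidate_percent` from 0 to 100 moves traffic to the new version
step by step, and setting it back to 0 acts as an immediate rollback for
everyone outside the canary networks.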

Note that, engineering-wise, one may consider DSCV a building block for
other "self-service" steps. It allows the system to identify which
configuration is deployed on which server, and allows the servers to
identify configuration changes at DS granularity. Therefore, it can help
to decouple the deployment of individual delivery services, as well as reduce
the load derived from the cache update process.
We would greatly appreciate community input on the subject.

Many thanks,
Nir