Indeed. This is probably just one of several issues to be dealt with on the
way to instantly applying DS configuration changes. Further planning is
needed, and baby steps towards this goal (as well as the other
"self-service" building blocks) should be presented.
I mentioned it only as additional motivation for avoiding further coupling
of the different delivery services' configurations.

Back to the original discussion: can the proposed versioned definition file
be broken into separate versioned files? A file per DS may be just the
first example of this need. Other "orthogonal" sections could be held
separately as well.
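A rough sketch of what that split might look like (all type and field names here are hypothetical, not a proposed format): a slim per-CDN index that only maps servers to per-DS config versions, while each DS's configuration lives in its own independently versioned file, so a cache only re-fetches the DS configs whose pinned version changed:

```go
package main

import "fmt"

// DSRef pins one delivery service to a config version for a given server.
// Hypothetical names for illustration only.
type DSRef struct {
	DS      string // delivery service XML ID
	Version int    // config version this server should run
}

// CDNIndex is the slim per-CDN "definition" index: which server has which
// DS, at which version. The DS configs themselves are held elsewhere.
type CDNIndex struct {
	CDN     string
	Servers map[string][]DSRef // hostname -> assigned DS versions
}

// staleRefs returns the DS configs a cache must re-fetch, given the
// versions it has currently applied.
func staleRefs(idx CDNIndex, host string, applied map[string]int) []DSRef {
	var out []DSRef
	for _, ref := range idx.Servers[host] {
		if applied[ref.DS] != ref.Version {
			out = append(out, ref)
		}
	}
	return out
}

func main() {
	idx := CDNIndex{CDN: "demo", Servers: map[string][]DSRef{
		"edge-1": {{DS: "ds-a", Version: 7}, {DS: "ds-b", Version: 3}},
	}}
	// edge-1 already has ds-a v7; only ds-b changed, so only ds-b is pulled.
	fmt.Println(staleRefs(idx, "edge-1", map[string]int{"ds-a": 7, "ds-b": 2}))
}
```

The point of the sketch: a change to one DS bumps one entry in the index, and only the caches assigned that DS pull one small file.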

Thanks,
Nir

On Apr 14, 2017 23:38, "David Neuman" <[email protected]> wrote:

> The discussion around delivery service configs should probably be its own
> thread; however, I am going to contribute to the hijacking of this thread
> anyway.
>
> We need to make sure that we keep the Traffic Router in mind when
> discussing delivery service changes that get applied "instantly" and
> individually.  There are certain attributes of a delivery service that
> affect the Traffic Router, and we need to make sure that we don't cause an
> issue by pushing a config to a cache before the Traffic Router has it, or
> vice versa.
>
> On Fri, Apr 14, 2017 at 8:07 AM, Amir Yeshurun <[email protected]> wrote:
>
> > It seems that with Nir's approach there is no problem enforcing a size
> > limit on historical data.
> >
> > On Fri, Apr 14, 2017 at 4:07 PM Eric Friedrich (efriedri) <
> > [email protected]> wrote:
> >
> > > I think this sounds good Nir.
> > >
> > > It's not so much the size that is the main concern. Rather, people
> > > tend to have strong reactions to “it's permanent, it will be there
> > > forever”. As long as we give some way to delete, preferably with a
> > > batch mode, we should be all set.
> > >
> > > —Eric
> > >
> > > > On Apr 13, 2017, at 3:08 PM, Nir Sopher <[email protected]> wrote:
> > > >
> > > > Hi Eric,
> > > >
> > > > I thought to start by saving, for each configuration, the range of
> > > > dates it was the "head" revision and the range of dates it was
> > > > deployed. This will allow the operator to remove old versions via a
> > > > designated script, using criteria like "configuration age", "DS
> > > > history length" or "was it deployed". For example: "keep all deployed
> > > > revisions and up to 100 non-deployed revisions".
> > > > I hadn't thought of the option to mark configuration versions as
> > > > "never delete", but it can surely be added.
> > > >
> > > > I did not intend to create something more sophisticated, and I
> > > > believe the mentioned script will be used only in the rare cases
> > > > where something is trashing the DB, as the math I did led me to
> > > > believe it is a non-issue:
> > > > Judging from the kabletown example, a delivery-service configuration
> > > > is less than 500 B. Let's say the average is *1 KB* to support future
> > > > growth. Let's also say we support *10K* DSs (far more than any TC
> > > > deployment I'm aware of has) and we keep *1K* revisions per DS.
> > > > In such a case versioning will use about 10 GB, which I believe is
> > > > not an issue for Postgres to hold (though I'm not a Postgres expert).
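For what it's worth, the back-of-the-envelope math above can be checked mechanically, with the same numbers the email uses (10K DSs, 1K retained revisions, ~1 KB per revision):

```go
package main

import "fmt"

// storageBytes estimates total versioning storage: number of delivery
// services × retained revisions per DS × average bytes per revision.
func storageBytes(numDS, revisionsPerDS, avgConfigBytes int64) int64 {
	return numDS * revisionsPerDS * avgConfigBytes
}

func main() {
	total := storageBytes(10_000, 1_000, 1024) // 10K DSs, 1K revs, 1 KiB each
	fmt.Printf("%.1f GB\n", float64(total)/1e9) // 10.2 GB
}
```

So the ~10 GB figure in the email holds up, and even an order of magnitude more would still be unremarkable for Postgres.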
> > > >
> > > > Nir
> > > >
> > > >
> > > > On Thu, Apr 13, 2017 at 3:53 PM, Eric Friedrich (efriedri) <
> > > > [email protected]> wrote:
> > > >
> > > >> Hey Nir-
> > > >>  If we keep all DS versions in the DB, are there any concerns about
> > > >> the amount of data retained? I know Delivery Services don’t change
> > > >> very often, but over time do we really need to keep the last 1000
> > > >> revisions of a delivery service?
> > > >>
> > > >> It's more of an implementation detail, but I think it would be
> > > >> useful to give some control over version retention policies (e.g.
> > > >> keep the last n based on quantity or dates, mark some as “never
> > > >> delete”).
> > > >>
> > > >> More inline
> > > >>> On Apr 12, 2017, at 12:53 AM, Nir Sopher <[email protected]> wrote:
> > > >>>
> > > >>> Thanks Derek for the clarification.
> > > >>>
> > > >>> So the definition file is a global file for the CDN.
> > > >>> Does it contain the information of which server has which DS?
> > > >>> Does it hold all of the CDN's DS configurations together?
> > > >>> Will all servers in the CDN download the entire file for every
> > > >>> single DS change?
> > > >>>
> > > >>> What I'm practically asking is, if it is not already your
> > > >>> intention: "Can the definition file hold only the information of
> > > >>> which server holds which DS (and the configuration version, when we
> > > >>> add it), with the DS configuration held and pulled separately at a
> > > >>> per-DS granularity?"
> > > >>>
> > > >>> When discussing "self-service", we would like to decouple the
> > > >>> operations of the different users / content providers. Ultimately,
> > > >>> when a DS is changed, the change should be deployed to the CDN
> > > >>> immediately - with no dependency on other DSs, and possibly with no
> > > >>> "buffering" by an operator deploying a batch of DS changes
> > > >>> together. This improves the user's experience and independence when
> > > >>> working on the CDN.
> > > >>> Following the change you are suggesting, will the DS configuration
> > > >>> deployment coupling get tighter? We prefer not to be required to
> > > >>> "finish your work, and don't start additional work before the
> > > >>> queued run has completed".
> > > >> EF> Agree. The fewer steps our users have to take, the happier they
> > > >> are. If it were a common workflow to batch a bunch of DS changes and
> > > >> then roll them out together, I would probably be a stronger advocate
> > > >> for keeping the queue-update/snapshot-CRConfig steps around. From
> > > >> our discussion, that doesn’t seem to be used often. Should we
> > > >> consider deprecating those stages and (excepting the ORT polling
> > > >> interval, of course) applying config changes immediately when the DS
> > > >> is changed?
> > > >>
> > > >>>
> > > >>> Another requirement is to be able to roll back changes at the DS
> > > >>> level and not only at the CDN level, as it is not desirable to roll
> > > >>> back user "A"'s changes because of user "B"'s errors. If I
> > > >>> understand correctly, the definition file does not support that.
> > > >> EF> I think the definition file can support this - only the
> > > >> rolled-back DS would change inside that file. No other users would
> > > >> be affected, because their DS configs would not change.
> > > >>
> > > >>>
> > > >>> Last, using the definition file, must all servers in a CDN work
> > > >>> with the same set of DSs? One of the reasons we are considering "DS
> > > >>> versioning" is to allow deploying a change in a DS to only a subset
> > > >>> of the servers, for canary testing.
> > > >> EF> When I think of canary testing today, my mind first goes to the
> > > >> Steering DS. With those steering delivery services, do we still need
> > > >> the ability to set per-cache DS versions?
> > > >>
> > > >> —Eric
> > > >>
> > > >>
> > > >>>
> > > >>> Thanks,
> > > >>> Nir
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Wed, Apr 12, 2017 at 3:00 AM, Dewayne Richardson <
> > > >>> [email protected]> wrote:
> > > >>>
> > > >>>> +1 I was just about to formulate that response.  The "dev" list
> > > >>>> is our discussion forum.
> > > >>>>
> > > >>>> On Tue, Apr 11, 2017 at 9:35 AM, Dave Neuman <[email protected]>
> > > wrote:
> > > >>>>
> > > >>>>> @Ryan, I think it's better to have conversations on the dev list
> > > >>>>> than a wiki page...
> > > >>>>>
> > > >>>>> On Tue, Apr 11, 2017 at 9:01 AM, Durfey, Ryan <
> > > >>>>> [email protected]> wrote:
> > > >>>>>
> > > >>>>>> Started a new wiki page to discuss this here:
> > > >>>>>> https://cwiki.apache.org/confluence/display/TC/Configuration+Management
> > > >>>>>>
> > > >>>>>> I will do my best to summarize the discussion below later today.
> > > >>>>>>
> > > >>>>>> Ryan M | 303-524-5099
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> -----Original Message-----
> > > >>>>>> From: Eric Friedrich (efriedri) [mailto:[email protected]]
> > > >>>>>> Sent: Tuesday, April 11, 2017 8:55 AM
> > > >>>>>> To: [email protected]
> > > >>>>>> Subject: Re: Proposal for CDN definition file based
> > > >>>>>> configuration management
> > > >>>>>>
> > > >>>>>> A few questions/thoughts, apologies for not in-lining:
> > > >>>>>>
> > > >>>>>> 1) If we move away from individually queued updates, we give up
> > > >>>>>> the ability to make changes and then selectively deploy them.
> > > >>>>>> How often do TC operations teams make config changes but not
> > > >>>>>> immediately queue updates? (I personally think we currently have
> > > >>>>>> a bit of a tricky situation where queuing updates much later can
> > > >>>>>> push an unknowingly large config change down to a cache - i.e.
> > > >>>>>> many new DSs added/removed since updates were last queued, maybe
> > > >>>>>> months earlier.) I wouldn't be sad to see queue updates go away,
> > > >>>>>> but I don't want to cause hardship for operators using that
> > > >>>>>> feature.
> > > >>>>>>
> > > >>>>>> 2) If we move away from individually queued updates, how does
> > > >>>>>> that affect the implicit "config state machine"? Specifically,
> > > >>>>>> how will edges know when their parents have been configured and
> > > >>>>>> are ready for service? Today we don't configure an edge cache
> > > >>>>>> with a new DS unless the mid is ready to handle traffic as well.
> > > >>>>>>
> > > >>>>>> 3) If we move away from individually queued updates, how do we
> > > >>>>>> do things like unassign a delivery service from a cache? Today
> > > >>>>>> we have to snapshot the CRConfig first, to stop redirects to the
> > > >>>>>> cache before we queue the update. If updates are applied
> > > >>>>>> immediately and the snapshot is still separate, how do we get TR
> > > >>>>>> to stop sending traffic to a cache that no longer has the remap
> > > >>>>>> rule?
> > > >>>>>>
> > > >>>>>> 4) Also along the lines of the config state machine: we never
> > > >>>>>> really closed on whether we would make any changes to the queue
> > > >>>>>> update/snapshot CRConfig flow. If we are looking at redoing how
> > > >>>>>> we generate config files, it would be great to have consensus on
> > > >>>>>> an approach (if not an implementation) to remove the need to
> > > >>>>>> sequence queue updates and snapshot CRConfig. I think the
> > > >>>>>> requirement here would be to have Traffic Control figure out on
> > > >>>>>> its own when to activate/deactivate routing to a cache from TR.
> > > >>>>>>
> > > >>>>>> 5) I like the suggestion of cache-based config file generation.
> > > >>>>>> - Caches only retrieve relevant information, so scaling
> > > >>>>>> proportionally to the number of caches/DSs in the CDN is much
> > > >>>>>> better.
> > > >>>>>> - We could modify TR/TM to use the same approach, rather than
> > > >>>>>> snapshotting a CRConfig.
> > > >>>>>> - Cache/TR/TM-based config could play a greater role in the
> > > >>>>>> config state machine, rather than having Traffic Ops build
> > > >>>>>> static configuration ahead of time.
> > > >>>>>>
> > > >>>>>> Downsides:
> > > >>>>>> - Versioning is still possible, but more work than maintaining
> > > >>>>>> snapshots of a config file.
> > > >>>>>> - We have to be very careful with API changes; any breakage now
> > > >>>>>> impacts cache updates.
> > > >>>>>>
> > > >>>>>> -Eric
> > > >>>>>>
> > > >>>>>>> On Apr 10, 2017, at 9:45 PM, Gelinas, Derek <
> > > >>>>>>> [email protected]> wrote:
> > > >>>>>>>
> > > >>>>>>> Thanks Rob. To your point about scalability: I think that this
> > > >>>>>>> is more scalable than the current CRConfig implementation, due
> > > >>>>>>> to the caching. However, that is a very valid point and one
> > > >>>>>>> that has been considered. I've started looking into the problem
> > > >>>>>>> from that angle and hope to have some more solid data soon.  I
> > > >>>>>>> still believe that this is ultimately more scalable than the
> > > >>>>>>> current config implementation, even with the scope caching, but
> > > >>>>>>> the proof will be in the data.
> > > >>>>>>>
> > > >>>>>>> Derek
> > > >>>>>>>
> > > >>>>>>>> On Apr 10, 2017, at 9:23 PM, Robert Butts <
> > > >>>>>>>> [email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>> I'd propose:
> > > >>>>>>>> * Instead of storing the JSON as a blob, use
> > > >>>>>>>> https://www.postgresql.org/docs/9.2/static/datatype-json.html
> > > >>>>>>>> * Instead of a version-then-file request, use a "latest"
> > > >>>>>>>> endpoint with `If-Modified-Since`
> > > >>>>>>>> (https://tools.ietf.org/html/rfc7232#section-3.3). We can also
> > > >>>>>>>> serve each version at its own endpoint, but `If-Modified-Since`
> > > >>>>>>>> lets us determine whether there's a new snapshot and get it in
> > > >>>>>>>> a single request, both efficiently and using a standard. (We
> > > >>>>>>>> should do the same for the CRConfig.)
> > > >>>>>>>>
> > > >>>>>>>> Also for cache-side config generation, consider
> > > >>>>>>>> https://github.com/apache/incubator-trafficcontrol/pull/151 .
> > > >>>>>>>> It's a prototype and needs work to bring it to production, but
> > > >>>>>>>> the basic functionality is there. Go is safer and faster to
> > > >>>>>>>> develop in than Perl, and this is already encapsulated in a
> > > >>>>>>>> library, with both CLI and HTTP microservice examples. I'm
> > > >>>>>>>> certainly willing to help bring it to production.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> "a single definition file for each CDN which will contain all
> > > >>>>>>>> the information required for any server within that CDN to
> > > >>>>>>>> generate its own configs"
> > > >>>>>>>>
> > > >>>>>>>> Also, long-term, that doesn't scale, nor does the CRConfig. As
> > > >>>>>>>> Traffic Control is deployed on larger and larger CDNs, the
> > > >>>>>>>> CRConfig grows uncontrollably. It's already 5-7 MB for us,
> > > >>>>>>>> which takes an approaching-unreasonable amount of time for
> > > >>>>>>>> Traffic Monitor and Traffic Router to fetch. This isn't an
> > > >>>>>>>> immediate concern, but long-term we need to develop a scalable
> > > >>>>>>>> solution, something that says "only give me the data modified
> > > >>>>>>>> since this timestamp".
> > > >>>>>>>>
> > > >>>>>>>> Again, this isn't an immediate crisis. I only mention it now
> > > >>>>>>>> because, if a scalable solution is about the same amount of
> > > >>>>>>>> work, now sounds like a good time. If it's significantly more
> > > >>>>>>>> work, no worries.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> But otherwise, +1. We've long needed to Separate the Concerns
> > > >>>>>>>> of Traffic Ops and the cache application.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Apr 10, 2017 at 5:05 PM, Gelinas, Derek
> > > >>>>>>>> <[email protected]>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> I would like to propose a new method for ATS config file
> > > >>>>>>>>> generation, in which a single definition file for each CDN
> > > >>>>>>>>> will contain all the information required for any server
> > > >>>>>>>>> within that CDN to generate its own configs, rather than
> > > >>>>>>>>> requesting them from Traffic Ops.  This would be a
> > > >>>>>>>>> version-controlled JSON file that, when generated, would be
> > > >>>>>>>>> stored in a new table in the Traffic Ops database as a blob
> > > >>>>>>>>> type.  This will satisfy high-availability requirements and
> > > >>>>>>>>> allow several versions of the configuration to be retained
> > > >>>>>>>>> for rollback, as well as "freezing" the config at that moment
> > > >>>>>>>>> in time.  Combined with the cache support coming in 2.1, this
> > > >>>>>>>>> file would only need to be generated once per Traffic Ops
> > > >>>>>>>>> server instance.  Instead of queueing servers to update their
> > > >>>>>>>>> configurations, the configuration would be snapshotted,
> > > >>>>>>>>> similar to the CRConfig file, and downloaded by each cache
> > > >>>>>>>>> according to its set interval checks - rather than performing
> > > >>>>>>>>> a syncds run and checking whether the server has been queued
> > > >>>>>>>>> for an update, the version number would simply be checked and
> > > >>>>>>>>> compared against the currently active version on the cache
> > > >>>>>>>>> itself.  Should a difference be found, the server would
> > > >>>>>>>>> request the definition file and begin generating
> > > >>>>>>>>> configuration files for itself, using the data in the
> > > >>>>>>>>> definition file.
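The interval check described above might be sketched like this (function names such as `checkOnce`, `fetch`, and `generate` are illustrative, not actual ORT or Traffic Ops code): the cache compares the CDN's latest definition version against the version it last applied, and regenerates its configs only when they differ.

```go
package main

import "fmt"

// Definition stands in for the versioned CDN definition file.
type Definition struct {
	Version int
	Data    string // the JSON blob, elided here
}

// checkOnce runs one interval check, replacing the syncds/queue-update
// flow: latest reports the newest version, fetch retrieves the definition,
// generate rebuilds config files from it. Returns the version now applied.
func checkOnce(applied int, latest func() int,
	fetch func() Definition, generate func(Definition)) int {
	if latest() == applied {
		return applied // nothing changed this interval
	}
	def := fetch()
	generate(def)
	return def.Version
}

func main() {
	applied := 3
	applied = checkOnce(applied,
		func() int { return 4 },
		func() Definition { return Definition{Version: 4, Data: "{...}"} },
		func(d Definition) { fmt.Println("regenerating configs from version", d.Version) },
	)
	fmt.Println("now at version", applied) // now at version 4
}
```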
> > > >>>>>>>>>
> > > >>>>>>>>> I would like feedback from the community regarding this
> > proposal,
> > > >>>>>>>>> and any suggestions or comments you may have.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks!
> > > >>>>>>>>> Derek
> > > >>>>>>>>>
> > > >>>>>>>>> Derek Gelinas
> > > >>>>>>>>> IPCDN Engineering
> > > >>>>>>>>> [email protected]
> > > >>>>>>>>> 603.812.5379
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>
