I'm down with this! But I'm worried about database performance, for one, and 
table size. I think we need to have a utility for removing older entries if we 
are to go this route.

On Jul 27, 2017, at 10:12 AM, Robert Butts <[email protected]> wrote:

Can I propose an adjustment?

If we add a timestamp to every table, we can generate the JSON on-the-fly.
Then, the snapshot becomes a timestamp field, `snapshot_time`, and all the
data `select` queries add `where timestamp <= snapshot_time limit 1`.
Instead of updating rows, we only ever insert new rows with new timestamps.
This gives us snapshots back to eternity, and if a snapshot ever breaks,
rolling back is as simple as updating the `snapshot_time`. Our data is so
tiny, space is almost certainly not a problem, but if it is, truncating is
as easy as `delete where count > X and timestamp < Y`.
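
To make that concrete, here is a rough sketch of what the time-bounded read and
the truncation might look like in Go with database/sql. The table and column
names are my own placeholders, not the actual schema, and I've added an ORDER BY
so the query picks the newest row at or before the snapshot time:

```go
// Rough sketch only, not the real schema: assumes a Postgres-backed
// `deliveryservice` table with a `last_updated` timestamp column.
package snapshot

import (
	"database/sql"
	"time"
)

// DS holds the subset of delivery service fields read back here.
type DS struct {
	ID          int
	XMLID       string
	LastUpdated time.Time
}

// dsAsOf reads the newest delivery service row at or before the snapshot
// time; ORDER BY ... DESC LIMIT 1 picks the latest revision in the snapshot.
func dsAsOf(db *sql.DB, id int, snapshotTime time.Time) (DS, error) {
	var ds DS
	err := db.QueryRow(`
		SELECT id, xml_id, last_updated
		FROM deliveryservice
		WHERE id = $1 AND last_updated <= $2
		ORDER BY last_updated DESC
		LIMIT 1`, id, snapshotTime).Scan(&ds.ID, &ds.XMLID, &ds.LastUpdated)
	return ds, err
}

// pruneOld truncates history older than `before` (the "keep at least N
// revisions per DS" refinement is omitted here for brevity).
func pruneOld(db *sql.DB, before time.Time) error {
	_, err := db.Exec(`DELETE FROM deliveryservice WHERE last_updated < $1`, before)
	return err
}
```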

That gives us all the benefits of your plan, plus the benefits of
relational data, type safety, more powerful querying, etc. And it shouldn't
be much more work to implement: add timestamp columns, snapshotting updates
the `snapshot_time` field, and getting the config simply runs the queries the
snapshot would otherwise run to create the JSON. If generation performance
is an issue (it may be in Perl, probably not in Go), we can always cache the
latest snapshot in memory, and only regenerate it when the `snapshot_time`
changes.
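
A minimal sketch of that in-memory cache (Go; the type and field names are just
illustrative): keep the generated JSON alongside the `snapshot_time` it was
built from, and rebuild only when that time moves:

```go
// Minimal sketch: cache the generated snapshot JSON, keyed by snapshot_time.
package snapshot

import (
	"sync"
	"time"
)

type snapshotCache struct {
	mu   sync.Mutex
	asOf time.Time // the snapshot_time the cached JSON was generated from
	json []byte
}

// Get returns the cached JSON if snapshot_time hasn't changed; otherwise it
// calls generate to rebuild the JSON and caches the result.
func (c *snapshotCache) Get(snapshotTime time.Time, generate func() ([]byte, error)) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.json != nil && c.asOf.Equal(snapshotTime) {
		return c.json, nil
	}
	j, err := generate()
	if err != nil {
		return nil, err
	}
	c.asOf, c.json = snapshotTime, j
	return j, nil
}
```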

On Wed, Jul 26, 2017 at 9:08 AM, Gelinas, Derek
<[email protected]> wrote:

That’s not a terrible idea.  Fewer changes to the code that way for sure,
really just the DS interface page.

On Jul 26, 2017, at 10:30 AM, Nir Sopher <[email protected]> wrote:

Hi Derek,

As discussed at the summit, we also see significant value in:

 1. DS Deployment Granularity - using individual per-DS config files.
 2. Delivery Service Configuration Versioning (DSCV) - separating the
 "provisioning" from the "deployment".
 3. Improving the roll-out procedure by combining capabilities #1 and #2.

We are on the same page on these needs :)

However, as I see it, #1 and #2 are two separate features, each with
different requirements.
For example, for DSCV, I would suggest managing the versions as standard
rows in the Delivery-Service table, side by side with the "hot" DS
configuration.
This would allow the existing code (with minor adjustments) to work properly
on these rows.
Furthermore, it would also allow you to simply "restore" the DS "hot"
configuration to a specified revision.
It is also more resilient to DS table schema updates.
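
To illustrate (Go with embedded SQL; the `config`, `revision`, and `is_hot`
columns are purely hypothetical placeholders of mine), restoring the hot
configuration from a saved revision could then be a single statement:

```go
// Hypothetical sketch: revisions stored as ordinary deliveryservice rows,
// distinguished by (xml_id, revision, is_hot). Column names are illustrative.
package dscv

import "database/sql"

// restore copies a saved revision's configuration over the "hot" row, so
// existing code that only ever looks at the hot row keeps working unchanged.
func restore(db *sql.DB, xmlID string, revision int) error {
	_, err := db.Exec(`
		UPDATE deliveryservice AS hot
		SET config = rev.config
		FROM deliveryservice AS rev
		WHERE hot.xml_id = $1
		  AND hot.is_hot
		  AND rev.xml_id = hot.xml_id
		  AND rev.revision = $2`, xmlID, revision)
	return err
}
```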

I'll soon share, on another thread, a link to a "DSCV functional spec" I
have been working on. It extends the presentation we gave at the summit:
https://cwiki.apache.org/confluence/download/attachments/69407844/TC%20Summit%20-%20Spring%202017%20-%20Self-Service.pptx?version=1&modificationDate=1495451091000&api=v2
I would appreciate any input on this spec.

Nir

On Tue, Jul 25, 2017 at 10:13 PM, Gelinas, Derek
<[email protected]> wrote:

At the summit, there was some talk about changing the manner in which we
generate configuration files.  The early stages of this idea had me
creating large CDN definition files, but in the course of our discussion it
became clear that we would be better served by creating delivery service
configuration files instead.  This would shift us from the server-generated
implementation we have now to generating the configuration files locally on
the caches.  The data for this would come from a new API that would provide
the delivery service definitions in JSON format.

What I’m envisioning is creating delivery service “snapshots” which are
saved to the database as JSON objects.  These snapshots would have the full
range of information specific to the delivery service, including the new DS
profiles.  The database would store up to five of these objects per DS, and
one DS object would be set to “active” through the UI or API.
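
Something roughly like this is what I’d picture for a stored snapshot (Go;
the field names are mine, nothing here is a settled schema):

```go
// Hypothetical shape of a stored DS snapshot; names are illustrative only.
package snapshot

import "time"

type DSSnapshot struct {
	XMLID   string    // the delivery service this snapshot belongs to
	Version int       // e.g. 1..5, with the oldest rotated out on save
	Active  bool      // exactly one snapshot per DS is marked active
	Created time.Time // when the snapshot was saved
	JSON    []byte    // the full DS definition, stored as a JSON blob
}
```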

In this way, we could create multiple versions of a delivery service, or
safely modify the definition currently “live” (but not necessarily active)
in the database without changing the configuration in the field.
Configuration would only change when the DS was saved and that saved
version was then set active.  In the reverse direction, existing saved
delivery services could be restored to the live DB for modification.

By divorcing the “live” DB from the active configuration, we prevent the
possibility of accidental edits affecting the field, or of edits-in-progress
being sent out prematurely when one person is working on a delivery service
and another is queueing updates.

Once set, it would be this active delivery service definition that would be
provided to the rest of Traffic Ops for any delivery service operations.
For config file generation, new API endpoints would be created that do the
following (a rough sketch of these endpoints appears after the list):

- List the delivery services assigned to the specific server and the active
version of each.
- Provide the JSON object from the database when requested - I’m thinking
that the endpoint would send the currently active version by default, or a
specific version if specified.
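
Here is that sketch (Go; the paths, query parameters, and handler names are
illustrative only, not a proposed API):

```go
// Hypothetical sketch; paths, parameters, and handler names are illustrative.
package main

import "net/http"

// listServerSnapshots lists the delivery services assigned to a server and
// the active snapshot version of each (the server is identified by ?host=).
func listServerSnapshots(w http.ResponseWriter, r *http.Request) {
	host := r.URL.Query().Get("host")
	_ = host // look up this server's DS assignments and active versions
	w.Header().Set("Content-Type", "application/json")
	w.Write([]byte(`[]`)) // placeholder: list of {ds, active_version} pairs
}

// getSnapshot serves one DS snapshot JSON object: the active version by
// default, or a specific one when ?version= is given.
func getSnapshot(w http.ResponseWriter, r *http.Request) {
	ds, version := r.URL.Query().Get("ds"), r.URL.Query().Get("version")
	_, _ = ds, version // an empty version means "serve the active snapshot"
	w.Header().Set("Content-Type", "application/json")
	w.Write([]byte(`{}`)) // placeholder: the stored snapshot blob goes here
}

func main() {
	http.HandleFunc("/api/ds_snapshots", listServerSnapshots)
	http.HandleFunc("/api/ds_snapshot", getSnapshot)
	http.ListenAndServe(":8080", nil)
}
```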

These definitions would be absurdly cacheable - we would not need to worry
about sending stale data, because each new version would have a completely
different name - and so they could be generated once and sent to thousands
of caches with greatly reduced load on Traffic Ops.  The load would consist
of the initial creation of the JSON object and the minimal serving of that
object, so this would still greatly reduce the load on the Traffic Ops
host(s) even without the use of caching.  Because of this, the new cache
management service could check with Traffic Ops multiple times per minute
for updates.  Once a delivery service was changed, the new JSON would be
downloaded and the configs generated on the cache itself.

Other benefits of using a cache manager service rather than the ORT
script include:

- Decreased load from logins - once the cache has logged in, it could reuse
the cookie from the previous session and only re-login when that cookie has
expired (see the sketch after this list).  We could also explore the use of
certificates or keys instead and eliminate logins altogether.
- Multiple checks per minute rather than every X minutes - faster checks,
more agile CDN.
- The service could provide regular status updates to Traffic Ops, giving us
the ability to keep an eye out for drastic shifts in I/O, unwanted
behavior, problems with the ATS service, etc.  This leads to building a
Traffic Ops that can adapt itself on the fly to changing conditions and
adjust accordingly.
- Queue commands to run on the host from Traffic Ops.  ATS restarts,
system reboots, all manner of things could be triggered and scheduled right
from Traffic Ops.
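
For the first two points above, a minimal sketch of what the cache-side loop
might look like (Go; the endpoint, interval, and login mechanism are
assumptions of mine, not a finished design):

```go
// Hypothetical sketch of the cache manager's poll loop; endpoint, interval,
// and login details are assumptions, not a finished design.
package cachemgr

import (
	"net/http"
	"net/http/cookiejar"
	"time"
)

// poll checks Traffic Ops several times per minute, reusing the session
// cookie held in the jar and logging in again only when it has expired.
func poll(toURL string, login func(*http.Client) error) error {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return err
	}
	client := &http.Client{Jar: jar}
	if err := login(client); err != nil { // initial login stores the cookie
		return err
	}
	for range time.Tick(20 * time.Second) { // several checks per minute
		resp, err := client.Get(toURL + "/api/ds_snapshots")
		if err != nil {
			continue // transient network error; try again next tick
		}
		if resp.StatusCode == http.StatusUnauthorized {
			resp.Body.Close()
			if err := login(client); err != nil { // cookie expired: re-login
				return err
			}
			continue
		}
		// ...compare the active snapshot versions against what is on disk
		// and regenerate the ATS configs locally if anything changed...
		resp.Body.Close()
	}
	return nil
}
```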

Thoughts?

Derek

