Re: Two different Delivery Services using the same host regexp: A suggested Traffic Monitor code change

Robert Butts Tue, 02 Jan 2018 08:35:32 -0800

>I would like to avoid adding additional configurations to a system, in
cases where the system can auto-detect the proper configuration.


But the system can't, in this case. Detecting a single field as "must be
the DS Name" makes single-domain hosts impossible (such as `localhost`).
Single-domain names are perfectly valid within an internal network, we
shouldn't make that configuration impossible.

I agree, minimizing configuration is ideal, but in this case, it isn't
possible without making certain things mysteriously impossible; and that's
going to be a very confusing bug for someone to track down, if they need it.

Moreover, that configuration does exist, whether we try to magically detect
it or not. It is a property of the cache. And that's what Traffic Ops is
for -- configuration data about the cache should be stored in TO. Not
having it there is going to make Operations' job more difficult, because
now they can't easily see what type of Stats a cache is serving. For
example, what if there's a bug and the Monitor is inferring the wrong type?
Now it's a nightmare for Ops and Dev to track down.

I support what you want to accomplish, supporting multiple DSes with the
same regex; and it's expensive to parse the domain anyway, it's ideal to
have the DS name in the stat. But unfortunately, it isn't possible to
automatically infer without making certain configurations impossible, and
failing to store the Stats type configuration (which exists, whether we
store the data or not) in TO is going to lead to pain.

>I do think it would be nice to have Traffic Monitor support some sort of
canonical stats interface

I have mixed feelings. On one hand, it'd be simpler, if there were "One
Right Way" with all the things we want/need. But on the other hand, that
would make it painful to use Caches with Traffic Control which don't have
extensible stats. It'd mean users would have to set up an intermediary
service (for every single Cache) to transform the Stats, for Cache
applications which aren't sufficiently configurable.

Maybe it'd be better make the Monitor able to easily be extended, similar
to adding an endpoint in Golang TO, simply by adding a function in a new
file, and adding that func to a "Server Cache Type" map? That solution
seems fairly easy, and much easier for users to add custom stats than
having to write and deploy an intermediary service. It will also be safe,
if we make the TO database field non-nullable and default to the current
type.


Also be aware @orens that raw stats are part of the /publish/CacheStats API
in Traffic Monitor. Changing the Stats format will fundamentally change
that API endpoint (unless you also write the Monitor code to rebuild the
domain, as inserted into the `Result.Astats.Ats` and `ResultStatHistory`
objects). I'm not sure if anything currently uses that (Router? Traffic
Stats?), but anything that does will break. Just something to be aware of.


On Sun, Dec 31, 2017 at 1:35 AM, Oren Shemesh <[email protected]> wrote:

> Thanks Ryan, Robert and Dave for your responses.
>
> Dave, can you send a pointer to any documentation/text available regarding
> 'Stats over HTTP' ?
>
> I believe I can consider the overall response to the change I am proposing
> as positive :-)
>
> Robert, from my experience I would like to avoid adding additional
> configurations to a system, in cases where the system can auto-detect the
> proper configuration.
> Any configuration added is another chance for someone making a mistake.
>
> With the proposal I have made, so configuration is needed: TM auto-detects
> the type of astats information (DS name / entire DS domain) based on a
> single 'if' statement which queries a single scalar value.
> This also means that, assuming my proposed code change is valid, there can
> be no performance issues.
>
> Also, I do not know what 'Stats over HTTP' is, but I hope it keeps the job
> of classifying Client HTTP transactions to a DS in the cache (ATS in our
> case), avoiding the need to re-classify it in TM.
>
> (Note: In my mind, the fact that TM currently does this job, based on a
> *single* hostName regex per DS, makes the entire feature of having multiple
> regexps per DS very limited in use. Which is a shame, based on the effort
> invested in the DB model, TO API, and TR code).
>
> Thanks a lot guys !
>
>
> On Sat, Dec 30, 2017 at 7:33 PM, Dave Neuman <[email protected]> wrote:
>
> > Hi Oren,
> > Sorry for the slow response, I think what you are proposing sounds
> > reasonable, so +1.  I would, however, like to make sure that we aren't
> > affecting the performance of TM too much in terms of both CPU usage (with
> > 1000+ caches) and time to poll all caches/aggregate statistics.
> >
> > Additionally, I don't think we should be adding new functionality to
> astats
> > at all. The plan is to move towards the Stats Over HTTP plugin in the
> > future, so we should focus on making Traffic Monitor do what we need it
> to
> > do now, and add new functionality to Stats over HTTP if it doesn't
> > currently exist.  I do think it would be nice to have Traffic Monitor
> > support some sort of canonical stats interface instead of
> > astats/stats_over_http, and I am sure there are several other features
> > would be nice to have as well, but I don't think we should be muddying
> the
> > waters with that right now.
> >
> >
> > On Wed, Dec 27, 2017 at 12:24 PM, Robert Butts <[email protected]
> >
> > wrote:
> >
> > > I'd be +1 if we extend Astats/stats_over_http to have the DS name.
> > >
> > > It's easy to extend the Monitor to get the name from a field, if it's
> > > available. And it should be faster than the current method. But, Astats
> > > doesn't currently support it, and in fact, I don't believe ATS has a
> good
> > > way to associate an identifier (DS name) with a Remap Rule. Can anyone
> > > confirm?
> > >
> > > That said, if we're extending `stats_over_http`, I'd really like to see
> > it
> > > extended to serve CSV (ideally with the DS name in a field, as above),
> > when
> > > the request has an `Accept: text/csv` header. Astats only has a single
> > > level of JSON, and they're all numbers. So there's no reason to use
> JSON
> > > over CSV, and CSV is drastically simpler and faster to parse.
> Especially
> > if
> > > we break each `.` component of the stat into its own CSV field. The
> > Monitor
> > > is CPU-bound, so there's a good chance we'd see a noticeable
> performance
> > > improvement, which translates to a tangible poll time improvement.
> > >
> > > Also @orens if we can't reasonably extend ATS Astats, we can still
> > support
> > > your use case for your own cache, by having a separate code path in the
> > > Monitor for your "stats", and adding a "stats type" number or enum to
> the
> > > Server table (or ideally a new table for "cache servers", since it only
> > > applies to caches). So, the Monitor would look at that field (possibly
> > > adding it to `monitoring.json` or `CRConfig.json`), to determine how to
> > > parse a given server's stats.
> > >
> > > We've discussed a "stat version" field anyway, for different Astats
> > > versions. For example, we're looking into combining the System stats
> into
> > > the Ats key, and we'd also like to be able to experiment with different
> > > formats (CSV, Protobuf, etc). I think it's a good feature for the CDN
> in
> > > general, to make stat polling more flexible.
> > >
> > >
> > > On Wed, Dec 27, 2017 at 12:21 PM, Durfey, Ryan <
> [email protected]>
> > > wrote:
> > >
> > > > Oren,
> > > >
> > > > This is a great conversation to have. I can’t speak to the best
> > solution,
> > > > but allowing our servers to have a single SSL cert that could be
> shared
> > > > across delivery services would save us a lot of time, money, and
> > > > operational resources. From a product perspective this would be
> highly
> > > > beneficial to the platform especially since all services are
> eventually
> > > > moving to SSL.
> > > >
> > > > Key Benefits:
> > > >
> > > >   1.  Instant Service Builds: All new services could use SSL
> > immediately,
> > > > no 2-3 business day delays in ordering SSL certs.
> > > >      *   This gets us a step closer to instant SSL service builds
> > without
> > > > the need to implement something like “letsencrypt”.
> > > >   2.  Operations Load: Reducing the management of a constant flow of
> > > > orders and renewals for SSL certs would save operations resources and
> > > time.
> > > >   3.  Cost Reduction: A reduction in SSL certs equates to reduction
> in
> > > > purchase and renewal costs.
> > > >
> > > >
> > > >
> > > > Ryan Durfey    M | 303-524-5099
> > > > CDN Support (24x7): 866-405-2993 or [email protected]<mailto:
> > > > [email protected]>
> > > >
> > > >
> > > > From: Oren Shemesh <[email protected]>
> > > > Reply-To: "[email protected]" <
> > > > [email protected]>
> > > > Date: Wednesday, December 27, 2017 at 2:49 AM
> > > > To: "[email protected]" <
> > > > [email protected]>
> > > > Subject: Two different Delivery Services using the same host regexp:
> A
> > > > suggested Traffic Monitor code change
> > > >
> > > > Hi TC Dev people,
> > > >
> > > > We would like to avoid having to issue a specific SSL certificate for
> > > every
> > > > DS, as is needed today.
> > > >
> > > > (Reminder: Every DS needs a separate certificate because The domains
> > used
> > > > by the DS are tr.<ds-name>.<cdn-domain> and
> > > > <cache-name>.<ds-name>.<cdn-domain>, and wild-card certs support
> only
> > a
> > > > single *, so every cert is issued to *.<ds-name>.<cdn-comain>, which
> > > makes
> > > > it DS-specific).
> > > >
> > > > When configuring two different DSes to use the same host regexp, and
> > > > configuring additional path regexps to differentiate between them, it
> > is
> > > > possible to overcome this issue.
> > > >
> > > > For example: DS 'my-ds1' can have host regexp .*\.my-ds\..* , and
> path
> > > > regexp /ds1/.* (Both with order 0), and DS 'my-ds2' has the same host
> > > > regexp and path regexp /ds2/.*.
> > > >
> > > > This makes TR redirect requests of the form
> > > > tr.my-ds.<cdn-domain>/ds1/<whatever> to
> > > > <cache-name>.my-ds.<cdn-domain>/ds1/<whatever>, do the same for
> > > > tr.my-ds.<cdn-domain>/ds2/<whatever>, and respond with '503 service
> > > > unavailable' to requests to tr.my-ds.<cdn-domain> which do not start
> > with
> > > > /ds1/ or /ds2/.
> > > >
> > > > So with TR, So far so good.
> > > >
> > > > ATS can also differentiate between the DSes based on the path, so
> there
> > > is
> > > > no problem there as well.
> > > >
> > > > However, there is a problem with Traffic Monitor:
> > > >
> > > > Because astats sends per-DS info based on the origin name, TM
> requires
> > > that
> > > > every DS has a unique host regexp (See
> > > > https://github.com/apache/incubator-trafficcontrol/blob/
> > > > 3e3a5f41d482a065f46ef287b347e66d7d205d82/traffic_monitor/
> > > > todata/todata.go#L234
> > > > ).
> > > > This is so that TM can map origin names from astats back to a DS.
> > > >
> > > > So, I would like to make a small change to the TM code:
> > > >
> > > > If the astats info contains only a single host name, assume it is
> > already
> > > > the DS name.
> > > >
> > > > (This would be implemented in  <goog_1101383350>
> > > > https://github.com/apache/incubator-trafficcontrol/blob/
> > > > 3e3a5f41d482a065f46ef287b347e66d7d205d82/traffic_monitor/
> > > > cache/cache.go#L419
> > > > :
> > > >
> > > > if len(statParts) == 2 { // [0] is DS name, [1] is statName
> > > > ds = statParts[0]
> > > > }
> > > > else {
> > > > ... existing code which uses statsParts[0:len(statsParts)-1]
> > > > }
> > > >
> > > > Obviously, the ds should be checked against the list of all DSes to
> > make
> > > > sure it is valid, I wrote this code line just to demonstrate the
> idea).
> > > >
> > > > Of course, the astats plugin for ATS might not support this today (I
> am
> > > not
> > > > sure), but a different cache (Hint: Qwilt cache) has no problem
> > reporting
> > > > the astats information per DS name.
> > > > If needed, the astats code could be improved to do that, as well.
> > > >
> > > > I believe that this change, while being very small and unobtrusive,
> > would
> > > > make the information passed from caches to TM 'cache vendor
> agnostic',
> > > > while also eliminating the need to re-classify URLs into DS (Which is
> > > what
> > > > happens today in TM), hence it is a good step in a good direction.
> > > >
> > > > Any comments / opinions, please ?
> > > >
> > > > --
> > > >
> > > > *Oren Shemesh*
> > > > Qwilt | Work: +972-72-2221637| Mobile: +972-50-2281168 |
> > [email protected]
> > > > <mailto:[email protected]>
> > > > <[email protected]<mailto:[email protected]>>
> > > >
> > > >
> > >
> >
>
>
>
> --
>
> *Oren Shemesh*
> Qwilt | Work: +972-72-2221637| Mobile: +972-50-2281168 | [email protected]
> <[email protected]>
>

Re: Two different Delivery Services using the same host regexp: A suggested Traffic Monitor code change

Reply via email to