Re: Traffic Server Secondary Streaming IPs Design

2018-04-09 Thread Nir Sopher
> If we all agreed to use unified tables for all IPs and/or
> interfaces: primary, management, secondary, then there need to be
> two tables: IP and interface.
> > And in the server table, we need to replace the original
> "interface_xxx", "ip_xxx", "ip6_xxx" fields with a "primary_ip_id" field.
> And do similar things to management IP.
> >
> > Thanks,
> > Zhilin
> >
> >
> > On 03/04/2018, 7:08 AM, "Mark Torluemke" <mtorlue...@apache.org> wrote:
> >
> >I would support an 'interfaces' table (adding some
> sort of a 'type' column)
> >that would include moving the management and lights
> out management
> >interfaces to that table as well.
> >
> >Cheers,
> >Mark
> >
> >On Mon, Apr 2, 2018 at 2:39 PM, Nir Sopher <
> n...@qwilt.com> wrote:
> >
> >> Hi Zhilin,
> >>
> >> I took a quick look into the spec. Hope to have the
> opportunity to dive
> >> deeper into it soon so we can further discuss it.
> >>
> >> For now I have 2 questions.
> >> In the spec, you refer to "secondary interfaces", and
> you have a list of
> >> secondary interfaces added.
> >> IIUC the secondary interfaces are used as long as they
> are available, and
> >> when down, you move to the primary interface.
> >>
> >> Why not, instead of holding a secondary interfaces
> table, move all
> >> interfaces to a separate table? Primary and secondary.
> >> For each interface you can hold:
> >>
> >>   - Server id
> >>   - name (e.g. eth0)
> >>   - IPv6
> >>   - IPv4
> >>   - Priority (Integer as flexible as you wish: e.g. "1"
> for "secondary",
> >>   "2" for "primary" in your example,)
> >>
> >>
> >> Additionally, it is not clear to me what happens if one
> of the interfaces
> >> fails?
> >> Does every interface have a unique DNS name? If an
> interface fails, are
> >> redirects
> >> sent only to the available (secondary) interfaces?
> >>
> >> Thanks,
> >> Nir
> >>
> >>
> >> On Mon, Apr 2, 2018 at 10:21 AM, Zhilin Huang (zhilhuan) <
> >> zhilh...@cisco.com> wrote:
> >>
> >>> Hi Guys,
> >>>
> >>> This was originally posted in another discussion.
> Resend this in a
> >>> standalone topic to catch more awareness. The link for
> the design doc:
> >>>
> >>> https://docs.google.com/document/d/1vgq-pGNoLLYf7Y3cu5hWu67TUKpN5hucrp-ZS9nSsd4/edit?usp=sharing
> >>>
> >>>
> >>> Short summary for the feature design:
> >>> ---
> >>> There is a feature request from the market to add secondary
> IP support on edge
> >>> cache servers, and the functionality to assign a
> delivery service to a
> >>> secondary IP of an edge cache.
> >>>
> >>> This feature requires Traffic Ops implementation to
> support secondary IP
> >>> configuration for edge cache, and delivery service
> assignment to
> >> secondary
> >>> IP.
> >>>
> >>> Traffic Monitor should also monitor connectivity of
> secondary IPs
> >>> configured. And Traffic Router needs support to
> resolve streamer FQDN to
> >>> secondary IP assigned in a delivery service.
> >>>
> >>> Traffic Server should record the IP serving client
> request. And should
> >>> reject request to an unassigned IP for a delivery
> service.
> >>>
> >>> This design has taken compatibility into
> consideration: if no secondary IP is
> >>> configured, or some parts of the system have not been
> upgraded to a
> >>> version that supports this feature, the traffic will be
> served by primary IPs as
> >>> before.
> >>> ---
> >>>
> >>> Any comments are much appreciated and welcome. If there are no
> major objections, we
> >>> plan to start coding this week.
> >>>
> >>> Thanks,
> >>> Zhilin


Re: Traffic Server Secondary Streaming IPs Design

2018-04-04 Thread Nir Sopher
Eric,
Great, IP as delivery unit.
Note that I believe the port is part of this unit, and not a common
setting for all IPs on the interface.

+1 for Rob's suggestion (stats are collected on the interface level, and
health/heartbeat on the ip level)

Rob/Jeff
I believe we need to verify this entire concept fits well with monitor and
router localization.

Nir

On Wed, Apr 4, 2018, 17:23 Robert Butts <robert.o.bu...@gmail.com> wrote:

> @nbaoping
>
> > So I suggest the change to the current TM to be like:
> > 1) Separate the current polling of cache servers into two different
> polls: one for keep-alive, the other for the stat query.
>
> The Golang Monitor already does this. We call it the "health" and "stat"
> polls in the code. The stat poll is the full stats, and the health poll is
> just the system stats. Does that work for your keep-alive poll? The health
> poll is slightly more than just establishing a TCP connection, but it's
> very small, and it also gives us the interface data.
>
> > We need to record the availability for each configured IP so that if it’s
> assigned, the router can check if it can redirect the client request to
> that assigned IP or not.
> > if we have multiple-interface support, we should check the bandwidth
> availability for each interface
>
> Because the health poll has interface data, I'd suggest modifying Traffic
> Monitor to poll a single arbitrary IP for the Stat poll, as you suggest;
> and to poll all IPs on the Health poll. Then, because the system stats are
> in the health poll, the Monitor can figure out which interface that IP is
> on, and track the availability of that interface from that health poll
> data.
>
> If you aren't familiar with the Health vs Stat polls in the new Golang
> Monitor, see:
> https://traffic-control-cdn.readthedocs.io/en/latest/development/traffic_monitor_golang.html#architecture
> https://github.com/apache/incubator-trafficcontrol/blob/master/traffic_monitor/manager/manager.go
> https://github.com/apache/incubator-trafficcontrol/blob/master/traffic_monitor/manager/health.go
> https://github.com/apache/incubator-trafficcontrol/blob/master/traffic_monitor/manager/stat.go
>
> Does that work?
>
>
> On Wed, Apr 4, 2018 at 5:07 AM, Eric Friedrich (efriedri) <
> efrie...@cisco.com> wrote:
>
> > Hey Nir-
> >   For our particular use case, we are looking at making an IP the
> delivery
> > unit. We would like to use a single interface with multiple IPs. DSs
> would
> > be assigned to one of the IPs on that interface.  Interface (or IP)
> > priority does not come into play here as there is no failover between IPs
> >
> > —Eric
> >
> >
> > > On Apr 4, 2018, at 1:23 AM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > +1
> > > Note that beyond the DB, the change should also be reflected into the
> > > cr-config.
> > > As I see it, a flexible model may be built from the below items:
> > > 1. Edge server
> > > 2. Interface
> > > 3. IPs
> > >
> > > The Interface (or should it be called "delivery unit") is the element
> we
> > > redirect the traffic to and which is monitored by the traffic-monitor:
> > > * Each server may have multiple interfaces
> > > * Each interface may have multiple IPs
> > > * Interfaces have priorities (abstraction for primary/secondary)
> > > * Each interface is given a separate DNS name by the router. Single
> name
> > > for the multiple IPs.
> > > * Each interface is monitored and reported separately by the traffic
> > > monitor, including health and stats.
> > >
> > >
> > > The router "redirect target decision" may look as follows
> > > 0. Select cache as we do today taking into account the consistent
> hash. A
> > > server is in the selection group only if one of its interfaces is found
> > to
> > > be healthy
> > > 1. Once we have a server selected, select an interface out of all
> > interfaces
> > > of the server with max available priority.
> > >
> > > An additional improvement may assign DSes to interfaces instead of
> > servers.
> > > A server serves DS X iff one of its interfaces is assigned to the DS.
> > >
> > > Nir
> > >
> > >
> > > On Apr 4, 2018 6:56 AM, "Zhilin Huang (zhilhuan)" <zhilh...@cisco.com>
> > > wrote:
> > >
> > > Updated the DB schema in section 3.1.1.4
> > >
> > > Thanks,
> > > Zhilin

Re: Traffic Server Secondary Streaming IPs Design

2018-04-03 Thread Nir Sopher
+1
Note that beyond the DB, the change should also be reflected into the
cr-config.
As I see it, a flexible model may be built from the below items:
1. Edge server
2. Interface
3. IPs

The Interface (or should it be called "delivery unit") is the element we
redirect the traffic to and which is monitored by the traffic-monitor:
* Each server may have multiple interfaces
* Each interface may have multiple IPs
* Interfaces have priorities (abstraction for primary/secondary)
* Each interface is given a separate DNS name by the router. Single name
for the multiple IPs.
* Each interface is monitored and reported separately by the traffic
monitor, including health and stats.


The router "redirect target decision" may look as follows
0. Select cache as we do today taking into account the consistent hash. A
server is in the selection group only if one of its interfaces is found to
be healthy
1. Once we have a server selected, select an interface out of all interfaces
of the server with max available priority.
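The two-step decision above can be sketched as follows. This is an illustrative Python model, not Traffic Router's actual implementation; the data layout (a dict of server name to interface records) and the field names are assumptions:

```python
def pick_redirect_target(hash_order, interfaces):
    """Step 0: walk servers in consistent-hash preference order; a server
    is eligible only if at least one of its interfaces is healthy.
    Step 1: on the chosen server, take the healthy interface with the
    highest priority (e.g. 2 = primary, 1 = secondary).

    interfaces: {server: [{"name": str, "priority": int, "healthy": bool}]}
    """
    for server in hash_order:
        healthy = [i for i in interfaces.get(server, []) if i["healthy"]]
        if healthy:
            best = max(healthy, key=lambda i: i["priority"])
            return server, best["name"]
    return None, None  # no server has a healthy interface
```

Note the failover this gives for free: when a server's primary interface is unhealthy, the server still stays in the selection group and traffic falls back to its highest-priority healthy interface.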

An additional improvement may assign DSes to interfaces instead of servers.
A server serves DS X iff one of its interfaces is assigned to the DS.

Nir


On Apr 4, 2018 6:56 AM, "Zhilin Huang (zhilhuan)" <zhilh...@cisco.com>
wrote:

Updated the DB schema in section 3.1.1.4

Thanks,
Zhilin



On 04/04/2018, 11:02 AM, "Zhilin Huang (zhilhuan)" <zhilh...@cisco.com>
wrote:

Good points. I am happy to make this change in the design doc.

Thanks,
Zhilin


On 03/04/2018, 8:17 PM, "Eric Friedrich (efriedri)" <efrie...@cisco.com>
wrote:

I would prefer a consistent way to store all interface and IP
address information. It's good database design practice to store similar
information in similar tables (i.e. all IP info in 1 table) rather than
keeping some IPs in the server table and some IPs in another table.

I also think this refactoring will give us greater flexibility for
more changes in the future. Outside of this particular use case, we might
have additional features like sharing edges between public/private networks
or having multiple (equal priority) streaming interfaces on a cache.

These future features would be easier if the interface data and IP
data are all organized into separate tables.

I’d also like to see the delivery service to IP mapping be a many
to many mapping in the DB. For this particular feature we will only assign
a single IP (and we can restrict that in the API if we want), but I am near
certain that in the future we would like the ability to assign a DS to
multiple IPs on the same cache.


—Eric



> On Apr 3, 2018, at 2:42 AM, Zhilin Huang (zhilhuan) <
zhilh...@cisco.com> wrote:
>
> Hi Mark,
>
> Thanks for your comments. Please check my reply in another thread:
>
> If we all agreed to use unified tables for all IPs and/or
interfaces: primary, management, secondary, then there need to be two
tables: IP and interface.
> And in the server table, we need to replace the original
"interface_xxx", "ip_xxx", "ip6_xxx" fields with a "primary_ip_id" field.
And do similar things to management IP.
>
> Thanks,
> Zhilin
>
>
> On 03/04/2018, 7:08 AM, "Mark Torluemke" <mtorlue...@apache.org>
wrote:
>
>I would support an 'interfaces' table (adding some sort of a
'type' column)
>that would include moving the management and lights out
management
>interfaces to that table as well.
>
>Cheers,
>Mark
>
>On Mon, Apr 2, 2018 at 2:39 PM, Nir Sopher <n...@qwilt.com>
wrote:
>
>> Hi Zhilin,
>>
>> I took a quick look into the spec. Hope to have the opportunity
to dive
>> deeper into it soon so we can further discuss it.
>>
>> For now I have 2 questions.
>> In the spec, you refer to "secondary interfaces", and you have a
list of
>> secondary interfaces added.
>> IIUC the secondary interfaces are used as long as they are
available, and
>> when down, you move to the primary interface.
>>
>> Why not, instead of holding a secondary interfaces table, move
all
>> interfaces to a separate table? Primary and secondary.
>> For each interface you can hold:
>>
>>   - Server id
>>   - name (e.g. eth0)
>>   - IPv6
>>   - IPv4
>>   - Priority (Integer as flexible as you wish: e.g. "1" for
"secondary",
>>   "2" for "primary" in your 

Re: Traffic Server Secondary Streaming IPs Design

2018-04-02 Thread Nir Sopher
Hi Zhilin,

I took a quick look into the spec. Hope to have the opportunity to dive
deeper into it soon so we can further discuss it.

For now I have 2 questions.
In the spec, you refer to "secondary interfaces", and you have a list of
secondary interfaces added.
IIUC the secondary interfaces are used as long as they are available, and
when down, you move to the primary interface.

Why not, instead of holding a secondary interfaces table, move all
interfaces to a separate table? Primary and secondary.
For each interface you can hold:

   - Server id
   - name (e.g. eth0)
   - IPv6
   - IPv4
   - Priority (Integer as flexible as you wish: e.g. "1" for "secondary",
   "2" for "primary" in your example,)


Additionally, it is not clear to me what happens if one of the interfaces
fails?
Does every interface have a unique DNS name? If an interface fails, are
redirects
sent only to the available (secondary) interfaces?

Thanks,
Nir


On Mon, Apr 2, 2018 at 10:21 AM, Zhilin Huang (zhilhuan)  wrote:

> Hi Guys,
>
> This was originally posted in another discussion. Resend this in a
> standalone topic to catch more awareness. The link for the design doc:
> https://docs.google.com/document/d/1vgq-pGNoLLYf7Y3cu5hWu67TUKpN5hucrp-ZS9nSsd4/edit?usp=sharing
>
>
> Short summary for the feature design:
> ---
> There is a feature request from the market to add secondary IP support on edge
> cache servers, and the functionality to assign a delivery service to a
> secondary IP of an edge cache.
>
> This feature requires Traffic Ops implementation to support secondary IP
> configuration for edge cache, and delivery service assignment to secondary
> IP.
>
> Traffic Monitor should also monitor connectivity of secondary IPs
> configured. And Traffic Router needs support to resolve streamer FQDN to
> secondary IP assigned in a delivery service.
>
> Traffic Server should record the IP serving client request. And should
> reject request to an unassigned IP for a delivery service.
>
> This design has taken compatibility into consideration: if no secondary IP
> is configured, or some parts of the system have not been upgraded to a
> version that supports this feature, the traffic will be served by primary
> IPs as before.
> ---
>
> Any comments are much appreciated and welcome. If there are no major
> objections, we plan to start coding this week.
>
> Thanks,
> Zhilin
>
>


Re: [VOTE] Resolution for Traffic Control graduation to TLP

2018-04-02 Thread Nir Sopher
+1

On Mon, Apr 2, 2018 at 11:24 PM, Gelinas, Derek 
wrote:

> +1
>
> > On Apr 2, 2018, at 4:20 PM, Leif Hedstrom  wrote:
> >
> >
> >> On Apr 2, 2018, at 2:11 PM, David Neuman 
> wrote:
> >>
> >> Dear Traffic Control community members:
> >>
> >> I would like to call a vote on the resolution for Traffic Control to
> >> graduate from to an Apache TLP.  We have already voted on whether or
> not we
> >> should start the graduation process [1] and this is the next step.
> Please
> >> see the resolution below and vote as follows:
> >>
> >> [ ] +1 Graduate Traffic Control from the incubator
> >> [ ] +0 No Opinion
> >> [ ] -1 Do not graduate Traffic Control from the incubator (please
> provide a
> >> reason)
> >>
> >
> >
> > +1 (binding)
> >
> > — Leif
> >
>


Re: TM: Question about the poll model of the Traffic Monitor

2018-03-28 Thread Nir Sopher
Hi Eric/Neil,
Isn't the question of supporting multiple interfaces per server a much wider
question, architecture-wise?
What would be the desired behavior if the monitoring shows that only one of
the interfaces is down? Will the router send traffic to the healthy
interfaces? How?
Nir

On Wed, Mar 28, 2018, 19:10 Eric Friedrich (efriedri) 
wrote:

> The use case behind this question probably deserves a longer dev@ email.
>
> I will oversimplify: we are extending TC to support multiple IPv4 (or
> multiple IPv6) addresses per edge cache (across 1 or more NICs).
>
> Assume all addresses are reachable from the TM.
>
> —Eric
>
>
> > On Mar 28, 2018, at 11:37 AM, Robert Butts 
> wrote:
> >
> > When you say different interfaces, do you mean IPv4 versus IPv6? Or
> > something else?
> >
> > If you mean IPv4 vs IPv6, we have a PR for that from Dylan Volz
> > https://github.com/apache/incubator-trafficcontrol/pull/1627
> >
> > I'm hoping to get to it early next week, just haven't found the time to
> > review and test it yet.
> >
> > Or did you mean something else by "interface"? Linux network interfaces?
> > Ports?
> >
> >
> > On Wed, Mar 28, 2018 at 12:02 AM, Neil Hao (nbaoping) <
> nbaop...@cisco.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Currently, we poll exactly one URL request to each cache server for one
> >> interface, but now we’d like to add multiple-interface support;
> therefore,
> >> we need multiple requests to query each interface of the cache server. I
> >> checked the code of Traffic Monitor, and it seems we don’t support this
> kind of
> >> polling, right?
> >>
> >> I figure out different ways to support this:
> >> 1) The first way: change the ‘Urls’ field in the HttpPollerConfig from
> >> ‘map[string]PollConfig’ to ‘map[string][]PollConfig’, so that we can
> have
> >> multiple polling config to query the multiple interfaces info.
> >>
> >> 2) The second way: Change the ‘URL’ field in the PollConfig from
> ‘string’
> >> to ‘[]string’.
> >>
> >> No matter which way, it seems it will bring a fairly big change to the
> >> current polling model. I’m not sure if I’m headed in the right
> >> direction; would you guys have suggestions for this?
> >>
> >> Thanks,
> >> Neil
> >>
>
>
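Neil's option 1 — a list of poll configs per cache instead of a single one — can be modeled in miniature like this. The Python dicts below stand in for the Go `map[string][]PollConfig`; the field names and URL paths are illustrative assumptions, not Traffic Monitor's actual types:

```python
# One cache name maps to a LIST of poll configs, one per interface
# (the shape of option 1 from the question, modeled as plain dicts).
poller_urls = {
    "edge-1": [
        {"url": "http://192.0.2.10/_astats", "interface": "eth0"},
        {"url": "http://192.0.2.11/_astats", "interface": "eth1"},
    ],
    "edge-2": [
        {"url": "http://192.0.2.20/_astats", "interface": "eth0"},
    ],
}

def poll_jobs(urls):
    """Expand the per-cache lists into one (cache, interface, url) job
    per interface, so every interface of every cache gets its own poll."""
    return [(cache, cfg["interface"], cfg["url"])
            for cache, cfgs in urls.items()
            for cfg in cfgs]
```

The flattening step is why this shape is attractive: the poller loop itself stays single-URL-per-job, and only the config expansion changes.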


Fwd: Delivery Service Origin Refactor

2018-03-12 Thread Nir Sopher
Hi Rawlin,
Can you please add a few words about the motivation behind basing the
steering target selection on the location of the client?
As the content goes through the caches, isn't it the job of the cache to
select the best origin for the cache? Why should the client be the one to
take the origin location into consideration?
Why do the target DSes have different origins in the first place? Do they
have different characteristics in addition to their location?
Thanks,
Nir

-- Forwarded message --
From: Rawlin Peters 
Date: Mon, Mar 12, 2018 at 9:46 PM
Subject: Delivery Service Origin Refactor
To: dev@trafficcontrol.incubator.apache.org


Hey folks,

As promised, this email thread will be to discuss how to best
associate an Origin Latitude/Longitude with a Delivery Service,
primarily so that steering targets can be ordered/sent to the client
based upon the location of those targets (i.e. the Origin), a.k.a.
Steering Target Geo-Ordering. This is potentially going to be a pretty
large change, so all your feedback/questions/concerns are appreciated.

Here were a handful of bad ideas I had in order to accomplish this DS
Origin Lat/Long association (feel free to skip to PROPOSED SOLUTION
below):

1. Reuse the current MSO (multisite origin) backend (i.e. add the
origin into the servers table, give it a lat/long from its cachegroup,
assign the origin server to the DS)
Pros:
- reuse of existing db schema, probably wouldn't have to add any new
tables/columns
Cons:
- MSO configuration is already very complex
- for the simple case of just wanting to give an Origin a lat/long you
have to create a server (of which only a few fields make sense for an
Origin), add it to a cachegroup (only name and lat/long make sense,
won't use parent relationships, isn't really a "group" of origins),
assign it to a server profile (have to create one first, no parameters
are needed), and finally assign that Origin server to the delivery
service (did I miss anything?)

2. Add Origin lat/long columns to the deliveryservice table
Pros:
- probably the most straightforward solution for Steering Target
Geo-Ordering given that Origin FQDN is currently a DS field.
Cons:
- doesn't work well with MSO
- could be confused with Default Miss Lat/Long
- if two different delivery services use colocated origins, the same
lat/long needs to be entered twice
- adds yet another column to the crowded deliveryservice table

3. Add origin lat/long parameters to a Delivery Service Profile
Pros:
- Delivery Services using colocated origins could share the same profile
- no DB schema updates needed
Cons:
- profile parameters lack validation
- still doesn't support lat/long for multiple origins associated with a DS

4. Add the lat/long to the steering target itself (i.e. where you
choose weight/order, you'd also enter lat/long)
Pros:
- probably the easiest/quickest solution in terms of development
Cons:
- only applies lat/long to a steering target
- using the same target in multiple Steering DSes means having to keep
the lat/long synced between them all
- lat/long not easily reused by other areas that may need it in the future



PROPOSED SOLUTION:

All of those ideas were suboptimal, which is why I think we need to:
1. Split Locations out of the cachegroup table into their own table
with the following columns (cachegroup would have a foreign key to
Location):
- name
- latitude
- longitude

2. Split Origins out of the server and deliveryservice tables into
their own table with the following columns:
- fqdn
- protocol (http or https)
- port (optional, can be inferred from protocol)
- location (optional FK to Location table)
- deliveryservice FK (if an Origin can only be associated with a
single DS. Might need step 3 below for many-to-many)
- ip_address (optional, necessary to support `use_ip_address` profile
parameter for using the origin's IP address rather than fqdn in
parent.config)
- ip6_address (optional, necessary because we'd have an ip_address
column for the same reasons)
- profile (optional, primarily for MSO-specific parameters - rank and
weight - but I could be convinced that this is unnecessary)
- cachegroup (optional, necessary to maintain primary/secondary
relationship between MID_LOC and ORG_LOC cachegroups for MSO)

3. If many-to-many DSes to Origins will still be possible, create a
new deliveryservice_origin table to support a many-to-many
relationship between DSes and origins
- the rank/weight fields for MSO could be added here possibly, maybe
other things as well?

4. Consider constraints in the origin and deliveryservice_origin table
- must fqdn alone be unique? fqdn, protocol, and port combined?
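Steps 1-3 of the proposed solution can be sketched roughly as below. This is illustrative SQLite run from Python, under the assumption that the column list above is adopted as-is; the real schema would be a Postgres migration, and the constraint in step 4 is shown as just one possible choice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (                            -- step 1
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    latitude REAL NOT NULL,
    longitude REAL NOT NULL
);
CREATE TABLE origin (                              -- step 2
    id INTEGER PRIMARY KEY,
    fqdn TEXT NOT NULL,
    protocol TEXT NOT NULL CHECK (protocol IN ('http', 'https')),
    port INTEGER,                                  -- optional
    location INTEGER REFERENCES location(id),      -- optional FK
    UNIQUE (fqdn, protocol, port)                  -- one option for step 4
);
CREATE TABLE deliveryservice_origin (              -- step 3, many-to-many
    deliveryservice INTEGER NOT NULL,
    origin INTEGER NOT NULL REFERENCES origin(id),
    rank INTEGER,                                  -- MSO-style rank
    weight REAL,                                   -- MSO-style weight
    PRIMARY KEY (deliveryservice, origin)
);
""")
conn.execute("INSERT INTO location (name, latitude, longitude) "
             "VALUES ('denver', 39.74, -104.99)")
conn.execute("INSERT INTO origin (fqdn, protocol, location) "
             "VALUES ('origin.example.com', 'https', 1)")
conn.execute("INSERT INTO deliveryservice_origin VALUES (42, 1, 1, 1.0)")
# Steering can now order targets by their origin's lat/long via a join:
lat, lon = conn.execute("""
    SELECT l.latitude, l.longitude FROM deliveryservice_origin dso
    JOIN origin o ON o.id = dso.origin
    JOIN location l ON l.id = o.location
    WHERE dso.deliveryservice = 42""").fetchone()
```

The join at the end is the payoff for Steering Target Geo-Ordering: the lat/long lives once in `location`, so colocated origins on different delivery services share a single row instead of duplicating coordinates.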

The process for creating a Delivery Service would change in that
Origins would have to be created separately and added to the delivery
service. However, to aid migration to the new way of doing things, our
UIs could keep the "Origin FQDN" field but the API backend would then
create a new row in the Origin table and add it 

Re: [VOTE] Traffic Control graduation to TLP

2018-03-01 Thread Nir Sopher
+1

On Mar 1, 2018 10:27 PM, "Steve Malenfant"  wrote:

> +1
>
> On Thu, Mar 1, 2018 at 12:14 PM, Phil Sorber  wrote:
>
> > +1
> >
> > On Thu, Mar 1, 2018 at 1:12 PM Hank Beatty  wrote:
> >
> > > +1
> > >
> > > On 03/01/2018 10:41 AM, Dave Neuman wrote:
> > > >   Hey All,
> > > >
> > > > After a great discussion amongst the Apache Traffic Control PPMC,
> > > reviewing
> > > > the graduation checklist[1], updating the podling status page[2], and
> > > > updating the project website to ensure the whimsy podling website
> > checks
> > > > pass[3], I would like to call a vote for Apache Traffic Control
> > > graduating
> > > > to a top level project.
> > > >
> > > > Apache Traffic Control entered the incubator on July 12, 2016.  Since
> > > then
> > > > we have announced 4 releases, nominated 4 new committers, organized 3
> > > > summits, had almost 8,000 commits from 63 different contributors, and
> > --
> > > > most importantly -- we have grown and diversified our community.
> > Apache
> > > > Traffic Control is a healthy project that is already acting like an
> > > Apache
> > > > top level project, so we should take the next step.
> > > >
> > > > If we agree that we should graduate to a top level project, the next
> > > steps
> > > > will be to pick the initial PMC chair for the TLP and draft a
> > Resolution
> > > > for the PPMC and IPMC to vote upon.
> > > >
> > > > Please take a minute to vote on whether or not Traffic Control should
> > > > graduate to a Top Level Project by responding with one of the
> > following:
> > > >
> > > > [ ] +1 Apache Traffic Control should graduate.
> > > > [ ] +0 No opinion
> > > > [ ] -1 Apache Traffic Control should not graduate (please provide the
> > > > reason)
> > > >
> > > > The VOTE will be opened for at least the next 72 hours.  Per Apache
> > > > guidelines[4] I will also be notifying the incubator mailing list
> that
> > a
> > > > community vote is under way.
> > > >
> > > > Thanks,
> > > > Dave
> > > >
> > > >
> > > > [1]
> > > >
> > > https://incubator.apache.org/guides/graduation.html#graduation_check_list
> > > > [2] http://incubator.apache.org/projects/trafficcontrol.html
> > > > [3] https://whimsy.apache.org/pods/project/trafficcontrol
> > > > [4]
> > > >
> > > https://incubator.apache.org/guides/graduation.html#community_graduation_vote
> > > >
> > >
> >
>


Re: Traffic Router Fail - Too Many Open Sockets

2018-02-14 Thread Nir Sopher
 Hi,

I implemented the fix, and the issue was resolved
until today. :)

I have 2 routers, both got stuck together due to connections leak, with
"CLOSE_WAIT" connection towards the monitors.
The only messages in catalina.out were:
Feb 13, 2018 2:04:49 PM
com.comcast.cdn.traffic_control.traffic_router.secure.CertificateRegistry
importCertificateDataList
WARNING: Imported handshake data with alias 

Can it be that in some rare, probably failing, situations, the monitor does
not close the connection?
Nir

On Thu, Feb 1, 2018 at 11:27 PM, Nir Sopher <n...@qwilt.com> wrote:

> Great,
> Thanks!
> Nir
>
> On Thu, Feb 1, 2018 at 11:12 PM, Jeffrey Martin <martin.jef...@gmail.com>
> wrote:
>
>> Hi Nir,
>>This issue is defined by:
>>
>>  Jira: https://issues.apache.org/jira/browse/TC-197
>> and Github https://github.com/apache/incubator-trafficcontrol/issues/916
>>
>> I will be working on a pull request to address this issue in 2.2. The
>> workaround is in the second link above.
>> Jeff
>>
>>
>> On Thu, Feb 1, 2018 at 4:09 PM, Jeffrey Martin <martin.jef...@gmail.com>
>> wrote:
>>
>> > Hi Nir,
>> >
>> >
>> > On Thu, Feb 1, 2018 at 4:01 PM, Nir Sopher <n...@qwilt.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> One of my routers got stuck today, not being able to answer http
>> requests
>> >> (routing and API).
>> >> When trying to investigate the issue, I found catalina.log with a lot
>> of
>> >> messages complaining about failure to open a socket due to too many open
>> >> files. See example below.
>> >> No issues were found in the log prior to that point, beyond
>> periodic
>> >> warnings about pulling the certificates every 5 minutes.
>> >>
>> >> When trying to understand "what are these open files", I found about 4k
>> >> open connections in "CLOSE_WAIT" towards the monitor.
>> >> Note: I'm running TC2.1 RC3 with golang traffic-monitor.
>> >>
>> >> Have anyone encountered a similar issue?
>> >> Are the warnings for pulling the certificates a normal thing?
>> >>
>> >> Thanks,
>> >> Nir
>> >>
>> >> Feb 01, 2018 7:33:09 AM
>> >> com.comcast.cdn.traffic_control.traffic_router.secure.CertificateRegistry
>> >> importCertificateDataList
>> >> WARNING: Imported handshake data with alias my-ds.my-cdn.com
>> >> Feb 01, 2018 8:43:13 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor
>> >> run
>> >> SEVERE: Socket accept failed
>> >> java.io.IOException: Too many open files
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>> >> at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:1309)
>> >> at java.lang.Thread.run(Thread.java:745)
>> >>
>> >> Feb 01, 2018 8:43:14 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor
>> >> run
>> >> SEVERE: Socket accept failed
>> >> java.io.IOException: Too many open files
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
>> >> at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
>> >> at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:1309)
>> >> at java.lang.Thread.run(Thread.java:745)
>> >>
>> >
>> >
>>
>
>


Re: Traffic Router Enhancement - Default Maxmind Geolocation Override

2018-02-14 Thread Nir Sopher
I need to get a better understanding of the DNS infrastructure to be able to
verify this assumption.
I assumed the TR localization feature already solved the problem of getting to
the right router, and from there it is in our hands ...

Nir
On Tue, Feb 13, 2018 at 11:39 PM, Rawlin Peters <rawlin.pet...@gmail.com>
wrote:

> Nir,
>
> You bring up a good point. If we can make the assumption that requests
> coming to a specific Traffic Router are actually somewhat local to
> that Traffic Router, we might be able to localize those
> "country-localized" clients to a cachegroup close to that particular
> Traffic Router. That would have the effect of spreading load around a
> country a bit better if the Traffic Routers were geographically
> distributed well. Maybe that could be Phase 2 of this effort, but how
> much can we rely on that assumption?
>
> -Rawlin
>
> On Tue, Feb 13, 2018 at 1:27 PM, Rivas, Jesse <jesse_ri...@comcast.com>
> wrote:
> > Nir,
> >
> > This solution does not support that level of granularity.
> >
> > Jesse
> >
> > On 2/13/18, 11:43 AM, "Nir Sopher" <n...@qwilt.com> wrote:
> >
> > Hi,
> >
> > Can this solution support different value in different routers?
> > Taking TR localization into account, it might give better
> granularity.
> >
> > Nir
> >
> > On Tue, Feb 13, 2018 at 8:34 PM, Rawlin Peters <
> rawlin.pet...@gmail.com>
> > wrote:
> >
> > > Yeah, this basically solves the problem where MaxMind knows a
> client
> > > is in the US (or another country) but doesn't know the state, city,
> > > zip, etc., so it's not a "true" miss. In that case MaxMind returns
> the
> > > geographic center of that country as the client's location, but we
> > > don't want to route those clients to the cache group closest to
> that
> > > location because it might not be the ideal cachegroup. By using
> this
> > > parameter we can shift this high volume of "US" traffic that is
> > > essentially being localized to a lake in Kansas to a cachegroup
> more
> > > capable of handling that load. And we can do this on a per-country
> > > basis because we can create multiple of these parameters (which we
> > > wouldn't be able to do if we just used the Default Miss Lat/Lon of
> a
> > > DeliveryService).
> > >
> > > -Rawlin
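The per-country override Rawlin describes can be sketched as follows. The US centroid 37.751,-97.822 is quoted from the thread; the parameter layout and the override coordinates are hypothetical illustrations, not the actual Traffic Router configuration:

```python
# Centroids MaxMind returns when it knows only the client's country
# (the "soft miss" case). The US value is the one cited in the thread.
COUNTRY_CENTERS = {"US": (37.751, -97.822)}

# Assumed operator-configured per-country overrides (one parameter per
# country), pointing country-only clients at a location whose cachegroup
# can absorb the load. The Denver-ish value here is purely illustrative.
GEO_OVERRIDES = {"US": (39.739, -104.990)}

def effective_location(country, lat, lon):
    """If MaxMind returned exactly the country centroid, substitute the
    configured override; otherwise trust the returned geolocation."""
    if COUNTRY_CENTERS.get(country) == (lat, lon) and country in GEO_OVERRIDES:
        return GEO_OVERRIDES[country]
    return (lat, lon)
```

Because the check matches the exact centroid, clients MaxMind localizes to a real city keep their true location, and only the high-volume "lake in Kansas" traffic gets shifted.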
> > >
> > > On Tue, Feb 13, 2018 at 11:10 AM, Rivas, Jesse <
> jesse_ri...@comcast.com>
> > > wrote:
> > > > Steve,
> > > >
> > > > Using the miss location for the DS was a potential solution that
> we
> > > talked about. However, the miss location is intended for use when
> the
> > > client IP falls through MaxMind without any data. Since the default
> > > location doesn't fit this criteria, it was decided to use a profile
> > > parameter to preserve granularity.
> > > >
> > > > Jesse
> > > >
> > > > On 2/13/18, 11:06 AM, "Steve Malenfant" <smalenf...@gmail.com>
> wrote:
> > > >
> > > > Jesse,
> > > >
> > > > I'm not exactly sure how MaxMind return this default value
> but would
> > > there
> > > > be a way to use the MISS location specified in the DS? Seems
> like
> > > that is
> > > > what it was intended for.
> > > >
> > > > Steve
> > > >
> > > > On Tue, Feb 13, 2018 at 12:42 PM, Rivas, Jesse <
> > > jesse_ri...@comcast.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > >
> > > > >
> > > > > At Comcast, we have been seeing a pattern of the same
> cache group
> > > being
> > > > > overloaded nightly as traffic increases on the CDN. The
> cause was
> > > > > determined to be a default location that the geolocation
> provider
> > > MaxMind
> > > > > returns for client IPs that it does not have additional
> data for.
> > > For the
> > > > > US, MaxMind returns a geolocation with the coordinates:
> > > 37.751,-97.822;
> > > > > this is a substantial amount of

Re: Immutable DS CDN - resolving Riak/Postgres data coherency

2018-02-14 Thread Nir Sopher
See WIP PR:
https://github.com/apache/incubator-trafficcontrol/pull/1868/files
Deleting only the latest

On Wed, Feb 14, 2018 at 4:56 PM, Steve Malenfant <smalenf...@gmail.com>
wrote:

> Would deleting the certificate only remove the "latest" copy/alias? The
> certificate and keys should still be retrievable manually.  Yes/No?
>
> On Tue, Feb 13, 2018 at 5:40 PM, Dave Neuman <neu...@apache.org> wrote:
>
> > I think I can get on board with not allowing a user to change the CDN.
> If
> > you want to change the CDN you need to delete your DS and re-create it or
> > create a new DS with a different XML_ID and a regex that matches the
> first
> > DS.
> >
> > We have gone back and forth several times on deleting the keys from riak
> > when you delete a DS.  Each time we decide not to make the change for one
> > reason or another.  The worry is that if you delete a DS and then decide
> > that it was a mistake you now have to generate a whole new certificate
> > which could cost real money.  I am not sure that use-case is common
> enough
> > to warrant us not deleting the certificates for a DS.  For now I am +1 on
> > deleting the certificates when a DS is deleted.
> >
> > Thanks,
> > Dave
> >
> > On Tue, Feb 13, 2018 at 12:14 PM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > >  Hi,
> > >
> > > I created a delivery service and later on realized it is in the wrong
> > CDN.
> > > I then changed the CDN.
> > > The ssl-keys record in Riak kept referring to the old CDN, even after I
> > > generated new certificates. Traffic Router was therefore unable to pull
> > the
> > > certificate.
> > >
> > > See issue 1847
> > > <https://github.com/apache/incubator-trafficcontrol/issues/1847>
> > >
> > > A fix to this issue can be done by changing the code so the record in
> the
> > > Riak is updated along with the DS update.
> > > However, this does not really make sense - if the CDN has changed, the
> > > domain usually changes as well and the certificate is no longer valid.
> > >
> > > Therefore, I suggest entirely blocking DS CDN changes [see
> > > https://github.com/apache/incubator-trafficcontrol/pull/1872].
> > > Additionally, I added a PR for ssl-keys deletion upon DS deletion, so
> > > deleting the DS and recreating it would not cause similar issues.
> > >
> > > Would appreciate community input for other alternatives.
> > >
> > > Thanks,
> > > Nir
> > >
> >
>
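The guard Nir proposes — rejecting any update that changes a delivery service's CDN — might look roughly like the following. This is a hedged sketch, not the actual Traffic Ops implementation in PR 1872; `DeliveryService` and `validateUpdate` are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
)

// DeliveryService is a minimal stand-in for the Traffic Ops DS record.
type DeliveryService struct {
	XMLID string
	CDNID int
}

var errCDNImmutable = errors.New(
	"cannot change the CDN of an existing delivery service; delete and recreate it instead")

// validateUpdate rejects any update that would move a DS to a different
// CDN, since the domain (and hence the certificate) would no longer match.
func validateUpdate(existing, requested DeliveryService) error {
	if existing.CDNID != requested.CDNID {
		return errCDNImmutable
	}
	return nil
}

func main() {
	old := DeliveryService{XMLID: "my-ds", CDNID: 1}
	fmt.Println(validateUpdate(old, DeliveryService{XMLID: "my-ds", CDNID: 2})) // rejected
	fmt.Println(validateUpdate(old, DeliveryService{XMLID: "my-ds", CDNID: 1})) // <nil>
}
```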


Re: Connection leaks traffic stats -> influxdb

2018-02-12 Thread Nir Sopher
Thanks Dave,
I'm working with traffic stats 2.1.0 and influx 1.2.2. I also tried
influx 1.4.3 and found the same issues.
OS: Centos 7.4-1708
Nir

On Mon, Feb 12, 2018 at 2:00 AM, Dave Neuman <neu...@apache.org> wrote:

> Hi Nir,
> I have not seen this issue and I do have some setups with Traffic Stats and
> InfluxDB on the same server.
> A couple of questions:
>   - What version of Traffic Stats are you running?
>   - What version of InfluxDB are you running?
>   - What version of OS are you running?
>
> Thanks,
> Dave
>
>
> On Sun, Feb 11, 2018 at 12:52 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi,
> >
> > On my setup, traffic-stats is installed on the same server as the
> influxdb
> > server.
> > We have noticed that the number of open sockets from stats to influx is
> > constantly increasing.
> > All connections are in state "ESTABLISHED".
> >
> > Did anyone encounter a similar issue?
> > I'm familiar with https://issues.apache.org/jira/browse/TC-373 but
> believe
> > it is a different case.
> >
> > Thanks,
> > Nir
> >
>


Connection leaks traffic stats -> influxdb

2018-02-11 Thread Nir Sopher
Hi,

On my setup, traffic-stats is installed on the same server as the influxdb
server.
We have noticed that the number of open sockets from stats to influx is
constantly increasing.
All connections are in state "ESTABLISHED".

Did anyone encounter a similar issue?
I'm familiar with https://issues.apache.org/jira/browse/TC-373 but believe
it is a different case.

Thanks,
Nir
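Independent of the root cause in Traffic Stats itself, a common source of ever-growing ESTABLISHED sockets in Go HTTP clients is constructing a new client per request or leaving response bodies undrained, which prevents connection reuse. A minimal sketch of the safe pattern against InfluxDB 1.x's HTTP `/write` endpoint — the function name and URL handling are illustrative, not Traffic Stats' actual code:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"time"
)

// A single shared client: reusing one http.Client (and therefore its
// connection pool) avoids opening a new ESTABLISHED socket per write.
var client = &http.Client{Timeout: 10 * time.Second}

// writePoints posts line-protocol data to InfluxDB's /write endpoint.
// Draining and closing the body is what returns the connection to the
// pool; skipping either step is a classic cause of leaked sockets.
func writePoints(url, db, lineProtocol string) error {
	resp, err := client.Post(url+"/write?db="+db, "text/plain",
		bytes.NewBufferString(lineProtocol))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	if resp.StatusCode >= 300 {
		return fmt.Errorf("influx write failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// Hypothetical endpoint; an error is expected when no influxd is running.
	err := writePoints("http://localhost:8086", "cache_stats",
		"bandwidth,cache=edge1 value=42")
	fmt.Println(err)
}
```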


Re: Delivery Service Self-Service

2018-02-05 Thread Nir Sopher
Hi Jeremy and all,

Jeremy, thanks for putting it all together!

I'll be short, as I mostly agree with the points Jeremy raised.

Regarding the "DS regex": like Ryan (IIUC), I believe the operator needs to
be able to configure and control it.
First, because it defines a resource in the operator's domain.
Second, because defining the regex requires a clear methodology to
avoid collisions and reuse/abuse.
Following Rawlin's remark, a reasonable default should be provided, but we
must retain the ability to change it.
Note that path regexes are important as DS identifiers, supporting the SVA
open-caching scheme.

For the last point, DS/server assignment, I believe it should be in the
hands of the operator.
In the future it can be delegated to the user, subject to capacity
management. The user need not be familiar with the actual caches, but can
use filters/tagging to define the caches to be used.

Nir

On Mon, Feb 5, 2018 at 8:23 PM, Rawlin Peters 
wrote:

> Replies inline
>
> On Fri, Feb 2, 2018 at 1:43 PM, Jeremy Mitchell 
> wrote:
> > 2. Manage DS regexes
> >
> > Here's an explanation of this:
> > http://traffic-control-cdn.readthedocs.io/en/latest/
> admin/traffic_ops/using.html#delivery-service-regexp
> >
> > Currently, this requires the Operations role and for good reason. The
> > danger here involves the risk of a normal user entering a bad regex. For
> > example, it is my understanding that the regex in position zero needs to
> > always follow this format: .*\.foo\..*.
> >
> > Maybe with some better API validation we could let normal users manage DS
> > regexes...or maybe these end up going away in favor of something
> > better/easier...not sure yet...
>
> I think the approach that the Traffic Portal takes today is good. Just
> giving a DS a default HOST regex with order = 0 using the xml_id will
> probably cover most use cases for the DS. Then for the cases where
> someone is CNAMEing to the DS FQDN, the DS owner should be able to add
> a max number of HOST regexes with order > 0 matching the CNAME fqdn.
> We should probably just call those "CNAME aliases" or something and
> just expose them as a simple hostname list in the UI rather than as
> HOST regexes with a specific ordering. For a list of aliases, I don't
> think the order really matters at that point as long as they're
> greater than 0 and sequential. That operation could be safe for a
> regular DS owner assuming we validate that the alias is a valid
> hostname (not a regex), unique, and not in use anywhere else in that
> CDN.
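The validation Rawlin suggests — accept a CNAME alias only if it is a literal hostname, not a regex — could be sketched like this. The helper names and the regular expression are hypothetical, not Traffic Ops code:

```go
package main

import (
	"fmt"
	"regexp"
)

// hostnameRE accepts plain DNS names only: alphanumeric labels that may
// contain interior hyphens, separated by dots. Regex metacharacters such
// as '*', '\', or '.' used as a wildcard fail to match.
var hostnameRE = regexp.MustCompile(
	`^([a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$`)

// validAlias reports whether s is safe to store as a CNAME alias, i.e. a
// literal hostname rather than an arbitrary HOST regex.
func validAlias(s string) bool {
	return len(s) <= 253 && hostnameRE.MatchString(s)
}

func main() {
	for _, a := range []string{"video.example.com", `.*\.foo\..*`, "bad_host!"} {
		fmt.Printf("%-20q valid=%v\n", a, validAlias(a))
	}
}
```

With a check like this in the API, regular DS owners could add aliases safely while wildcard HOST regexes remain restricted to privileged roles.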
>
> We might want to prohibit creating multiple HOST regexes with wildcard
> characters (i.e. non-CNAME-alias)...I'm not even sure there's a valid
> use case for that.
>
> I'm not totally familiar with the usage of PATH and HEADER regexes,
> but they both seem like they should be secondary to the primary HOST
> regex that's created by default (e.g. the request should be matched
> against all primary HOST regexes first before checking against the
> other types). Right now I can create a PATH regex that essentially
> black-holes other DSes (which ones get black-holed depends on the
> order DSes come in CRConfig.json, which seems non-deterministic). So
> we don't want to allow a regular DS owner to modify the PATH and
> HEADER types unless we modify TR to guarantee that primary-secondary
> relationship between HOST and PATH/HEADER regexes.
>
> > 6. Manage DS targets (steering* only)
> >
> > Here's an explanation of this:
> > http://traffic-control-cdn.readthedocs.io/en/latest/
> admin/quick_howto/steering.html?highlight=steering
> >
> > Currently, to manage DS targets requires the Admin or Steering role. Is
> > there any harm in allowing a normal user to "steer" their delivery
> service
> > to another delivery service as long as the target delivery service falls
> in
> > their tenancy?
>
> I think this should be alright as long as *all* target DSes of the
> Steering DS fall in their tenancy.
>
> -Rawlin
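The tenancy check Rawlin describes — allow the steering assignment only if *all* target DSes fall in the user's tenancy — could be sketched as below, assuming the hierarchical tenant model; the `parent` map and tenant IDs are made up for illustration:

```go
package main

import "fmt"

// parent maps a tenant ID to its parent tenant ID; 0 means no parent.
var parent = map[int]int{10: 1, 11: 10, 20: 1}

// inTenancy reports whether tenant t is userTenant itself or one of its
// descendants, walking the parent chain upward.
func inTenancy(userTenant, t int) bool {
	for ; t != 0; t = parent[t] {
		if t == userTenant {
			return true
		}
	}
	return false
}

// canSteer allows the assignment only if every target DS tenant is
// visible to the user - one foreign target rejects the whole set.
func canSteer(userTenant int, targetTenants []int) bool {
	for _, t := range targetTenants {
		if !inTenancy(userTenant, t) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(canSteer(10, []int{10, 11})) // true: both in the user's subtree
	fmt.Println(canSteer(10, []int{11, 20})) // false: 20 is a sibling branch
}
```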
>
> On Fri, Feb 2, 2018 at 1:43 PM, Jeremy Mitchell 
> wrote:
> > As we move in the direction of self-service, there are a few obstacles
> that
> > need to be overcome and I'd like to discuss them a bit so grab a cup of
> > coffee...
> >
> > When I say self-service, what I really mean is "delivery service
> > self-service" or the ability to manage your own delivery services (as
> > dictated by tenancy) and everything related to those delivery services.
> > "Everything" includes the following (afaik):
> >
> > 1. Create/Read/Update/Delete delivery services
> > 2. Manage DS regexes
> > 3. Manage DS SSL keys (if applicable)
> > 4. Manage DS URL sig keys (if applicable)
> > 5. Manage DS URI signing keys (if applicable)
> > 6. Manage DS targets (steering* only)
> > 7. Creating DS invalidate content jobs
> > 8. Manage DS / cache assignments
> >
> > If you can't do 1-7 yourself, it's not really self-service is it? #8 is
> > 

Traffic Router Fail - Too Many Open Sockets

2018-02-01 Thread Nir Sopher
Hi,

One of my routers got stuck today, not being able to answer http requests
(routing and API).
When trying to investigate the issue, I found catalina.log with a lot of
messages complaining on failure to open a socket due to too many open
files. See example below.
No issues were found in the log prior to that point, beyond periodic
warnings about pulling the certificates every 5 minutes.

When trying to understand "what are these open files", I found about 4k
open connections in "CLOSE_WAIT" towards the monitor.
Note: I'm running TC2.1 RC3 with golang traffic-monitor.

Has anyone encountered a similar issue?
Are the warnings for pulling the certificates a normal thing?

Thanks,
Nir

Feb 01, 2018 7:33:09 AM
com.comcast.cdn.traffic_control.traffic_router.secure.CertificateRegistry
importCertificateDataList
WARNING: Imported handshake data with alias my-ds.my-cdn.com
Feb 01, 2018 8:43:13 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor run
SEVERE: Socket accept failed
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
at
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
at
org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:1309)
at java.lang.Thread.run(Thread.java:745)

Feb 01, 2018 8:43:14 AM org.apache.tomcat.util.net.NioEndpoint$Acceptor run
SEVERE: Socket accept failed
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
at
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
at
org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:1309)
at java.lang.Thread.run(Thread.java:745)
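Sockets stuck in CLOSE_WAIT count against the process's open-file limit just like regular files, so a cheap watchdog on the descriptor count can surface a leak like this before the accept loop starts failing. Traffic Router is Java, but the failure mode is generic; a hedged Go sketch for Linux, with an illustrative threshold:

```go
package main

import (
	"fmt"
	"os"
)

// openFDs counts this process's open file descriptors on Linux by
// listing /proc/self/fd - a cheap health check against fd exhaustion.
func openFDs() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// nearLimit reports whether the open-fd count exceeds the given fraction
// of a soft limit (e.g. alert at 80% of `ulimit -n` before accepts fail).
func nearLimit(open, limit int, frac float64) bool {
	return float64(open) >= frac*float64(limit)
}

func main() {
	n, err := openFDs()
	if err != nil {
		fmt.Println("no /proc on this platform:", err)
		return
	}
	fmt.Printf("open fds: %d, near 80%% of 1024: %v\n", n, nearLimit(n, 1024, 0.8))
}
```

In the incident above, the ~4k CLOSE_WAIT connections toward the monitor indicate the peer closed its side and the router never closed its own, which such a check would have flagged well before "Too many open files".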


Re: [VOTE] Release Apache Traffic Control (incubating) 2.1.0-RC3

2017-12-20 Thread Nir Sopher
+1

On Thu, Dec 21, 2017 at 5:18 AM, Jeremy Mitchell 
wrote:

> +1
>
> On Wed, Dec 20, 2017 at 5:34 PM, Steve Malenfant 
> wrote:
>
> > +1
> >
> > On Wed, Dec 20, 2017 at 5:14 PM, Dave Neuman  wrote:
> >
> > > +1
> > >
> > > On Wed, Dec 20, 2017 at 8:33 AM, Hank Beatty 
> wrote:
> > >
> > > > Hello All,
> > > >
> > > > I've prepared a release for v2.1.0-RC3
> > > >
> > > > The vote is open for at least 72 hours and passes if a majority of at
> > > > least 3 +1 PPMC votes are cast.
> > > >
> > > > [ ] +1 Approve the release
> > > >
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > Changes since 2.0.0:
> > > > https://github.com/apache/incubator-trafficcontrol/compare/
> > > > 2.0.x...RELEASE-2.1.0-RC3
> > > >
> > > > This corresponds to git:
> > > >   Hash: 1dcd512f7e2b4898b090837cd3f260e453896e32
> > > >   Tag: RELEASE-2.1.0-RC3
> > > >
> > > > Which can be verified with the following: git tag -v
> RELEASE-2.1.0-RC3
> > > >
> > > > My code signing key is available here:
> > > > https://pgp.mit.edu/pks/lookup?op=get&search=0x920152B94E0CC77C
> > > >
> > > > The source .tar.gz file, pgp signature (.asc signed with my key from
> > > > above), md5 and sha512 checksums are provided here:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/incubator/
> > > trafficcontrol/2.1.0/RC3
> > > >
> > > > The new proposed download page can be found here:
> > > >
> > > > https://trafficcontrol.incubator.apache.org/downloads/index-new.html
> > > >
> > > > Thanks!
> > > >
> > >
> >
>


Re: [VOTE] Release Apache Traffic Control (incubating) 2.1.0-RC1

2017-10-25 Thread Nir Sopher
Hi,

Trying to build 2.1.x latest on my centos7 VM, I get the below error during
traffic-portal build (due to the recent RedHat openssl issues).
npm: relocation error: npm: symbol SSL_set_cert_cb, version libssl.so.10
not defined in file libssl.so.10 with link time reference

I overcame the issue by changing the image in the Dockerfile to
centos:7.4.1708

We should probably fix it, so "-1".

Would we like to avoid future similar issues by locking the dependencies
using our own Docker images for the build?
We can create our own Docker images, each with all relevant RPMs required
for the specific component build, and maintain them in Docker hub or some
other repository.

Nir


On Tue, Oct 24, 2017 at 2:26 PM, Steve Malenfant 
wrote:

> Are there any Release Notes associated with this release? 1,337 changes and
> the link above will only display 250 of them.
>
> Steve
>
> On Mon, Oct 23, 2017 at 4:01 PM, Hank Beatty  wrote:
>
> > Hello All,
> >
> > I've prepared a release for v2.1.0-RC1
> >
> > The vote is open for at least 72 hours and passes if a majority of at
> > least 3 +1 PPMC votes are cast.
> >
> > [ ] +1 Approve the release
> >
> > [ ] -1 Do not release this package because ...
> >
> > Changes since 2.0.0:
> > https://github.com/apache/incubator-trafficcontrol/compare/
> > 2.0.x...RELEASE-2.1.0-RC1
> >
> > This corresponds to git:
> >  Hash: 6ea2ca86d07c16a3b3ca419dd56b975059271206
> >  Tag: RELEASE-2.1.0-RC1
> >
> > Which can be verified with the following: git tag -v RELEASE-2.1.0-RC1
> >
> > My code signing key is available here:
> > https://pgp.mit.edu/pks/lookup?op=get&search=0x582D3F6E79270895
> >
> > Make sure you refresh from a key server to get all relevant signatures.
> >
> > The source .tar.gz file, pgp signature (.asc signed with my key from
> > above), md5 and sha512 checksums are provided here:
> > https://dist.apache.org/repos/dist/dev/incubator/
> trafficcontrol/2.1.0/RC1
> >
> > Thanks!
> >
> >
>


Re: Promote Golang Traffic Monitor to Default

2017-10-23 Thread Nir Sopher
Hi,

What would be the content of 2.2?
If we want 2.2 to have very limited content, as suggested at the summit, I
would suggest keeping the Java TM and removing it only in TC 2.3.

If the 2.2 version has substantial content, I would consider leaving the
old TM in the release a liability: the old TM would have to be adjusted to
the changes and tested regularly.
So in that case, if there are no automated tests covering its
functionality, I would suggest removing the Java TM from the code base.

Nir

On Mon, Oct 23, 2017 at 5:58 PM, Jeff Elsloo  wrote:

> Hi all,
>
> Apologies for the delay, and thanks to Rob for submitting PR 1427 to
> take care of this. I just merged his PR and that means that
> `traffic_monitor` has been renamed to `traffic_monitor_java` and
> `traffic_monitor_golang` has been renamed to `traffic_monitor` (thanks
> Rob!). This means that we are now one step closer to formally retiring
> the Java version of Traffic Monitor.
>
> Before proposing a vote, I'd like to get a feel for how quickly we can
> do the formal retirement. We're currently working on 2.1 so that means
> that we could retire it as early as 2.2. If we want to be more
> conservative, we could keep both with the renamed structure for 2.2,
> and remove the Java version in 2.3. This is the direction I'm leaning,
> though I'd like to hear from interested parties first.
>
> Thoughts?
> --
> Thanks,
> Jeff
>
>
> On Mon, Jul 24, 2017 at 8:23 AM, Jeff Elsloo  wrote:
> > It sounds like we do not have any -1s on this, so I'm going to assume
> > we're good to make this change. I have some other things to focus on
> > at the moment, but will try to get this done as time permits. I'll
> > send another email out with details when I go to make the change, and
> > will allow some time before pushing anything in case someone has
> > concerns.
> > --
> > Thanks,
> > Jeff
> >
> >
> > On Mon, Jul 17, 2017 at 2:14 PM, Dave Neuman  wrote:
> >> +1 on the rename
> >>
> >> On Mon, Jul 17, 2017 at 10:23 AM, Jan van Doorn 
> wrote:
> >>
> >>> +1
> >>>
> >>> On Mon, Jul 17, 2017 at 9:47 AM Dewayne Richardson 
> >>> wrote:
> >>>
> >>> > +1
> >>> >
> >>> > On Fri, Jul 14, 2017 at 2:49 PM, Jeff Elsloo 
> wrote:
> >>> >
> >>> > > For the most part, it's a drop in replacement for the Java version,
> >>> > > and based on our own experience it seems to work exactly as the
> Java
> >>> > > version would, including co-existence. There is a TO API dependency
> >>> > > for monitoring.json that the Java version does not have, and I'm
> not
> >>> > > sure what the history is with that endpoint and how far back we
> could
> >>> > > remain compatible. Traffic Router does not care what version of
> >>> > > Traffic Monitor it talks to, as the format of cr-states.json has
> not
> >>> > > changed. Same goes for TM and ATS. I believe we had co-existence
> >>> > > running in production going back to the 1.8.x releases.
> >>> > >
> >>> > > Keep in mind that the intent is to drive users toward using the
> Golang
> >>> > > component by default starting with the 2.1.0 (or maybe 2.2.0?)
> release
> >>> > > while still allowing one to build, run, or contribute to the Java
> >>> > > version until our next major release (3.0.0). The intent is not to
> >>> > > give people a drop in replacement that works with prior versions;
> we
> >>> > > have not tested that thoroughly across all versions, and while it
> >>> > > might work, we should think of the Golang Traffic Monitor as a
> 2.0.x
> >>> > > and onward component. I think that statement holds for most of our
> >>> > > components; we wouldn't want to run a 1.7 Traffic Stats with a
> 2.0.0
> >>> > > Traffic Ops system. 1.7 is ancient, and have we ever really done
> >>> > > backward compatibility testing with versions?
> >>> > >
> >>> > > To this end, if we do decide to make the Golang version the
> default in
> >>> > > the future, at a minimum we will need to provide release notes that
> >>> > > explain how to convert the Java configuration to the Golang
> version's
> >>> > > config. Ideally we would provide a simple script to convert the
> >>> > > configuration for our users, potentially running it as a
> postinstall
> >>> > > scriptlet in the RPM if the Java version is already installed.
> >>> > > Theoretically we could `yum upgrade traffic_monitor` and seamlessly
> >>> > > move from Java to Golang.
> >>> > > --
> >>> > > Thanks,
> >>> > > Jeff
> >>> > >
> >>> > >
> >>> > > On Fri, Jul 14, 2017 at 2:07 PM, Eric Friedrich (efriedri)
> >>> > >  wrote:
> >>> > > > I think I remember Rob making this point in Miami, but all of TMs
> >>> APIs
> >>> > > (REST, CRConfig, Health.json, etc…) are identical between the Java
> and
> >>> > > Golang version, right?
> >>> > > >
> >>> > > > 

Re: Building Traffic-Router - Failed to bring jdnssec-tools

2017-10-10 Thread Nir Sopher
Thanks Steve for the suggestion.
However, it did not help.
Apparently there is no access from Israel to verisignlabs.com.
I'll discuss with our IT team ways to resolve it; for now, I can run on US
servers.
Nir


On Sat, Oct 7, 2017 at 2:07 AM, Steve Malenfant <smalenf...@gmail.com>
wrote:

> Nir,
>
> Try to add "network_mode: bridge" under your services inside of the
> docker-compose file. I know I had to do this to build under Linux, but
> works fine under OSX Docker Engine.
>
> We do have the problem here since we don't allow docker to manage iptables.
> This problem is limited to docker-compose as it will create a new network.
>
> https://docs.docker.com/compose/compose-file/#network_mode
>
> Example :
>   traffic_router_build:
> image: traffic_router_builder
> build:
>   dockerfile: infrastructure/docker/build/Dockerfile-traffic_router
>   context: ../../..
> volumes:
>   - ../../..:/trafficcontrol
> network_mode: bridge
>
> Let me know if that works for you.
>
> Steve
>
> On Thu, Oct 5, 2017 at 5:55 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi Rawlin and thank you very much for your help!
> > The link to http://www.verisignlabs.com/jdnssec-tools/packages/
> > old-releases/jdnssec-tools-0.12.tar.gz is not available from our
> > environment (in Israel).
> > Curl first tries IPv4 and, when that fails, falls back to IPv6 (see
> > output below).
> > Pinging verisignlabs.com also fails. It tries to reach an IPv4 address,
> > but fails to connect - as if I'm blocked by a firewall.
> >
> > I'll discuss with our IT team ways to resolve the issue, possibly trying
> > to contact verisignlabs (is there any contact information on their
> > website?).
> > 10x,
> > Nir
> >
> > curl -vvv
> > http://www.verisignlabs.com/jdnssec-tools/packages/old-
> > releases/jdnssec-tools-0.12.tar.gz
> > * About to connect() to www.verisignlabs.com port 80 (#0)
> > *   Trying 72.13.58.64... Connection timed out
> > *   Trying 2620:74:13:4400::201... Failed to connect to
> > 2620:74:13:4400::201: Network is unreachable
> > * Success
> > * couldn't connect to host
> > * Closing connection #0
> > curl: (7) Failed to connect to 2620:74:13:4400::201: Network is
> unreachable
> >
> >
> > On Thu, Oct 5, 2017 at 11:49 PM, Rawlin Peters <rawlin.pet...@gmail.com>
> > wrote:
> >
> > > Hey Nir,
> > >
> > > Are you still having build issues?
> > >
> > > I found an interesting tidbit from `curl --manual` using Curl version
> > > 7.29.0 (x86_64-redhat-linux-gnu) inside a centos:7 Docker container
> > > (what the build uses):
> > >
> > > IPv6
> > >
> > >   curl will connect to a server with IPv6 when a host lookup returns an
> > > IPv6
> > >   address and fall back to IPv4 if the connection fails. The --ipv4 and
> > > --ipv6
> > >   options can specify which address to use when both are available.
> > >
> > > So assuming curl would've fallen back to an IPv4 address if it was
> > > able to get an A record, I think we can assume that in this case your
> > > local resolver did not get an A record when it resolved that hostname
> > > or your build environment is ipv6-only. Is it possible that happened
> > > in your environment, Nir?
> > >
> > > To fix builds in IPv6-only environments, I think we'd have to
> > > configure the docker network to enable ipv6. This doesn't appear
> > > possible using docker-compose format version 2 (what the build
> > > currently uses), but maybe in format version 2.1 [1]. However,
> > > enabling IPv6 might then require an IPv6-enabled host, and IPv6
> > > doesn't appear to be supported on at least Docker For Mac [2]. On
> > > operating systems that support it, maybe you'd just have to configure
> > > the Docker daemon for IPv6 and update the docker-compose.yml file to
> > > enable it for the build.
> > >
> > > - Rawlin
> > >
> > > [1] https://docs.docker.com/compose/compose-file/compose-
> > file-v2/#network-
> > > configuration-reference
> > > [2] https://docs.docker.com/docker-for-mac/troubleshoot/#known-issues
> > >
> > > On Tue, Oct 3, 2017 at 10:06 PM, Mark Torluemke <mtorlue...@apache.org
> >
> > > wrote:
> > > > I think we should be resilient and try b

Re: Building Traffic-Router - Failed to bring jdnssec-tools

2017-10-05 Thread Nir Sopher
Hi Rawlin and thank you very much for your help!
The link to http://www.verisignlabs.com/jdnssec-tools/packages/
old-releases/jdnssec-tools-0.12.tar.gz is not available from our
environment (in Israel).
Curl first tries IPv4 and, when that fails, falls back to IPv6 (see output
below).
Pinging verisignlabs.com also fails. It tries to reach an IPv4 address, but
fails to connect to it - as if I'm blocked by a firewall.

I'll discuss with our IT team ways to resolve the issue, possibly trying to
contact verisignlabs (is there any contact information on their website?).
10x,
Nir

curl -vvv
http://www.verisignlabs.com/jdnssec-tools/packages/old-releases/jdnssec-tools-0.12.tar.gz
* About to connect() to www.verisignlabs.com port 80 (#0)
*   Trying 72.13.58.64... Connection timed out
*   Trying 2620:74:13:4400::201... Failed to connect to
2620:74:13:4400::201: Network is unreachable
* Success
* couldn't connect to host
* Closing connection #0
curl: (7) Failed to connect to 2620:74:13:4400::201: Network is unreachable


On Thu, Oct 5, 2017 at 11:49 PM, Rawlin Peters <rawlin.pet...@gmail.com>
wrote:

> Hey Nir,
>
> Are you still having build issues?
>
> I found an interesting tidbit from `curl --manual` using Curl version
> 7.29.0 (x86_64-redhat-linux-gnu) inside a centos:7 Docker container
> (what the build uses):
>
> IPv6
>
>   curl will connect to a server with IPv6 when a host lookup returns an
> IPv6
>   address and fall back to IPv4 if the connection fails. The --ipv4 and
> --ipv6
>   options can specify which address to use when both are available.
>
> So assuming curl would've fallen back to an IPv4 address if it was
> able to get an A record, I think we can assume that in this case your
> local resolver did not get an A record when it resolved that hostname
> or your build environment is ipv6-only. Is it possible that happened
> in your environment, Nir?
>
> To fix builds in IPv6-only environments, I think we'd have to
> configure the docker network to enable ipv6. This doesn't appear
> possible using docker-compose format version 2 (what the build
> currently uses), but maybe in format version 2.1 [1]. However,
> enabling IPv6 might then require an IPv6-enabled host, and IPv6
> doesn't appear to be supported on at least Docker For Mac [2]. On
> operating systems that support it, maybe you'd just have to configure
> the Docker daemon for IPv6 and update the docker-compose.yml file to
> enable it for the build.
>
> - Rawlin
>
> [1] https://docs.docker.com/compose/compose-file/compose-file-v2/#network-
> configuration-reference
> [2] https://docs.docker.com/docker-for-mac/troubleshoot/#known-issues
>
> On Tue, Oct 3, 2017 at 10:06 PM, Mark Torluemke <mtorlue...@apache.org>
> wrote:
> > I think we should be resilient and try both address families...curl might
> > even do this 'for free' if we enable retries.
> >
> > On Tue, Oct 3, 2017 at 3:21 PM, Rawlin Peters <rawlin.pet...@gmail.com>
> > wrote:
> >
> >> It's possible that Docker isn't playing nicely with IPv6 in your build
> >> environment. The RPM build script is curling
> >> http://www.verisignlabs.com/jdnssec-tools/packages/old-
> >> releases/jdnssec-tools-0.12.tar.gz,
> >> and in your case is using the  record for some reason. My guess is
> >> that the container doing the build probably only routes IPv4 by
> >> default in some environments. Checking in my build environment, none
> >> of the Docker networks have IPv6 enabled.
> >>
> >> Should we pass `-4` to the curl command here [1] to force it to
> >> resolve to IPv4 addresses only?
> >>
> >> - Rawlin
> >>
> >> [1] https://github.com/apache/incubator-trafficcontrol/blob/
> >> master/traffic_router/build/build_rpm.sh#L41
> >>
> >> On Tue, Oct 3, 2017 at 3:02 PM, Nir Sopher <n...@qwilt.com> wrote:
> >> > I now see that "./pkg traffic_portal_build" fails as well. This time
> with
> >> > no log.
> >> > It worked before, back when I was building it from master.
> >> > Where is jdnssec brought from? Is it built during the process? I
> failed
> >> to
> >> > find it in the standard public repositories.
> >> > Nir
> >> >
> >> > On Tue, Oct 3, 2017 at 11:56 PM, David Neuman <
> david.neuma...@gmail.com>
> >> > wrote:
> >> >
> >> >> I have not seen this issue.  It's interesting that it is trying ipv6
> for
> >> >> that.
> >> >>
> >> >> On Tue, Oct 3, 2017 a

Re: Building Traffic-Router - Failed to bring jdnssec-tools

2017-10-03 Thread Nir Sopher
I now see that "./pkg traffic_portal_build" fails as well. This time with
no log.
It worked before, back when I was building it from master.
Where is jdnssec brought from? Is it built during the process? I failed to
find it in the standard public repositories.
Nir

On Tue, Oct 3, 2017 at 11:56 PM, David Neuman <david.neuma...@gmail.com>
wrote:

> I have not seen this issue.  It's interesting that it is trying ipv6 for
> that.
>
> On Tue, Oct 3, 2017 at 2:33 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi,
> >
> > Yesterday I tried to build the latest 2.1.x traffic-control, calling the
> > ./pkg command.
> > The command failed on the traffic-router build; according to the log
> > below, it is related to fetching the jdnssec-tools library - I'm not
> > sure from which repository.
> >
> > Has anybody else encountered a similar issue?
> >
> > Thanks,
> > Nir
> >
> > Building the rpm.
> >   % Total% Received % Xferd  Average Speed   TimeTime Time
> > Current
> >  Dload  Upload   Total   SpentLeft
> > Speed
> >   0 00 00 0  0  0 --:--:--  0:02:07 --:--:--
> >  0curl: (7) Failed to connect to 2620:74:13:4400::201: Network is
> > unreachable
> > Could not download required jdnssec-tools-0.12 library: 7
> >
>


Building Traffic-Router - Failed to bring jdnssec-tools

2017-10-03 Thread Nir Sopher
Hi,

Yesterday I tried to build the latest 2.1.x traffic-control, calling the
./pkg command.
The command failed on the traffic-router build; according to the log below,
it is related to fetching the jdnssec-tools library - I'm not sure from
which repository.

Has anybody else encountered a similar issue?

Thanks,
Nir

Building the rpm.
  % Total% Received % Xferd  Average Speed   TimeTime Time
Current
 Dload  Upload   Total   SpentLeft
Speed
  0 00 00 0  0  0 --:--:--  0:02:07 --:--:--
 0curl: (7) Failed to connect to 2620:74:13:4400::201: Network is
unreachable
Could not download required jdnssec-tools-0.12 library: 7
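Elsewhere in this thread Rawlin suggests forcing the download onto IPv4 (curl's `-4` flag). The same idea at the HTTP-client level can be sketched in Go: constrain the dialer to `tcp4` so an unreachable IPv6 route can never be selected. A hedged sketch, not part of the build scripts:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// newIPv4Client returns an http.Client whose dialer only uses "tcp4",
// mirroring what `curl -4` does for the jdnssec-tools download.
func newIPv4Client() *http.Client {
	dialer := &net.Dialer{Timeout: 10 * time.Second}
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
				// Ignore the requested network and force IPv4.
				return dialer.DialContext(ctx, "tcp4", addr)
			},
		},
	}
}

func main() {
	c := newIPv4Client()
	// Sample request to a local closed port: fails fast, demonstrating usage
	// without depending on external connectivity.
	_, err := c.Get("http://127.0.0.1:1/")
	fmt.Println("ipv4-only client ready; sample request errored:", err != nil)
}
```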


Re: Preventing routing to individual caches

2017-08-23 Thread Nir Sopher
Hi,

From the conversation so far it feels like a new "server type" is needed,
and *maybe* some way to mark delivery services to be deployed on this kind
of servers as well.
If the "marking" is required, it can also be done in the (to be discussed)
"deployed DS versions" table.

Either way, please take into consideration self-service and tenancy - the
variables of a DS in the DS table should be managed by the DS owner. As I
see it, adding "CDN deployment related parameters" to the DS itself gives
the DS owner control over something he should not have.

Nir

On Wed, Aug 23, 2017 at 10:45 AM, Gelinas, Derek <derek_geli...@comcast.com>
wrote:

> I suppose it could be a new type.  How are you thinking we'd implement in
> that case? Off the cuff I'm thinking we could have a type filter ds
> parameter which would list the various server types we want routed.  In
> that way we could differentiate between the default edge type and something
> else.  Doing it that way would require a bit of retooling the query used to
> generate the delivery service list, but that's about it.  Though in that
> case it really wouldn't need to be a different delivery service type, so I
> suspect you've something else in mind.
>
> On Aug 23, 2017, at 12:14 AM, Eric Friedrich (efriedri) <efrie...@cisco.com> wrote:
>
> Could this be a new DS type or does it apply to a whole server?
> 
> From: Gelinas, Derek [derek_geli...@comcast.com]
> Sent: Tuesday, August 22, 2017 6:22 PM
> To: dev@trafficcontrol.incubator.apache.org
> Subject: Re: Preventing routing to individual caches
>
> The use case is fairly specific.  Suffice it to say we have reverse
> proxies that need configuration without being treated as potential
> destinations by traffic router.
>
> DG
>
> On Aug 22, 2017, at 3:19 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> Hi Derek,
>
> Could you please shed more light on the problem you are trying to solve?
>
> As I see it, option #3 is indeed more flexible - as it can work in a DS
> granularity.
> It is even more powerful when you combine it with other extensions for this
> table suggested in the "Drop Server -> Delivery Service assignments".
>
> However, as you describe #1 and #2 as valid options, it seems that the
> problem you are dealing with completely resides in the "servers" domain -
> as the server should have the same behavior for all delivery-services.
> Therefore, option #1 might be more suitable.
>
> Nir
>
> On Tue, Aug 22, 2017 at 8:45 PM, Gelinas, Derek <derek_geli...@comcast.com>
> wrote:
>
> I'd agree with you if this was designed to drain, but this is intended as
> a permanent state for a pretty good long list of caches.
>
> DG
>
> On Aug 22, 2017, at 1:28 PM, Eric Friedrich (efriedri) <efrie...@cisco.com>
> wrote:
>
> What about a modification of option 1- adding a new state per server.
>
> Instead of ADMIN_DOWN, it could be "REPORTED_DRAIN" to indicate the
> difference
>
> -Eric
>
> On Aug 22, 2017, at 1:14 PM, Gelinas, Derek <derek_geli...@comcast.com>
> wrote:
>
> That's actually the workaround we're using at the moment - setting them
> to admin_down.  That's a temporary measure, though - we want something more
> permanent.
>
> DG
> On Aug 22, 2017, at 1:09 PM, Eric Friedrich (efriedri) <efrie...@cisco.com>
> wrote:
>
> How does your use case differ from marking a server as offline in
> Traffic Ops and snapshotting?
>
> Thats the easiest way I can think of to get a server in this state
>
> -Eric
>
> On Aug 22, 2017, at 1:00 PM, Gelinas, Derek <derek_geli...@comcast.com> wrote:
>
> We've run across a situation in which we need certain caches to
> simultaneously have map rules for a delivery service, but not actually have
> those caches routed to when requests are made via traffic router.
> Essentially, this means removing the delivery service from the cache's info
> in the crconfig file.
>
> There's been a bit of internal debate about the best ways to do this,
> and I'd like to collect some opinions on the

Re: Traffic Ops Golang Rewrite

2017-08-15 Thread Nir Sopher
Got you. 10x,
Nir

On Mon, Aug 14, 2017 at 5:37 PM, Dewayne Richardson <dewr...@gmail.com>
wrote:

> The goal is to hit all the low hanging fruit first to discover all the
> refactor points and infrastructure before we attempt to bite off the bigger
> more impactful areas of Traffic Control like Tenancy and Capabilities.
>
> -Dew
>
> On Mon, Aug 14, 2017 at 5:22 AM, Nir Sopher <n...@qwilt.com> wrote:
>
> > +1!
> >
> > As far as tenancy is concerned, the main logic is held in the
> > "Utils::Tenant" class.
> > I would be happy to use this class as an entry point to the Golang TO
> > development.
> >
> > BTW, the UT coverage of the tenancy checks is quite extensive.
> > Are we going to migrate the UTs as well, write new UTs in Golang, or keep
> > working with the Perl UTs?
> >
> > Nir
> >
> > On Wed, Aug 9, 2017 at 11:14 PM, Durfey, Ryan <ryan_dur...@comcast.com>
> > wrote:
> >
> > > Great write up Dewayne.  Given the length and forward looking nature of
> > > the topic I created a wiki page around this under traffic ops.  Please
> > > continue discussion in the email thread and I will summarize any
> changes
> > > there.
> > >
> > > https://cwiki.apache.org/confluence/display/TC/Golang+
> > > Traffic+Ops+Replacement+-+Vampire+Proxy
> > >
> > > Ryan Durfey M | 303-524-5099
> > > CDN Support (24x7): 866-405-2993 or cdn_supp...@comcast.com
> > >
> > >
> > > From: Dewayne Richardson <dewr...@gmail.com>
> > > Reply-To: "dev@trafficcontrol.incubator.apache.org" <
> > > dev@trafficcontrol.incubator.apache.org>
> > > Date: Wednesday, August 9, 2017 at 1:03 PM
> > > To: "dev@trafficcontrol.incubator.apache.org" <
> > > dev@trafficcontrol.incubator.apache.org>
> > > Subject: Traffic Ops Golang Rewrite
> > >
> > > Sorry for the TL;DR, but a lot of information needed to be conveyed.
> So,
> > > based upon the TO Rewrite discussions Rob Butts and I have been working
> > on
> > > a Golang proxy (Mark Torluemke, affectionately calls "the Vampire
> Proxy")
> > > that initially implements the */monitoring* endpoint to lay down the
> > > foundation for rewriting more Traffic Ops endpoints in Go.  The goal is
> > to
> > > do a straight rewrite in Go (that implements any old API's as well as
> any
> > > new ones that follows the /api/1.2 format).  The intent is to make this
> > > proxy 100% backward compatible (including any HTTP header requirements)
> > to
> > > keep the existing TO API clients from breaking.
> > >
> > > This PR is significant because when postinstall runs (after this PR is
> > > merged) it switches the ports according to the following:
> > >
> > > *TO Port Change Overview*
> > > Port *443* will be owned by Golang proxy
> > >
> > > Port *60443* will now be owned by the Mojolicious/Perl Hypnotoad
> service
> > >
> > > See */opt/traffic_ops/app/conf/cdn.conf* for a new property
> > > *traffic_ops_golang_port
> > > => '443'*.
> > >
> > > *Important Operational Changes:*
> > >
> > > *traffic_ops service*
> > > The Golang Proxy Service is now combined with the *traffic_ops*
> service,
> > so
> > > when traffic_ops is restarted so is the Golang Proxy Service.  Since
> the
> > > Golang service is a proxy any APIs that are not implemented in the
> Golang
> > > Service will be forwarded to the existing TO Perl API.  Also, If the
> API
> > is
> > > implemented in the Golang Service the response is serviced by the Proxy
> > > (where it will access the Postgres database as needed).
> > >
> > > *traffic_ops logs*
> > > *access.log - *old access.log is now renamed to *perl_access.log*, and
> > the
> > > Go proxy now takes over the *access.log* while logging in the *exact*
> > > format
> > > as before (no monitoring or tooling changes are required)
> > > *traffic_ops_golang.log* - this is a new file where any errors/debug
> will
> > > be logged from the Go proxy.
> > > *perl_access.log - the existing Mojolicious access.log gets a new name*
> > > *traffic_ops.log* - existing Mojolicious debug file (no change)
> > >
> > > There was a lot of debate and discussion about how to move forward and
> > this
> > > approach was less impactful to operations (w

Re: Traffic Ops Golang Rewrite

2017-08-14 Thread Nir Sopher
+1!

As far as tenancy is concerned, the main logic is held in the
"Utils::Tenant" class.
I would be happy to use this class as an entry point to the Golang TO
development.

BTW, the UT coverage of the tenancy checks is quite extensive.
Are we going to migrate the UTs as well, write new UTs in Golang, or keep
working with the Perl UTs?

Nir

On Wed, Aug 9, 2017 at 11:14 PM, Durfey, Ryan 
wrote:

> Great write up Dewayne.  Given the length and forward looking nature of
> the topic I created a wiki page around this under traffic ops.  Please
> continue discussion in the email thread and I will summarize any changes
> there.
>
> https://cwiki.apache.org/confluence/display/TC/Golang+
> Traffic+Ops+Replacement+-+Vampire+Proxy
>
> Ryan Durfey M | 303-524-5099
> CDN Support (24x7): 866-405-2993 or cdn_supp...@comcast.com
>
>
> From: Dewayne Richardson 
> Reply-To: "dev@trafficcontrol.incubator.apache.org" <
> dev@trafficcontrol.incubator.apache.org>
> Date: Wednesday, August 9, 2017 at 1:03 PM
> To: "dev@trafficcontrol.incubator.apache.org" <
> dev@trafficcontrol.incubator.apache.org>
> Subject: Traffic Ops Golang Rewrite
>
> Sorry for the TL;DR, but a lot of information needed to be conveyed.  So,
> based upon the TO Rewrite discussions Rob Butts and I have been working on
> a Golang proxy (which Mark Torluemke affectionately calls "the Vampire Proxy")
> that initially implements the */monitoring* endpoint to lay down the
> foundation for rewriting more Traffic Ops endpoints in Go.  The goal is to
> do a straight rewrite in Go (that implements any old API's as well as any
> new ones that follow the /api/1.2 format).  The intent is to make this
> proxy 100% backward compatible (including any HTTP header requirements) to
> keep the existing TO API clients from breaking.
>
> This PR is significant because when postinstall runs (after this PR is
> merged) it switches the ports according to the following:
>
> *TO Port Change Overview*
> Port *443* will be owned by Golang proxy
>
> Port *60443* will now be owned by the Mojolicious/Perl Hypnotoad service
>
> See */opt/traffic_ops/app/conf/cdn.conf* for a new property
> *traffic_ops_golang_port
> => '443'*.
>
> *Important Operational Changes:*
>
> *traffic_ops service*
> The Golang Proxy Service is now combined with the *traffic_ops* service, so
> when traffic_ops is restarted so is the Golang Proxy Service.  Since the
> Golang service is a proxy any APIs that are not implemented in the Golang
> Service will be forwarded to the existing TO Perl API.  Also, If the API is
> implemented in the Golang Service the response is serviced by the Proxy
> (where it will access the Postgres database as needed).
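The routing behavior described above (serve the endpoints implemented in Go locally, forward everything else to the Perl service) can be sketched with the standard library's reverse proxy. This is an illustrative sketch, not the actual Traffic Ops code: the endpoint path, upstream URL, and response body are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newVampireMux serves endpoints implemented in Go and proxies every
// other path to the legacy Perl/Mojolicious service.
func newVampireMux(perlUpstream string) (*http.ServeMux, error) {
	u, err := url.Parse(perlUpstream)
	if err != nil {
		return nil, err
	}
	mux := http.NewServeMux()
	// Implemented-in-Go endpoint: handled locally (placeholder path/body).
	mux.HandleFunc("/api/1.2/cdns/name/monitoring", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"response":{}}`)
	})
	// Everything else falls through to the Perl upstream on port 60443.
	mux.Handle("/", httputil.NewSingleHostReverseProxy(u))
	return mux, nil
}

func main() {
	mux, err := newVampireMux("https://localhost:60443")
	if err != nil {
		panic(err)
	}
	_ = mux // in the real service: http.ListenAndServeTLS(":443", cert, key, mux)
}
```

Because unimplemented paths are handled by a catch-all proxy handler, endpoints can be migrated one at a time without clients noticing.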
>
> *traffic_ops logs*
> *access.log - *old access.log is now renamed to *perl_access.log*, and the
> Go proxy now takes over the *access.log* while logging in the *exact*
> format
> as before (no monitoring or tooling changes are required)
> *traffic_ops_golang.log* - this is a new file where any errors/debug will
> be logged from the Go proxy.
> *perl_access.log - the existing Mojolicious access.log gets a new name*
> *traffic_ops.log* - existing Mojolicious debug file (no change)
>
> There was a lot of debate and discussion about how to move forward and this
> approach was less impactful to operations (which basically means less work
> to move toward Go).  Over time, the goal is to do a rewrite of all of the
> relevant endpoints that are in Mojolicious into Go (with a heavy focus on
> modularity and unit testing, for a future release with Micro Services).
>
> *What about the Qwilt contributions of API Gateway, Capabilities, and
> Tenancy wasn't that a thing?  *
> Yes, and it still is.  As for the API Gateway we will start "absorbing" the
> code that *Amir* Yeshurun so graciously contributed for the APIGW and
> JWT.  For the API Capabilities that *Naama Shoresh* also generously
> contributed (all capabilities will be directly accessed from those
> capability tables), please see the *Roadmap* below.  And last but not
> least, once we are confident that tenancy works as designed, we will
> begin porting all the tenancy hooks as well as any apis related to tenancy
> (tenant apis, user apis, delivery service apis). All of the features that
> Qwilt built will be used in some form or fashion.
>
> *How do I jump in?*
> We are going to take a "Biggest Bang for our Buck" approach to the TPv2 UI
> and TO Golang API.  The following roadmap is an initial plan of attack (of
> course subject to debate).
>
> *Roadmap*
> - The TO API is READ heavy so all READ endpoints will be rewritten first
> (avoiding Tenancy endpoints, like Users and Delivery Services)
> - Circle back to the CREATE, UPDATE, DELETEs where more Golang foundation
> will have to be developed (avoiding Tenancy endpoints)
> - Implement Capabilities in the new TPv2 to help manage the
> Roles/Capabilities with ease
> - Evaluate Capabilities and design 

Re: Starting the 2.1 Branch for Next Release of TC

2017-08-01 Thread Nir Sopher
+1 from me as well
Thanks:)

On Tue, Aug 1, 2017 at 4:14 PM, Dave Neuman <neu...@apache.org> wrote:

> +1 from me.  Thanks Hank.
>
>
> On Tue, Aug 1, 2017 at 6:58 AM, Hank Beatty <hbea...@gmail.com> wrote:
>
> > Hello All,
> >
> > How about getting all the changes in this week, and I'll cut the 2.1 branch
> > first thing Monday morning (8/7/17 6AM Eastern)?
> >
> > Thanks,
> > Hank
> >
> >
> > On 07/31/2017 03:58 PM, Dave Neuman wrote:
> >
> >> Hey Hank,
> >> Are you still planning on cutting a release sometime this week?
> >> I have a few PRs that I was planning on merging and wanted to see if I
> >> have
> >> some time.
> >>
> >> Thanks,
> >> Dave
> >>
> >> On Tue, Jul 18, 2017 at 2:46 PM, Nir Sopher <n...@qwilt.com> wrote:
> >>
> >> Hi Hank,
> >>> With guidance and review by Jeremy, we are working on the first phase
> of
> >>> tenancy for version 2.1.
> >>> Tenants were already introduced to the TC database, and next to get in
> >>> are
> >>> the tenancy based access control enforcement - for users as well as
> >>> delivery services.
> >>> We expect it to be fully in master within 2 weeks.
> >>> Thanks,
> >>> Nir
> >>>
> >>>
> >>> On Tue, Jul 18, 2017 at 3:33 PM, Hank Beatty <hbea...@apache.org>
> wrote:
> >>>
> >>> Good Morning,
> >>>>
> >>>> I am very excited to be the Release Manager for the 2.1 version of TC.
> >>>>
> >>>> We are getting ready to start the 2.1 branch of TC. We would like to
> do
> >>>> this in the next 2 weeks.
> >>>>
> >>>> Are there any know issues that would prevent this from happening?
> >>>>
> >>>> Are there any features that can be wrapped up and go into this
> version?
> >>>>
> >>>> Any other comments or suggestions?
> >>>>
> >>>> Thanks,
> >>>> Hank
> >>>>
> >>>>
> >>>
> >>
>


Re: Delivery Service based config generation and Cache Manager

2017-07-26 Thread Nir Sopher
Hi Derek,

As discussed in the summit, we also see significant value in

   1. DS Deployment Granularity - using per-DS individual config files.
   2. Delivery Service Configuration Versioning (DSCV) - separating the
   "provisioning" from the "deployment".
   3. Improving the roll-out procedure, combining capabilities #1 & #2

We are on the same page with these needs:)

However, as I see it, #1 & #2 are two separate features, each with
different requirements.
For example, for DSCV, I would suggest managing the versions as standard
rows in the Delivery-Service table, side by side with the "hot" DS
configuration.
This will allow the existing code (with minor adjustments) to properly work
on these rows.
Furthermore, it also allows you to simply "restore" the DS "hot"
configuration to a specified revision.
It is also more resilient to DS table schema updates.
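The row-per-revision idea can be sketched as follows. The struct and field names here are hypothetical (the real Traffic Ops schema differs); the point is that each revision lives as its own row, one of which is marked "hot", so a rollback is just re-pointing the active flag to an older revision.

```go
package main

import "fmt"

// DSRevision models one row of a hypothetical versioned deliveryservice
// table; the actual Traffic Ops schema is different.
type DSRevision struct {
	XMLID    string // delivery service identifier
	Revision int
	Active   bool // true for the "hot" configuration
}

// rollback marks the given revision active and deactivates the rest,
// returning false if the requested revision does not exist for the DS.
func rollback(rows []DSRevision, xmlID string, rev int) bool {
	found := false
	for i := range rows {
		if rows[i].XMLID != xmlID {
			continue
		}
		rows[i].Active = rows[i].Revision == rev
		if rows[i].Active {
			found = true
		}
	}
	return found
}

func main() {
	rows := []DSRevision{
		{"my-ds", 1, false},
		{"my-ds", 2, true}, // currently deployed
		{"my-ds", 3, false},
	}
	fmt.Println(rollback(rows, "my-ds", 1)) // restore revision 1
	fmt.Println(rows[0].Active, rows[1].Active)
}
```

Because the existing code already iterates DS rows, selecting only `Active` rows keeps it working with minor adjustments, which is the resilience argument above.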

I'll soon share, on another thread, a link to a "DSCV functional spec" I
was working on. It extends the presentation we had at the summit.
I would appreciate any inputs to this spec.

Nir

On Tue, Jul 25, 2017 at 10:13 PM, Gelinas, Derek 
wrote:

> At the summit, there was some talk about changing the manner in which we
> generate configuration files.  The early stages of this idea had me
> creating large CDN definition files, but in the course of our discussion it
> became clear that we would be better served by creating delivery service
> configuration files instead.  This would shift us from a server-generated
> implementation, as we have now, to generating the configuration files for
> the caches locally.  The data for this would come from a new API that would
> provide the delivery service definitions in json format.
>
> What I’m envisioning is creating delivery service “snapshots” which are
> saved to the database as json objects.  These snapshots would have the full
> range of information specific to the delivery service, including the new DS
> profiles.  The database would store up to five of these objects per DS, and
> one DS object would be set to “active” through the UI or API.
>
> In this way, we could create multiple versions of a delivery service, or
> safely modify the definition currently “live” (but not necessarily active)
> in the database without changing the configuration in the field.
> Configuration would only be changed when the DS was saved and then that
> saved version was set to become active.  In the reverse manner, existing
> saved delivery services could be restored to the live DB for modification.
>
> By divorcing the “live” db from the active configuration we prevent the
> possibility of accidental edits affecting the field, or edits-in-progress
> from being sent out prematurely when one person is working on a delivery
> service and another is queueing updates.
>
> Once set, it would be this active delivery service definition that would
> be provided to the rest of traffic ops for any delivery service
> operations.  For config file generation, new API endpoints would be created
> that do the following:
>
> - List the delivery services and the active versions of each assigned to
> the specific server.
> - Provide the json object from the database when requested - I’m thinking
> that the endpoint would send the current active by default, or a specific
> version if specified.
>
> These definitions would be absurdly cacheable - we would not need to worry
> about sending stale data because each new version would have a completely
> different name - and so could be generated once and sent to thousands of
> caches with greatly reduced load on traffic ops.  The load would consist of
> the initial creation of the json object, and the minimal serving of that
> object, so this would still result in greatly reduced load on the traffic
> ops host(s) even without the use of caching.  Because of this, the new
> cache management service could check with traffic ops multiple times per
> minute for updates.  Once a delivery service was changed, the new json
> would be downloaded and configs generated on the cache itself.
>
> Other benefits of the use of a cache manager service rather than the ORT
> script include:
>
> - Decreased load from logins - once the cache has logged in, it could use
> the cookie from the previous session and only re-login when that cookie has
> expired.  we could also explore the use of certificates or keys instead,
> and eliminate logins altogether.
> - Multiple checks per minute rather than every X minutes - faster checks,
> more agile CDN.
> - Service could provide regular status updates to traffic ops, giving us
> the ability to keep an eye out for drastic shifts in i/o, unwanted
> behavior, problems with the ATS service, etc.  This leads to building a
> traffic ops that can adapt itself on the fly to changing conditions and
> 

Re: Removing headers from some traffic ops config files

2017-06-16 Thread Nir Sopher
Yes - an additional field for each parameter saying it is a "file-property"
and not a "standard-parameter".
Then you can simply filter out the properties already in the "select" phase
for most parameter iterations.

On Fri, Jun 16, 2017 at 1:33 AM, Gelinas, Derek <derek_geli...@comcast.com>
wrote:

> So you're suggesting we create boolean parameters as part of this?  It's
> an interesting notion and I can think of several places in the code where
> that would be simpler.
>
>
>
> > On Jun 15, 2017, at 2:02 PM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > +1 for Chris suggestion.
> > Additionally, can we somehow differentiate "real parameters" from "file
> > properties" (like the "no_header" field). For example with one of the
> below:
> >
> >   1. Using a column in the parameters table marking the "properties"
> >   fields (bool or "type" enum)
> >   2. A variable name convention (e.g. a "TC_CONFIG_FILE_PROPERTY:" prefix
> >   before the parameter name).
> >   3. A file name convention (e.g. "TC_PROPERTY:" prefix before the file
> >   name)
> >
> > It may clarify things to the operator, as well as simplify the
> > code changes needed in some of the parameter iterations
> > (identifying the "no_header" parameter and skipping it). In the future, if we
> > have more such properties (e.g. "format"), this area in the code will
> > already be covered.
> >
> > Nir
> >
> > On Thu, Jun 15, 2017 at 7:25 PM, Chris Lemmons <alfic...@gmail.com>
> wrote:
> >
> >> What if you did something like this?
> >>
> >> Parameter name: "header"
> >> Config file name: "test_file.config"
> >> Value: "none"
> >>
> >> And then, if you wanted to explicitly change it, you could use:
> >>
> >> Parameter name: "header"
> >> Config file name: "test_file.config"
> >> Value: "default"
> >>
> >> The default, ofc, would be "default", since there's no compelling
> reason to
> >> go back and add the parameter everywhere automatically. This makes it a
> >> little clearer what's going on, and provides a bit of flexibility in the
> >> future, should we decide that we need a specialized header for a
> different
> >> set of files.
> >>
> >> On Tue, Jun 13, 2017 at 2:26 PM, Gelinas, Derek <
> derek_geli...@comcast.com
> >>>
> >> wrote:
> >>
> >>> We've come across a use case in which we need to create a "take and
> bake"
> >>> file in traffic ops which cannot have the usual headers automatically
> >> added
> >>> to configuration files.  Rather than hard-code a specific file type
> that
> >>> should not have these headers into the code, I'm thinking about adding a
> >>> filter during the take and bake process such that, if a parameter named
> >>> "no_header" is encountered for that config file, the header will not be
> >>> included.
> >>>
> >>> In practice, a file without headers would have the following
> parameters:
> >>>
> >>> Parameter name: location
> >>> Config file name: test_file.config
> >>> Value: /opt/trafficserver/etc/trafficserver
> >>>
> >>> Parameter name: data
> >>> Config file name: test_file.config
> >>> Value: rm -rf /
> >>>
> >>> Parameter name: no_header
> >>> Config file name: test_file.config
> >>> Value: [value ignored]
> >>>
> >>> Any thoughts on this?
> >>>
> >>> Derek
> >>
>
>


Re: [VOTE] Move Traffic Control to full GitHub

2017-05-19 Thread Nir Sopher
+1
As I believe it may improve collaboration and PR workflows.

On May 19, 2017 2:58 AM, "Jeremy Mitchell"  wrote:

> +1
>
> On Thu, May 18, 2017 at 3:35 PM, Dave Neuman  wrote:
>
> > +1, thanks for putting the vote up Jan
> >
> > On Thu, May 18, 2017 at 4:57 PM, Steve Malenfant 
> > wrote:
> >
> > > +1
> > >
> >
>


Re: Access Control - Limiting Roles / Capabilities Tenant Admins can Assign to Users

2017-05-05 Thread Nir Sopher
Hi,

Maybe I'm misunderstanding something here, but in the long term, when
tenancy is fully implemented on all items in the system, I do not see any
real issue with handling the capability to assign a role to a user like any
other capability.
As I see it, there is a guy in the organization who has the role allowing
him to add users and assign roles to them. He does not have the capability to
change a DS, but I trust him to give the relevant role to the users that are
allowed to do so. What is wrong with that?
If this user belongs to a CP, I would not like him to manipulate servers in
the CDN, but that is resolved by tenancy, not capabilities. I do not care if
a CP user has the capability to define a server.
This user can further create tenant-specific roles and capabilities that
he can manage (unlike the built-in capabilities and roles, which should be
read-only for everyone).

What am I missing?
Nir

On May 5, 2017 6:34 PM, "Jeremy Mitchell"  wrote:

> @Eric - I don't think we want to take the access control granularity below
> the API level. That would make things pretty messy imo and I think this new
> roles/capabilities model might be enough to digest as it is.
>
> so basically, if you have a role that has a capability that maps to the
> POST /api/version/users api endpoint (create user), then you can create
> users. But of course, new users need a role. I think we should just
> leverage what we have in place now - priv_level.
>
> So, basically, if I have the ability to create users, I can only create
> users with a role that has a priv_level <= my role's priv level.
>
> I don't know if I want to add another hierarchy (role hierarchy)... the
> fewer hierarchies the better :)
>
> Jeremy
>
> On Thu, May 4, 2017 at 6:39 AM, Eric Friedrich (efriedri) <
> efrie...@cisco.com> wrote:
>
> > Could we further differentiate the user creation capabilities to:
> > - Create CDN Admin user
> > - Create CDN Ops user
> > - Create CDN Viewer user
> > - Create Tenant Admin user
> > - Create Tenant Ops user
> > - Create Tenant Viewer user
> >
> > Then only the CDN-Admin role would have the capability to create a cdn
> > admin user. Would be good to see the capabilities assigned at a
> granularity
> > below API endpoint in this case.
> >
> > As for creation of new roles, I like #2 and #3. Users should not be able
> > to level-up anyone’s capabilities beyond their own. Further, capabilities
> > are enforced by code, so we should not allow creation of new capabilities
> > by API
> >
> > - - Eric
> >
> >
> >
> > On May 3, 2017, at 9:44 AM, Durfey, Ryan <ryan_dur...@comcast.com> wrote:
> >
> > Moving this active debate into the mailing list.
> > -Jeremy makes a good point.  We need a method for restricting the roles
> > and capabilities that lower tier staff who can create new users may
> > assign.  Jeremy
> > has suggested a point system or a hierarchy.  I think either of these
> would
> > work if applied correctly.   I am open to any approach that works.
> >
> > My thoughts:
> > 1. We need to limit which users can build new roles from capabilities or
> > new capabilities from APIs.  This could be limited to a master role like
> > “CDN Admin”.  Otherwise other admins could circumvent the system by
> > matching APIs to lower tier roles.
> > 2. Another simple approach may be to only allow non-CDN Admins to assign
> > roles to users which they have access.  Basically you can’t give anyone
> > more rights than you have.
> > 3. Perhaps with this approach we allow non-CDN Admins to build roles from
> > existing capabilities to which they have access, but not create
> > capabilities from APIs.  Then they can build new roles and assign any
> > capabilities or roles to which they already have access.
> >
> >
> >
> > From: Jeremy Mitchell
> >
> > I like this model of a user has a role which has capabilities which map
> to
> > API endpoints, however, there seems to be one flaw or at least one
> > unaccounted for use case.
> > Let's look at the roles listed above:
> >
> >   *   CDN-Admin
> >   *   CDN-Ops
> >   *   CDN-Viewer
> >   *   Tenant-Admin
> >   *   Tenant-Ops
> >   *   Tenant-Viewer
> >
> > Jeremy is a CDN-Admin which has the user-create capability (among others)
> > so he creates Bob, a Tenant-Admin. Being a Tenant-Admin, Bob also has
> > user-create so he creates Sally and he can give her ANY role so he
> decides
> > to give Sally the CDN-Admin role... whoops, we don't want that...
> > Bob should be limited to creating users with role=Tenant-Admin (his
> role),
> > Tenant-Ops or Tenant-Viewer...but how do we correlate one role with
> > another? Currently, we have "privilege level" attached to a role. So I
> > guess we could use that like so:
> >
> >   *   CDN-Admin (100)
> >   *   CDN-Ops (50)
> >   *   CDN-Viewer (40)
> >   *   Tenant-Admin (30)
> >   *   Tenant-Ops (20)
> >   *   Tenant-Viewer (10)
> >
> > Now, being a Tenant-Admin with the user-create capability, Bob can only
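The priv_level comparison described above can be sketched as follows. The struct is illustrative (the real role tables differ), and the numeric levels are the examples from the thread:

```go
package main

import "fmt"

// Role pairs a role name with its privilege level, mirroring the
// priv_level idea above (values taken from the thread's example).
type Role struct {
	Name      string
	PrivLevel int
}

// canAssign reports whether a creator may grant the target role:
// only roles at or below the creator's own priv_level are allowed.
func canAssign(creator, target Role) bool {
	return target.PrivLevel <= creator.PrivLevel
}

func main() {
	tenantAdmin := Role{"Tenant-Admin", 30}
	fmt.Println(canAssign(tenantAdmin, Role{"Tenant-Viewer", 10})) // true
	fmt.Println(canAssign(tenantAdmin, Role{"CDN-Admin", 100}))    // false
}
```

Under this rule Bob, a Tenant-Admin (30), can create Tenant-Ops (20) and Tenant-Viewer (10) users, but any CDN-* role (40+) is out of his reach.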
> 

Re: Delivery-Service Configuration Versioning

2017-05-04 Thread Nir Sopher
Thanks Ryan & Eric for the feedback.
Answers inline.
Thanks again,
Nir


On Thu, May 4, 2017 at 3:59 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> Thanks Nir-
> Comments inline
> > On May 1, 2017, at 1:12 PM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Dear all,
> >
> > Planning the efforts toward "self-service", we are considering
> > "delivery-service configuration versioning" (DSCV) as one of our next
> > steps.
> > In a very high level, by DSCV we refer to the ability to hold multiple
> > configuration versions/revisions per delivery-service, and actively
> choose
> > which version should be deployed.
> >
> > A significant portion of the value we would like to bring when working
> > toward "self-service" can be achieved using the initial step of
> > configuration versioning:
> >
> >   1. As the amount of delivery-services handled by TC is increasing,
> >   denying the "non dev-ops" user from changing delivery-services
> >   configuration by himself, and require a "dev-ops" user to actually
> make the
> >   changes in the DB, put an increasing load on the operations team.
> >   Via DSCV the operator may allow the users to really push configurations
> >   into the DB, as it separates the provisioning phase from the
> deployment.
> >   Once committed, the CDN's "dev-ops" user is able to examine the changes
> >   and choose which version should be deployed, subject to the operator's
> >   acceptance policy.
> EF> How do we get from DSCV to the ultimate self-service goals where the
> CDN operator is no longer in the critical path for deploying DS changes?
>
NS> Indeed Eric, at this stage the deployment itself is still in the hands
of the operator: The operator has to change the deployed DS version, Queue
Update and Cr-Config snapshot.
Allowing the DS owner to "change the DS version to deploy" is a "process"
issue, as it should be subject to each operator's change-acceptance
policy. We will need to model it and add a flexible, probably
plugin-supporting, building block in the future.
Allowing the DS owner to actually deploy the changes (Cr-Config snapshot
and Queue-Update / future push mechanism) cannot be done as long as the
different DSs' configurations are coupled in the same files (remap.config &
cr-config) and processes.
Once decoupled, the deploy operations can be done on a DS granularity and
therefore the operator can delegate the control to the "DS owner"


> >   2. DSCV brings improved auditing and troubleshooting capabilities,
> which
> >   is important for supporting TC deployment growth, as well as allow
> users to
> >   be more independent.
> >   It allows to investigate issues using versions associated log records,
> >   as well as the data in the DB itself: Examining the delivery-service
> >   versions, their meta data (e.g. "deployed dates") as well as use tools
> for
> >   versions comparisons.
> >   3. DSCV allows a simple delivery service configuration rollback, which
> >   provides a quick remedy for configuration errors issues.
> >
> > Moreover, we suggest to allow the deployment of multiple versions of the
> > same delivery service simultaneously, on the same caches. Doing so, and
> > allowing the operator to orchestrate the usage of the different
> > versions (for example, via "steering"), the below become available:
> EF> This feature will extend to both caches and TR, right? Lots of
> DS-specific policy is evaluated by the TR.
>
NS> There are a few ways to implement the feature, but as I currently see it
there is no real need to change the data-plane to support this feature.
Please let me know if you think I'm missing something here.
One option is to deploy the different versions of the same delivery service
as if they are entirely different delivery-services. With "ids" and
"host-regexes" which include the version. No changes in Cr-Config
structure, remap.config, etc. Therefore, all changes need to be done are in
traffic-ops, simulating the different version towards the rest of the
system.
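The first option can be sketched roughly as follows. The `-v<N>` suffix and the shape of the host regex are purely illustrative assumptions, not an actual Traffic Ops naming convention:

```python
# Sketch: derive per-version delivery-service ids and host regexes from a
# base DS, so each version looks like a standalone DS to the rest of the
# system. The "-v<N>" suffix convention is hypothetical.

def versioned_ds(base_xml_id: str, version: int) -> dict:
    """Build the identifiers a versioned copy of a DS would carry."""
    xml_id = f"{base_xml_id}-v{version}"
    # A host regex of the rough shape routers match against, with the
    # version embedded so each version gets its own routing name.
    host_regex = rf".*\.{base_xml_id}-v{version}\..*"
    return {"xml_id": xml_id, "host_regex": host_regex}

print(versioned_ds("movies", 7))
# {'xml_id': 'movies-v7', 'host_regex': '.*\\.movies-v7\\..*'}
```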
Let's now improve this solution.
The first improvement is giving the different components the ability to
understand the concept of "DS and version": for example, adding the
version field to the cr-config and adjusting the different components.
This may allow traffic-stats to show reports about the different
DS versions separately, as well as aggregated per DS. More changes will
probably be required, but as far as I can currently see, these changes are
all in the control plane, not the data plane. Caches are effectively
unaware

Delivery-Service Configuration Versioning

2017-05-01 Thread Nir Sopher
Dear all,

Planning the efforts toward "self-service", we are considering
"delivery-service configuration versioning" (DSCV) as one of our next
steps.
At a very high level, by DSCV we refer to the ability to hold multiple
configuration versions/revisions per delivery service, and to actively
choose which version should be deployed.

A significant portion of the value we would like to bring when working
toward "self-service" can be achieved using the initial step of
configuration versioning:

   1. As the number of delivery services handled by TC increases,
   preventing the "non dev-ops" user from changing delivery-service
   configuration himself, and requiring a "dev-ops" user to actually make
   the changes in the DB, puts an increasing load on the operations team.
   Via DSCV the operator may allow users to really push configurations
   into the DB, as it separates the provisioning phase from the deployment.
   Once committed, the CDN's "dev-ops" user is able to examine the changes
   and choose which version should be deployed, subject to the operator's
   acceptance policy.
   2. DSCV brings improved auditing and troubleshooting capabilities, which
   is important for supporting TC deployment growth, and also allows users
   to be more independent.
   It allows investigating issues using version-associated log records,
   as well as the data in the DB itself: examining the delivery-service
   versions and their metadata (e.g. "deployed dates"), and using tools for
   version comparison.
   3. DSCV allows a simple delivery-service configuration rollback, which
   provides a quick remedy for configuration errors.
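The versioning model behind points 1-3 can be sketched as an append-only revision list per DS with an explicit "deployed" pointer that can be moved back for rollback. All names here are illustrative, not the actual Traffic Ops schema:

```python
# Sketch: per-DS configuration revisions with deploy/rollback.
# Hypothetical structure -- not the real Traffic Ops data model.

class DsRevisions:
    def __init__(self):
        self.revisions = []   # append-only list of config dicts
        self.deployed = None  # index of the currently deployed revision

    def commit(self, config: dict) -> int:
        """A user pushes a new revision; nothing is deployed yet."""
        self.revisions.append(config)
        return len(self.revisions) - 1

    def deploy(self, version: int):
        """A dev-ops user chooses which committed version goes live."""
        self.deployed = version

    def rollback(self):
        """Point 3: move the deployed pointer back one revision."""
        if self.deployed is not None and self.deployed > 0:
            self.deployed -= 1

ds = DsRevisions()
ds.commit({"ttl": 60})        # revision 0
v1 = ds.commit({"ttl": 30})   # revision 1
ds.deploy(v1)
ds.rollback()
print(ds.deployed)  # 0 -> back to the first revision
```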

Moreover, we suggest allowing the deployment of multiple versions of the
same delivery service simultaneously, on the same caches. Doing so, and
allowing the operator to orchestrate the usage of the different
versions (for example, via "steering"), makes the following available:

   1. Manual testing of a new delivery-service configuration, via a
   dedicated URL or using request headers.
   2. Staging / canary testing of new versions, applying them only for a
   specific content path, or filtering based on source IP.
   3. Gradual transition between the different configuration versions.
   4. Configuration-version A/B testing (assuming the reporting/stats also
   become "version aware").
   5. Immediate (no CRON wait, cr-config change only) delivery-service
   version "switch", and specifically immediate rollback capabilities.

Note that, engineering-wise, one may consider DSCV a building block for
other "self-service" steps. It allows the system to identify which
configuration is deployed on which server, and allows the servers to
identify configuration changes at DS granularity. Therefore, it can help
decouple the individual delivery-service deployments and reduce the load
derived from the caches' update process.
We would greatly appreciate community input on the subject.

Many thanks,
Nir


Re: Configuration Management - Future State Design for Self Service

2017-05-01 Thread Nir Sopher
Hi all,

First of all +1 for the described efforts. The delivery service deployment
time is an issue that bothers us as well.

I personally believe that the decoupling of server configs and individual
service configs is an important step towards "self-service", where
ultimately the flow of operations on an individual DS is completely
decoupled from that of other DSs. Beyond the values already listed, I also
see the value of improved troubleshooting.

As I see it, much of the value provided by the configuration decoupling
can be achieved by working with "delivery-service configuration versioning".
It allows testing of individual DSs, DS-level apply granularity, better
auditing and troubleshooting, and even a "no CRON wait" rollback.
Furthermore, engineering-wise, one may consider "delivery-service
configuration versioning" a building block on the way to service
decoupling and efficient deployment. It allows the system to simply
determine what is deployed on which cache, and the cache to identify
changes at DS granularity.
I'll open a separate thread on this feature, and we will probably discuss
it at the summit.

Nir

On Apr 28, 2017 5:43 PM, "Durfey, Ryan"  wrote:

> Great feedback Eric. My response is below.
>
> Ryan DurfeyM | 303-524-5099
> 24x7 CDN Support: 866-405-2993 or cdn_supp...@comcast.com
>
>
> On 4/28/17, 8:18 AM, "Eric Friedrich (efriedri)" 
> wrote:
>
>
> > On Apr 27, 2017, at 12:19 PM, Durfey, Ryan 
> wrote:
> >
> > As we move into Self-Service discussions, Configuration Management
> needs to be discussed in greater depth.  I wanted to get the conversation
> started and get feedback from everyone on what the future state should look
> like and how we get there from our current state.
> >
> > TO 2.1 Changes In Progress
> > Several changes are underway with TO 2.1 that will make a
> significant impact to current challenges. Derek G., keep me honest.
> >
> >  1.  Invalidation is being decoupled from configuration updates
> > *   Allows for invalidations on 1 min CRON jobs which should
> take approximately 3 min total to apply to mid and edge tier in succession.
> > *   Allows configs to be applied to mid tier and edge tier
> simultaneously reducing time by half.
> >  2.  Reduction in the overall number of config files generated per
> configuration update
> > *   Allows configs to be applied on tighter than the current 15
> min CRON intervals, though not sure how rapidly yet.  Will require testing.
> > This still leaves us with a monolithic configuration management
> process which contains all Server and Service configs.  This presents some
> challenges to individual Self-Service.
> >
> > Future State (v3?) Ideas
> >
> >  1.  Separate out Server configs from Service configs so they can be:
> > *   Tested separately
> > *   Applied separately
> > *   Logged & reported separately
> > *   Reverted separately
> > *   Non-blocking when issues are encountered.
> >  2.  Separate out individual Service configs  for same reasons as
> above.
> >  3.  Allow for instant push of new configs, no CRON wait time.
> EF> We have to be careful with Pushes. Depending on how its
> implemented, this may open up a new API (and a new attack surface) on
> publicly accessible caches.
>
> Any reason to believe a 1 minute cronjob would not be fast enough?
>
> RD: I think instant push is the ideal especially if you need to roll
> something back that is broken.  But the reality is that a 1 min cronjob
> would be pretty close and I wouldn’t have issues with that.  I think we
> attempt to articulate the ideal situation and any time we can get an 80/20
> solution that gets us close with significantly less headaches we are very
> happy with that.
>
>
> >  4.  Allow for Service builds and changes to be staged for initial
> testing prior to production roll out.
> >  5.  Allow for rollback of config changes in both staging and
> production environments.
> >  6.  Log all Service changes so that a Tenant User can pull back a
> history of all changes related to their services through the API.
> >
> >
> > Ryan Durfey
> > Sr. Product Manager - CDN | Comcast Technology Solutions
> > 1899 Wynkoop Ste. 550 | Denver, CO 8020
> > M | 303-524-5099 <(303)%20524-5099>
> > ryan_dur...@comcast.com
> > 24x7 CDN Support: 866-405-2993  or cdn_supp...@comcast.com cdn_supp...@comcast.com>
> >
>
>
>
>
>


Re: VPN / DNS issues?

2017-04-30 Thread Nir Sopher
Wrong "dev" mailing list :)
Please ignore.
Nir


On Mon, May 1, 2017 at 8:26 AM, Nir Sopher <n...@qwilt.com> wrote:

> Hi,
> I'm connected via VPN, but not able to ping P4 / NX server / rd.
> Anyone else with similar issues?
> Nir
>


VPN / DNS issues?

2017-04-30 Thread Nir Sopher
Hi,
I'm connected via VPN, but not able to ping P4 / NX server / rd.
Anyone else with similar issues?
Nir


Re: Traffic Ops: Tenancy - User Access to Multiple Tenants

2017-04-24 Thread Nir Sopher
Thanks Jeremy,
I totally agree with the objections for the first phase.
Just one comment - in the future we might also consider making object
tenancy mandatory. A user who is not interested in tenancy may just put
everything under the same tenant.
Nir
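The hierarchy-based scoping described in the proposal quoted below (a user sees a DS when the DS's tenant is unset, or is the user's tenant or one of its descendants) can be sketched as a simple parent-chain walk. The tenant names and the parent map are taken from the example tree; the code itself is illustrative, not the Traffic Ops implementation:

```python
# Sketch: hierarchical tenant visibility ("tenant + children"), illustrative only.

# parent map: child tenant -> parent tenant (None for root)
PARENTS = {
    "root": None,
    "disney": "root",
    "espn": "disney",
    "abc": "disney",
    "abc-family": "abc",
}

def visible(user_tenant, ds_tenant):
    """A DS with no tenant is visible to everyone; otherwise the DS's
    tenant must be the user's tenant or one of its descendants."""
    if ds_tenant is None:
        return True
    t = ds_tenant
    while t is not None:
        if t == user_tenant:
            return True
        t = PARENTS.get(t)
    return False

print(visible("disney", "abc-family"))  # True: abc-family is under disney
print(visible("abc", "espn"))           # False: espn is a sibling branch
```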

On Sat, Apr 22, 2017 at 12:51 AM, Jeremy Mitchell <mitchell...@gmail.com>
wrote:

> My hope is that we can roll out "tenancy" in small, digestible, optional
> pieces to minimize impact. Here's how i see that rollout going. I'll
> probably skip over some details in my attempt to keep this short.
>
> Phase 1 - Providing the ability to lock down (scope) delivery services
>
> Step #1 - support tenant CRUD
>
> -- a user w/ the proper permissions can CRUD tenants. remember, tenants
> have a required field - parentTenant - so you can really only create
> tenants inside the scope of your tenant :)
> -- i kind of envision an admin user with the "root" tenant will take a
> first pass at creating tenants (see example tenant tree below)
>
> Step #2 - support user->tenant assignment
>
> -- each user can be assigned 0 or 1 tenant
> -- i.e. user 1 will assign user 2 a tenant (but remember user 1 only has
> access to their tenant + children so user 2 can only be assigned a tenant
> from user 1's tenant hierarchy)
>
> Step #3 - support ds->tenant assignment
>
> -- each ds can be assigned 0 or 1 tenant
> -- user 1 will assign ds 1 a tenant (but remember user 1 only has access to
> their tenant + children so ds 1 can only be assigned a tenant from user 1's
> tenant hierarchy)
>
> 
> Once phase 1 is complete, it could play out like this:
>
> 1. An admin user with the root tenant could take a first pass at creating
> tenants and we end up with this tenant hierarchy:
>
> - root (this tenant comes for free and cannot be modified)
> -- Org #1
> --- Disney
>  ESPN
>  ABC
> - ABC Family
> --- Foo entertainment
>  Foo child 1
>  Foo child 2
> -- Org #2
>
> 2. With that tenant hierarchy in place, the admin user could start giving
> users the appropriate tenant
>
> -- all existing "admin" or "operations" users will probably be given the
> root tenant so they can see everything like they do today
> -- bob gets Disney meaning he can only see deliveryservices where
> tenant=null, disney, espn, abc or abc family
> -- cindy gets abc meaning she can only see deliveryservices where
> tenant=null, abc or abc family
> -- phil gets Foo child 1 meaning he can only see deliveryservices where
> tenant=null or Foo child 1
>
> 3. Start locking down delivery services by assigning them a tenant
>
> -- ds1 gets abc family meaning users with tenant=abc family or above can
> see ds1
> -- ds2 gets disney meaning users with tenant=disney or above can see ds2
>
> ***
>
> Remember, this is all optional. If you decide not to create tenants /
> assign users with a tenant / assign ds's with a tenants, then ds's are not
> scoped at all. That might be acceptable for many installations.
>
> However, after phase 1 is complete, you will now have the ability to (or
> not to) lock down delivery services as you please.
>
> Once that is done, I propose we talks about what the next phases are and
> the next set of resources (servers? cachegroups?) we attempt to lock down
> via tenancy.
>
> Thank you for your time :)
>
> Jeremy
>
>
>
>
> On Wed, Apr 19, 2017 at 10:05 AM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi Ryan & All,
> >
> > Following a discussion with co-workers @Qwilt, as well as a earlier
> > discussion with Jeremy and Dewayne, I'll try to outline the main concepts
> > of a solution.
> >
> >
> > *As this email turned out to be long, I'll start with the conclusions:*
> >
> >1. As the first "tenancy introduction" phase includes only
> >"org-tenancy", we may to assume that there is a single ISP using the
> TC
> >instance, and that this ISP tenant is the ancestor of all
> > content-provider
> >tenants.
> >Therefore, tenants hierarchy based decision is a reasonable solution
> for
> >now.
> >2. For the next phases we need to create a module that is able to
> >provide the answer to the question:
> > *"may operation  on resource of type  which belongs to tenant
> > be performed by a user from tenant  ?"*.
> >This module can
> >   1. Be derived from a set of relations held in traffic-ops DB.
> >   Note that it may turn out to be complicated to cover all tenants
> >   relations use-cases a

Re: Proposal for CDN definition file based configuration management

2017-04-15 Thread Nir Sopher
Indeed. This is probably only one of quite a few issues that need to be
dealt with on the way to instant application of DS configuration changes.
Further planning should be done, and baby steps towards this goal should be
presented (as well as the other "self-service" building blocks).
I just mentioned it as additional motivation for avoiding further coupling
of the different delivery services' configurations.

Back to the original discussion: can the proposed versioned definition file
be broken into separate versioned files? A file per DS may be just the
first example of this need. Other "orthogonal" sections can be held
separately as well.

Thanks,
Nir

On Apr 14, 2017 23:38, "David Neuman" <david.neuma...@gmail.com> wrote:

> The discussion around delivery service configs should probably be it's own
> thread, however I am going to contribute to the hijacking of this thread
> anyway.
>
> We need to make sure that we keep the Traffic Router in mind when
> discussing delivery service changes that get applied "instantly" and
> individually.  There are certain attributes of a delivery service that
> affect the Traffic Router and we need to make sure that we don't cause an
> issue by pushing a config to a cache before the Traffic Router has it or
> visa-versa.
>
> On Fri, Apr 14, 2017 at 8:07 AM, Amir Yeshurun <am...@qwilt.com> wrote:
>
> > It seems that with Nir's approach there is no problem to enforce a size
> > limit on historical data
> >
> > On Fri, Apr 14, 2017 at 4:07 PM Eric Friedrich (efriedri) <
> > efrie...@cisco.com> wrote:
> >
> > > I think this sounds good Nir.
> > >
> > > Its not so much the size that is the main concern. Rather, people tend
> to
> > > have strong reactions to “its permanent, it will be there forever”. As
> > long
> > > as we give some way to delete and preferably with a batch mode we
> should
> > be
> > > all set.
> > >
> > > —Eric
> > >
> > > > On Apr 13, 2017, at 3:08 PM, Nir Sopher <n...@qwilt.com> wrote:
> > > >
> > > > Hi Eric,
> > > >
> > > > I thought to start with saving for each configuration the range of
> > dates
> > > it
> > > > was the "head" revision, and the range of dates it was deployed.
> > > > This will allow the operator to remove old versions via designated
> > script
> > > > using criteria like "configuration age", "ds history length" or "was
> it
> > > > deployed". For example "Leave all deployed revisions and up to 100
> non
> > > > deployed revisions".
> > > > I haven't thought of the option to support the marking of
> configuration
> > > > versions as "never delete", but it can surely be added.
> > > >
> > > > I did not intended to create something more sophisticated, and
> believe
> > > that
> > > > the mentioned script will be used only on rare cases that something
> is
> > > > trashing the DB, as the math I did lead me to believe it is a none
> > issue:
> > > > Judging from the kable-town example, a delivery-service configuration
> > > size
> > > > is less than 500B. Lets say the average is *1K *to support future
> > growth.
> > > > Lets also say we support *10K *DSs (which is much much more than any
> TC
> > > > deployment I'm aware of has) and we have *1K *revisions per DS.
> > > > In such a case versioning will use 10GB, which I believe is not an
> > issue
> > > > for postgres to hold (yet, I'm not a postgres expert).
> > > >
> > > > Nir
> > > >
> > > >
> > > > On Thu, Apr 13, 2017 at 3:53 PM, Eric Friedrich (efriedri) <
> > > > efrie...@cisco.com> wrote:
> > > >
> > > >> Hey Nir-
> > > >>  If we keep all DS versions in the DB, are there any concerns about
> > the
> > > >> amount of data retained? I know Delivery Services don’t change very
> > > often,
> > > >> but over time do we really need to keep the last 1000 revisions of a
> > > >> delivery service?
> > > >>
> > > >> Its more of an implementation detail, but I think it would be useful
> > to
> > > >> give some control over version retention policies (i.e. keep last n
> > > based
> > > >> on quantity or dates, mark some as “never delete”)
> > > >>
> > > >> More in

Re: Proposal for CDN definition file based configuration management

2017-04-13 Thread Nir Sopher
Hi Eric,

I thought to start with saving, for each configuration, the range of dates
it was the "head" revision and the range of dates it was deployed.
This will allow the operator to remove old versions via a designated script
using criteria like "configuration age", "DS history length" or "was it
deployed". For example: "Leave all deployed revisions and up to 100
non-deployed revisions".
I haven't thought of the option to support marking configuration versions
as "never delete", but it can surely be added.

I did not intend to create something more sophisticated, and believe that
the mentioned script will be used only in rare cases where something is
trashing the DB, as the math I did leads me to believe it is a non-issue:
judging from the kabletown example, a delivery-service configuration size
is less than 500B. Let's say the average is *1K* to support future growth.
Let's also say we support *10K* DSs (which is much, much more than any TC
deployment I'm aware of has) and we have *1K* revisions per DS.
In such a case versioning will use 10GB, which I believe is not an issue
for postgres to hold (yet, I'm not a postgres expert).
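As a quick check, the arithmetic above can be reproduced directly (sizes taken straight from the estimate in this message):

```python
# Sanity-check the storage estimate: 1 KB/revision * 10K DSs * 1K revisions.
avg_config_size = 1 * 1024   # assumed average revision size, in bytes
num_ds = 10_000              # assumed delivery-service count
revisions_per_ds = 1_000     # assumed revisions retained per DS

total_bytes = avg_config_size * num_ds * revisions_per_ds
print(total_bytes / 2**30, "GiB")  # roughly the 10GB quoted above
```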

Nir


On Thu, Apr 13, 2017 at 3:53 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> Hey Nir-
>   If we keep all DS versions in the DB, are there any concerns about the
> amount of data retained? I know Delivery Services don’t change very often,
> but over time do we really need to keep the last 1000 revisions of a
> delivery service?
>
> Its more of an implementation detail, but I think it would be useful to
> give some control over version retention policies (i.e. keep last n based
> on quantity or dates, mark some as “never delete”)
>
> More inline
> > On Apr 12, 2017, at 12:53 AM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Thanks Derek for the clarification.
> >
> > So the definition file is a global file for the CDN.
> > Does it contain the information of which server has which DS?
> > Does it hold all CDN's DSs configuration together?
> > On a single DS change, will all servers in the CDN download the entire
> file
> > for every DS change?
> >
> > What I'm practically asking is, if it is not already your intention, "can
> > the definition file hold only the information of which server holds which
> > DS (and configuration version when we add it), and the DS configuration
> be
> > held and pulled separately on a DS level granularity?"
> >
> > When discussing "self-service" we would like to decouple the operations
> of
> > the different users / content providers. Ultimately, when a DS is
> changed,
> > the change should be deployed immediately to the CDN - with no dependency
> > with other DSs, and possibly with "no buffering" by the operator
> deploying
> > batch of DS changes together. This allows to improve the user experience
> > and independence when working on the CDN.
> > Following the change you are suggesting, will the DS configuration
> > deployment coupling get tighter? We prefer not to have the need to
> "finish
> > your work and not start additional work before the queued run has
> > completed".
> EF> Agree. The less steps our users have to take, the happier they are. If
> it was a common workflow to batch a bunch of DS changes and them roll them
> out together, I would probably be a stronger advocate for keeping the queue
> update/snapshot crconfig steps around. In our discussion, that doesn’t seem
> to be often used. Should we consider deprecating those stages and
> immediately (excepting ORT polling interval of course) applying config
> changes when the DS is changed?
>
> >
> > Another requirement is to be able to rollback changes on a DS level and
> not
> > only on a CDN level, as it is not desired to rollback changes of user "A"
> > because of errors of user "B". If I understand correctly, the definition
> > file does not support that.
> EF> I think the definition file can support this- only the rolled-back DS
> would change inside that file. No other users would be affected because
> their DS configs would not change.
>
> >
> > Last, using the definition file, must all servers in a CDN work with the
> > same set of DSs? One of the reasons we consider "DS versioning" is to
> allow
> > deployment of a change in a DS only for a subset of the server for canary
> > testing.
> EF> When I think of canary testing today, my mind first goes to Steering
> DS. With those steering delivery services, do we still need ability to set
> per-cache DS versions?
>
> —Eric
>
>
> &

Re: Proposal for CDN definition file based configuration management

2017-04-11 Thread Nir Sopher
Thanks Derek for the clarification.

So the definition file is a global file for the CDN.
Does it contain the information of which server has which DS?
Does it hold all CDN's DSs configuration together?
On a single DS change, will all servers in the CDN download the entire file
for every DS change?

What I'm practically asking is, if it is not already your intention, "can
the definition file hold only the information of which server holds which
DS (and configuration version when we add it), and the DS configuration be
held and pulled separately on a DS level granularity?"

When discussing "self-service" we would like to decouple the operations of
the different users / content providers. Ultimately, when a DS is changed,
the change should be deployed immediately to the CDN - with no dependency
on other DSs, and possibly with "no buffering" by the operator deploying
batches of DS changes together. This allows improving the user experience
and independence when working on the CDN.
Following the change you are suggesting, will the DS configuration
deployment coupling get tighter? We prefer not to have the need to "finish
your work and not start additional work before the queued run has
completed".

Another requirement is to be able to roll back changes at a DS level and
not only at a CDN level, as it is not desired to roll back changes of user
"A" because of errors of user "B". If I understand correctly, the
definition file does not support that.

Last, using the definition file, must all servers in a CDN work with the
same set of DSs? One of the reasons we consider "DS versioning" is to allow
deployment of a change in a DS only to a subset of the servers, for canary
testing.

Thanks,
Nir



On Wed, Apr 12, 2017 at 3:00 AM, Dewayne Richardson 
wrote:

> +1 I was just about to formulate that response.  The "dev" list is our
> discussion forum.
>
> On Tue, Apr 11, 2017 at 9:35 AM, Dave Neuman  wrote:
>
> > @Ryan, I think its better to have conversations on the dev list than a
> wiki
> > page...
> >
> > On Tue, Apr 11, 2017 at 9:01 AM, Durfey, Ryan 
> > wrote:
> >
> > > Started a new wiki page to discuss this here https://cwiki.apache.org/
> > > confluence/display/TC/Configuration+Management
> > >
> > > I will do my best to summarize the discussion below later today.
> > >
> > > Ryan M | 303-524-5099
> > >
> > >
> > > -Original Message-
> > > From: Eric Friedrich (efriedri) [mailto:efrie...@cisco.com]
> > > Sent: Tuesday, April 11, 2017 8:55 AM
> > > To: dev@trafficcontrol.incubator.apache.org
> > > Subject: Re: Proposal for CDN definition file based configuration
> > > management
> > >
> > > A few questions/thoughts, apologies for not in-lining:
> > >
> > > 1) If we move away from individually queued updates, we give up the
> > > ability to make changes and then selectively deploy them. How often do
> TC
> > > operations teams make config changes but do not immediately queue
> > updates.
> > > (I personally think that we currently have a bit of a tricky situation
> > > where queuing updates much later can push down an unknowingly large
> > config
> > > change to a cache- i.e. many new DS added/removed since last time
> updates
> > > were queued maybe months earlier). I wouldn't be sad to see queue
> updates
> > > go away, but don't want to cause hardship on operators using that
> > feature.
> > >
> > > 2) If we move away from individually queued updates, how does that
> affect
> > > the implicit "config state machine"? Specifically, how will edges know
> > when
> > > their parents have been configured and are ready for service? Today we
> > > don't config an edge cache with a new DS unless the mid is ready to
> > handle
> > > traffic as well.
> > >
> > > 3) If we move away from individually queued updates, how do we do
> things
> > > like unassign a delivery service from a cache? Today we have to
> snapshot
> > > CRConfig first to stop redirects to the cache before we queue the
> update.
> > > If updates are immediately applied and snapshot is still separate, how
> do
> > > we get TR to stop sending traffic to a cache that no longer has the
> remap
> > > rule?
> > >
> > > 4) Also along the lines of the config state machine, we never really
> > > closed on if we would make any changes to the queue update/snapshot
> > > CRConfig flow. If we are looking at redoing how we generate config
> files,
> > > it would be great to have consensus on an approach (if not an
> > > implementation) to remove the need to sequence queue updates and
> snapshot
> > > CRConfig. I think the requirement here would be to have Traffic Control
> > > figure out on its own when to activate/deactivate routing to a cache
> from
> > > TR.
> > >
> > > 5) I like the suggestion of cache-based config file generation.
> > >   - Caches only retrieve relevant information, so scale proportional to
> > > number of caches/DSs in the CDN is much better
> > >   - We could modify TR/TM to 

Re: Proposal for CDN definition file based configuration management

2017-04-11 Thread Nir Sopher
Hi,

In the discussion below, I'm leaving aside the "server's profile" scope of
the server configuration and focusing on delivery services. Personally, I
believe a clear decoupling should be made between the two scopes (and if I
understand correctly from another thread, there are already steps in this
direction).

This issue relates to a few of the issues we are trying to think of when
discussing "self-service" (allowing a non-DevOps user to independently
manage his relevant delivery services), and specifically "delivery-service
versioning".
When discussing DS versioning, we basically suggest keeping, for each DS,
all the configuration revisions, and when applying a delivery service to a
server, to practically apply a specific version of the delivery service to
the server. This allows simple rollback for a specific delivery service,
better auditing of DS configuration changes and deployment, and DS-change
testing on a subset of servers.

Assuming we hold versions of the delivery-service configurations, should
the file you propose really hold the entire configuration for the server,
or just the list of "delivery services" + "versions" for the server?
If we go only for a list of DS + "Cfg Version", the server needs to get
from TO the list of DSs (using the suggested file or via the
"server/:id/deliveryservice" API), and download the additional files
related to its served DSs upon a "DS deployed configuration version"
change. By doing so we deal with the scalability issue discussed earlier,
as the server does not need to bring the entire configuration again when a
single DS is changed. We further remove the need to duplicate DS
information in the DB (no longer saving the DS configuration per serving
cache).
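The "list of DS + version" idea can be sketched as a manifest diff: the cache compares the manifest it last applied against the one Traffic Ops now serves, and re-fetches only the delivery services whose deployed version changed. The manifest shape is a hypothetical illustration:

```python
# Sketch: decide which per-DS config files a cache must re-fetch, given
# old and new {ds_id: config_version} manifests. Illustrative only.

def ds_to_refetch(applied: dict, current: dict) -> set:
    """DSs that are newly assigned or whose deployed version changed."""
    return {ds for ds, ver in current.items() if applied.get(ds) != ver}

applied = {"movies": 3, "news": 7}
current = {"movies": 3, "news": 8, "sports": 1}  # news bumped, sports added
print(sorted(ds_to_refetch(applied, current)))   # ['news', 'sports']
```

This is what removes the need to re-download the whole configuration on a single-DS change: the unchanged entries simply diff out.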

2 issues I see with what I'm suggesting here:

   1. "remap.config" is cache-instance specific, for example as it may hold
   the hostname in the remap rule.
   This, however, can be changed, for example by holding the relevant "meta"
   in the suggested JSON and letting astat glue things together.
   Specifically for the hostname in the remap rule, we may
   consider replacing the machine name with a "*".
   2. "remap.config" is a single file that covers all delivery services, so
   how can this file be built from multiple DSs?
   As the file can now be built of multiple "include" statements directing
   to other remap files (which can be held per DS), this is a non-issue.
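Issue 2 - one remap.config built from per-DS pieces - amounts to simple fragment assembly on the cache side, whether done with literal include statements or by gluing the fragments together at apply time. The file layout and rule text below are illustrative assumptions:

```python
# Sketch: assemble a remap.config from per-DS fragment files.
# The ".remap"-per-DS layout and the rules are hypothetical; only the
# assembly idea matters.
import pathlib
import tempfile

def assemble_remap(fragment_dir: str) -> str:
    """Concatenate one remap fragment per DS, sorted for a stable result."""
    parts = []
    for frag in sorted(pathlib.Path(fragment_dir).glob("*.remap")):
        parts.append(f"# --- {frag.name} ---")
        parts.append(frag.read_text().strip())
    return "\n".join(parts) + "\n"

with tempfile.TemporaryDirectory() as d:
    (pathlib.Path(d) / "ds1.remap").write_text(
        "map http://ds1.example.com/ http://origin1.example.com/\n")
    (pathlib.Path(d) / "ds2.remap").write_text(
        "map http://ds2.example.com/ http://origin2.example.com/\n")
    print(assemble_remap(d))
```

Rolling back a single DS then means swapping only that DS's fragment and reassembling, leaving the other DSs' rules untouched.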

What do you think?
Nir

On Tue, Apr 11, 2017 at 7:44 PM, Gelinas, Derek 
wrote:

> Good questions.  Today, when servers are queued, they only check what is
> currently in the database.  This means you need to finish your work and not
> start additional work before the queued run has completed.  With the
> definition files, we will only create a new definition version when work is
> complete.  Once the snapshot is finished, work in Traffic Ops can proceed
> while configurations are loaded without changing those configurations.
>
> Tiered updates aren’t really required, but I understand how in some
> situations they might be desired.  We could make it an optional flag to
> enable/disable, I suppose.
>
> With regards to sending an update to a specific server… That will require
> some thought.  As envisioned that ability would go away.  There might be a
> way to stage these changes – snapshot the config, but not make it
> “active.”  It could then be set as active on specific hosts, tested, and
> made active for the rest of the CDN.  I don’t see any reason it wouldn’t be
> possible.
>
> With regards to unassigning a delivery service from a cache, it’d be the
> same method as today – unassign, snapshot CRConfig, then snapshot the cache
> config.  I’m thinking some interface work might go a long way to making the
> entire process more clear and accessible, too.
>
> Derek
>
> On 4/11/17, 10:54 AM, "Eric Friedrich (efriedri)" 
> wrote:
>
> A few questions/thoughts, apologies for not in-lining:
>
> 1) If we move away from individually queued updates, we give up the
> ability to make changes and then selectively deploy them. How often do TC
> operations teams make config changes but do not immediately queue updates.
> (I personally think that we currently have a bit of a tricky situation
> where queuing updates much later can push down an unknowingly large config
> change to a cache- i.e. many new DS added/removed since last time updates
> were queued maybe months earlier). I wouldn’t be sad to see queue updates
> go away, but don’t want to cause hardship on operators using that feature.
>
> 2) If we move away from individually queued updates, how does that
> affect the implicit “config state machine”? Specifically, how will edges
> know when their parents have been configured and are ready for service?
> Today we don’t config an edge cache with a new DS unless the mid is ready
> to handle traffic as well.
>
> 3) If we move away from individually queued updates, how do we do
> things like 

Re: Traffic-Control Official PostgreSQL Version

2017-04-05 Thread Nir Sopher
Thanks,
Nir

On Tue, Apr 4, 2017 at 11:27 PM, Dewayne Richardson <dewr...@gmail.com>
wrote:

> We have started with 9.6.x, so I'd say we should assume > 9.6.x at the
> moment.
>
> On Tue, Apr 4, 2017 at 2:09 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi,
> >
> > Is there an official required PostgreSQL version?
> >
> > One reason it may be important is that supported syntax is added in newer
> > versions.
> > For example, recently one of the DB migrations scripts was added with
> "ADD
> > COLUMN IF NOT EXISTS" statement, which is considered a syntax error (and
> > breaks "admin.pl setup") in PostgreSQL versions < 9.6.
> >
> > Thanks,
> > Nir
> >
>


Traffic-Control Official PostgreSQL Version

2017-04-04 Thread Nir Sopher
Hi,

Is there an official required PostgreSQL version?

One reason it may be important is that supported syntax is added in newer
versions.
For example, recently one of the DB migration scripts was added with an "ADD
COLUMN IF NOT EXISTS" statement, which is a syntax error (and
breaks "admin.pl setup") in PostgreSQL versions < 9.6.
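
A version-portable alternative is a guarded DO block, which works on any PostgreSQL 9.0+; this is only a sketch, and the table and column names below are illustrative, not taken from the actual migration:

```sql
-- Equivalent of "ADD COLUMN IF NOT EXISTS" for PostgreSQL < 9.6:
-- check the catalog first, then alter only if the column is absent.
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'server' AND column_name = 'guid') THEN
        ALTER TABLE server ADD COLUMN guid text;
    END IF;
END
$$;
```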

Thanks,
Nir


Re: TO - Queue server updates on delivery service servers only?

2017-02-07 Thread Nir Sopher
Hi Jeremy,

As I see it, we need to distinguish delivery-service changes that require a
cr-config update from those that do not.

In the simple case, where the change does not affect the cr-config, a
"queue-update" of a delivery service may be simple: just go over the
list of servers assigned to the DS and queue an update for each of them.
Nevertheless, my understanding is that this solution requires further
improvements in order to allow dealing with more than one delivery service
at a time.

The server-removal issue you discussed is one of the cases in which the
change also affects the cr-config - and there, I believe, things get more
complicated: the cr-config is managed at the CDN level, and a solution dealing
only with the queue-updates is more dangerous, as it may cause
inconsistencies and "black holes".

Still, I believe these issues can be resolved along with TC-130 by
applying the ideas discussed in
the two "Streamlining TC management and operations sequences" threads.
I would take the opportunity to thank Eric again for his inputs on these
threads.

When trying to highlight the foundations of the solution I would say:

   1. Manage separately the configuration per delivery-service (in both
   queue-update and cr-config)
   2. Define some kind of "configuration generation", incremented on a DS
   apply and managed per DS.
   3. Keep track of each traffic-server - knowing/learning the held
   configuration generation for each DS.

Using these tools, when applying the configuration of a delivery service,
the system drives each traffic server to:
1. Hold the latest generation, if the traffic server is assigned to the
delivery service.
2. Otherwise, hold no configuration for this delivery service (== "0" generation).

As the system knows which generation is *currently* held by which server,
as well as which generation *should* be held by which server, it can update
only the servers that need to be updated.
Specifically, for the case you described, servers that are un-assigned from
the service would be updated as well, to remove the configuration.
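
A minimal sketch of the reconciliation this implies (Python for illustration only; the servers, delivery services, and generation numbers are all made up):

```python
# Desired state: latest applied configuration generation per delivery service.
desired = {"ds1": 7, "ds2": 3}
# Which servers are currently assigned to each DS.
assigned = {"ds1": {"edge1", "edge2"}, "ds2": {"edge2"}}
# Observed state: the generation each cache reports holding, per DS.
held = {
    "edge1": {"ds1": 7},
    "edge2": {"ds1": 6, "ds2": 3},   # behind on ds1
    "edge3": {"ds1": 7},             # was un-assigned from ds1
}

def needs_update(server, ds):
    # Assigned servers should hold the latest generation; un-assigned
    # servers should hold no configuration at all (the "0" generation).
    want = desired[ds] if server in assigned[ds] else 0
    return held.get(server, {}).get(ds, 0) != want

stale = {(s, ds) for ds in desired for s in held if needs_update(s, ds)}
# edge2 must be refreshed; edge3 must drop the ds1 config entirely.
```

This covers the un-assignment case from the parent message: the removed server differs from its desired ("0") generation, so it is queued for an update too.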

What do you think?
Nir

On Fri, Feb 3, 2017 at 4:50 PM, Jeremy Mitchell 
wrote:

> The more I think about this, the more I think it's not as straightforward
> as I think to queue updates for a delivery service (meaning to queue
> updates for all the caches employed by the ds).
>
> For example, imagine you had a delivery service that employed 10
> caches and you removed 2 caches from the ds... then queued updates for
> the ds. This would queue updates for the 8 remaining caches, but really
> the 2 caches you removed are the ones that need the updates...
>
> So unless told otherwise or someone has a solution to this problem (I can't
> think of one currently), I'm gonna hold off on that issue.
>
> Jeremy
>
> On Wed, Feb 1, 2017 at 8:58 PM, Jeremy Mitchell 
> wrote:
>
> > Created this issue. Would like input from more people to ensure this is a
> > good idea and I'm not overlooking something...
> >
> > https://issues.apache.org/jira/browse/TC-129
> >
> > On Tue, Jan 31, 2017 at 10:47 AM, Eric Friedrich (efriedri) <
> > efrie...@cisco.com> wrote:
> >
> >> Yes, when modifying a DS, it would be useful to have an API to queue on
> >> just servers relevant to that DS.
> >>
> >> —Eric
> >>
> >> > On Jan 31, 2017, at 12:36 PM, Jeremy Mitchell  >
> >> wrote:
> >> >
> >> > Currently, you can queue updates:
> >> >
> >> > - for a specific server
> >> > - for a specific cachegroup (all the servers in that cachegroup)
> >> > - for a specific cdn (all the servers in that cdn)
> >> >
> >> > but you can't queue updates for:
> >> >
> >> > - a specific delivery service (all the servers explicitly (edges) or
> >> > implicitly (mids) assigned to that DS)
> >> >
> >> > Does it make sense to add this functionality? At least to the API?
> >> >
> >> > I "think" when a delivery service change is made, it is common
> practice
> >> to
> >> > queue updates on the ENTIRE cdn that the DS belongs to. This seems
> like
> >> it
> >> > would unnecessarily queue updates for a lot of servers that don't
> >> belong to
> >> > the DS.
> >> >
> >> > I know we may move away from the queue update approach at some point
> but
> >> > does this functionality make sense in the short term?
> >> >
> >> > Jeremy
> >>
> >>
> >
>


Re: Streamlining TC management and operations sequences

2017-02-02 Thread Nir Sopher
Hi Eric,
Actually, as we imagined it, a "generation" is created only when a new
configuration is applied - when the "consistent hash" is permanently
modified.

I'll open a separate thread to discuss the technical details further,
including an algorithm we have in mind.

I also opened TC-130 - Streamlining TC management and operations sequences
<https://issues.apache.org/jira/browse/TC-130> to further monitor the issue.

Would appreciate community input about the issue, especially discussion of
the pros and cons of the two different approaches: a
Traffic Ops orchestrated solution vs. a more flexible, traffic-router-algorithm-based
solution.

Nir




On Wed, Feb 1, 2017 at 3:33 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> Hey Nir-
>   Interesting thought for sure.
>
> Would TM “health changes” (loss of connectivity, BW/loadavg too high)
> change the generation count? It seems like the answer is Yes, because the
> health of a cache impacts the state of the consistent hash ring.
>
> If so, how do these generation changes get from the Traffic Monitor to the
> caches, when config changes typically come only from Traffic Ops and only
> when ORT is run?
>
> Or maybe the generation count is just an abstraction to conceptualize the
> problem space and not a literal approach?
>
> —Eric
>
> > On Feb 1, 2017, at 4:14 AM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Hi Eric,
> >
> > Formalizing the approach you suggested, one may introduce the concept of
> a
> > delivery-service configuration "generation" which would be an ordinal
> > identifier for the a delivery service configuration. A "generation"
> changes
> > whenever the remap rule changes or the consistent hash mapping of content
> > to server changes (e.g. due to additional server assignment).
> > I such a solution, each traffic-server may hold a single generation for
> > each delivery service configuration, while traffic-router may hold a
> > history of generations and know which server holds which configuration
> > generation.
> >
> > This approach introduces a considerable flexibility. It allows
> > configurations to be set one after the other with no need to wait between
> > them.
> > It also fits well with Jeremy's suggestion for queue-update with a
> delivery
> > service granularity.
> >
> > On the other hand, complicated algorithms for solving the issue may
> impose
> > more risk to the network when applied, comparing to a simple
> "traffic-ops"
> > orchestrated solution.
> >
> > I'm not sure what is preferable from an operator point of view. I'm also
> > not familiar with TC 3.0 configuration solution to validate he different
> > approaches against.
> >
> > Please share your thoughts,
> > Thanks,
> > Nir
> >
> > On Tue, Jan 31, 2017 at 6:26 PM, Eric Friedrich (efriedri) <
> > efrie...@cisco.com> wrote:
> >
> >> What about an approach (apologies, still light on details), where TR
> >> (perhaps still via TM) discovers the availability of delivery services
> from
> >> the cache itself, rather than from the CRConfig file? (Astats or its
> >> remap_stats based replacement would publish its remap rules)
> >>
> >> Any changes to the set of servers (add/remove) or DS assignments would
> not
> >> require a specific step to push a changed config to the router. If a
> cache
> >> does not yet, or no longer has remap rules for a specific delivery
> service,
> >> then TR will not see that rule advertised by the cache and will not
> send it
> >> traffic. If adding or removing a server, TM still needs to be updated to
> >> learn about the new server.
> >>
> >> With current configuration, theres a race condition of a few seconds
> where
> >> a cache removes remap rule before TM polls and TR gets health info from
> TM.
> >> In these few seconds, TR would erroneously send traffic to a cache
> without
> >> a proper remap rule.
> >>
> >> We could fix this by
> >>  a) advertising a state of the remap rule in astats to notify TR no
> >> longer to send traffic on that DS for a short period before the rule is
> >> actually removed - all handled inside of ORT).
> >>or
> >>  b) prematurely removing the remap rule from astats, before the config
> on
> >> TS is actually updated (at the cost of missing the final few remap stats
> >> numbers). This is probably unacceptable.
> >>
> >> I’m sure there are other variants on this, but my main goal is for TR t

Re: Server's parameters - adding some logic

2017-02-01 Thread Nir Sopher
Indeed, under these circumstances, further investment in this area does not
seem very productive.
Thank you Dewayne, both for your answer and further discussing the issue,
Nir

On Tue, Jan 31, 2017 at 11:43 PM, Dewayne Richardson <dewr...@gmail.com>
wrote:

> Hi Nir,
>
> The server hardware tab we plan on removing eventually.  It was a feature
> that was added long before Traffic Ops existed back then we called it
> Twelve Monkeys.  Once we move toward TC 3.0, we plan on building "bare
> essential" Traffic Ops where we will (initially) leave behind several
> features that can be managed by other ThirdParty software.
>
> The TO Hardware UI and database tables will be one of those candidates.
> So, I wouldn't invest a lot of time populating that area of the application
> (unless you have a valid use case).
>
> > I am basically looking for a way to decouple unrelated configurations
> one from
> the other - avoiding the need to add more profiles due to relatively small
> differences.
>
> We have had design discussions in relation to this question, but Traffic
> Ops doesn't currently support your ask.  So, TO doesn't support multiple
> profiles per server.
>
>
> Thanks,
>
>
> Dewayne
>
> On Mon, Jan 30, 2017 at 3:11 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Thanks Mark,
> > Can you please provide an example to explain how "parameters to
> > cache-group" assignment is used?
> >
> > I assumed the "parameter to cache-group" assignment gives an additional
> > dimension to the profiles table - allowing a server to have more than one
> > profile.
> > I now understand this is not the case.
> >
> > I am basically looking for a way to decouple unrelated configurations one
> > from the other - avoiding the need to add more profiles due to relatively
> > small differences.
> >
> > As I see it, server's configuration parameters are derived from a few
> > aspects:
> >
> >1. The server's main functionality: edge vs mid
> >2. The server's HW "vendor".
> >For example: block devices naming conventions is different for KVM
> >(/dev/vd?) and VmWare (/dev/sd?) .
> >3. The server's HW variant strength/capacity.
> >For example: storage capacity and throughput  - per storage tier;
> >interface speed; possible CPU related bounds.
> >Note that these differences can derive from different platform
> >specifications, but also form the HW generation: Dell R730 platforms
> are
> >able to contain more disks than the previous generation R720
> platforms.
> >4. The server software supported capabilities or settings.
> >For example: Live, SSL, debug, etc.
> >Note that these capabilities can also derive from the cache software
> and
> >version - for example, ATS version supported plugins.
> >
> > IIUC, currently each server is attached with a single profile, and
> > theoretically, each of the above aspects multiply the amount of profiles.
> > I believe that in real life, within a single ISP, the matrix is sparsed
> and
> > the number of profiles is limited and controlled.
> >
> > Still, I believe that an ability to attach multiple profiles to a single
> > server, each focused on a different aspect, would be beneficial. It ill
> > allow:
> >
> >1. The decoupling of server's hardware and software, as already
> >suggested as a future feature in traffic-control documentation.
> >2. Better view of capability related parameters, allowing a better
> >understanding of which parameter value is derived from which
> capability.
> >
> > Can you please share your view on the subject?
> > What is the status of the hardware and software profiles decoupling?
> >
> > Thanks,
> > Nir
> >
> > On Mon, Jan 30, 2017 at 12:40 AM, Mark Torluemke <mtorlue...@apache.org>
> > wrote:
> >
> > > On Sun, Jan 29, 2017 at 2:02 PM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > > Thanks Robert.
> > > >
> > > > Indeed. The parameters-profile mechanism practically solves the
> > > scalability
> > > > issue.
> > > > BTW, I noticed there is an ability to assign a variable to a cache
> > group.
> > > > Can someone please elaborate on it? I tried it, setting a variable in
> > > > records.config and examining the result file, but it did not work
> they
> > > way
> > > > I expected.
> > > >
> > >
> > > These parameters are on the cache group itself, not

Re: Streamlining TC management and operations sequences

2017-02-01 Thread Nir Sopher
Hi Eric,

Formalizing the approach you suggested, one may introduce the concept of a
delivery-service configuration "generation", which would be an ordinal
identifier for a delivery service configuration. A "generation" changes
whenever the remap rule changes or the consistent hash mapping of content
to server changes (e.g. due to additional server assignment).
In such a solution, each traffic server may hold a single generation for
each delivery service configuration, while the traffic router may hold a
history of generations and know which server holds which configuration
generation.

This approach introduces considerable flexibility. It allows
configurations to be set one after the other, with no need to wait between
them.
It also fits well with Jeremy's suggestion for queue-update with a delivery
service granularity.

On the other hand, complicated algorithms for solving the issue may impose
more risk to the network when applied, compared to a simple "traffic-ops"
orchestrated solution.

I'm not sure what is preferable from an operator point of view. I'm also
not familiar enough with the TC 3.0 configuration solution to validate the
different approaches against it.

Please share your thoughts,
Thanks,
Nir
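
Eric's discovery-based idea (quoted below) boils down to the traffic router computing DS eligibility from what the caches actually advertise, rather than from CRConfig. A toy illustration (Python; cache and DS names are invented):

```python
# Remap rules each cache actually advertises (e.g. via astats or a
# remap_stats-based replacement), keyed by cache name.
advertised = {
    "edge1": {"ds-video", "ds-images"},
    "edge2": {"ds-video"},          # ds-images rule not (or no longer) installed
}

def eligible(ds):
    # TR routes a DS only to caches advertising a remap rule for it.
    return sorted(cache for cache, rules in advertised.items() if ds in rules)

eligible("ds-images")  # -> ["edge1"]: edge2 receives no ds-images traffic
```

The race condition Eric describes corresponds to the window between a cache dropping a rule from `advertised` and TR learning about it.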

On Tue, Jan 31, 2017 at 6:26 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> What about an approach (apologies, still light on details), where TR
> (perhaps still via TM) discovers the availability of delivery services from
> the cache itself, rather than from the CRConfig file? (Astats or its
> remap_stats based replacement would publish its remap rules)
>
> Any changes to the set of servers (add/remove) or DS assignments would not
> require a specific step to push a changed config to the router. If a cache
> does not yet, or no longer has remap rules for a specific delivery service,
> then TR will not see that rule advertised by the cache and will not send it
> traffic. If adding or removing a server, TM still needs to be updated to
> learn about the new server.
>
> With current configuration, theres a race condition of a few seconds where
> a cache removes remap rule before TM polls and TR gets health info from TM.
> In these few seconds, TR would erroneously send traffic to a cache without
> a proper remap rule.
>
> We could fix this by
>   a) advertising a state of the remap rule in astats to notify TR no
> longer to send traffic on that DS for a short period before the rule is
> actually removed - all handled inside of ORT).
> or
>   b) prematurely removing the remap rule from astats, before the config on
> TS is actually updated (at the cost of missing the final few remap stats
> numbers). This is probably unacceptable.
>
> I’m sure there are other variants on this, but my main goal is for TR to
> directly learn from the caches which delivery services they actually have
> available. Rather than the TR learning what TO only thinks each cache has
> available.
>
> —Eric
>
>
>
>
>
> > On Jan 31, 2017, at 8:10 AM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Hi,
> >
> > In order to further improve the simplicity and robustness of the control
> > path for provisioning infrastructure and delivery services, we are
> > currently considering ways to streamline management and operations.
> >
> > Currently, when applying changes in traffic-control that require the
> > synchronization between the traffic-router and traffic-servers, the user
> > should be conscious to do so in a certain order. Otherwise, "black holes"
> > may be created. Furthermore, in some of the scenarios the user have to
> wait
> > and verify that the configuration reached all traffic server before he
> may
> > apply it to the traffic-router.
> >
> > We have noticed that TC-3.0 is planned to include a "Config State
> Machine",
> > probably dealing with the issue thoroughly. We have no further
> information
> > about this bullet and would appreciate any additional info.
> >
> > We would like to start investing in making TC operations more streamline,
> > robust and user-friendly.
> >
> > The main use-cases we would like to address at this point are:
> >
> >   1. Assign servers to a Delivery-Service.
> >   For this operation, the configuration must first be applied to the
> added
> >   traffic servers, propagate, and only then applied to the
> traffic-router.
> >   2. Remove servers assignment to a Delivery-Service.
> >   For this operation, the configuration must first be applied to the
> >   traffic-router, and only then to the traffic-servers.
> >   3. Add a new delivery service.
> >   This is practically a private case of servers assignment to a
> &g

Re: Server's parameters - adding some logic

2017-01-29 Thread Nir Sopher
Thanks Robert.

Indeed. The parameters-profile mechanism practically solves the scalability
issue.
BTW, I noticed there is an ability to assign a variable to a cache group.
Can someone please elaborate on it? I tried it, setting a variable in
records.config and examining the result file, but it did not work the way
I expected.

Nir

-- Forwarded message --
From: "Robert Butts" <robert.o.bu...@gmail.com>
Date: Jan 26, 2017 5:58 PM
Subject: Re: Server's parameters - adding some logic
To: <dev@trafficcontrol.incubator.apache.org>
Cc:

> I think that we should not attempt to invent a scripting language for this
purpose.
>
> My guess is that Lua <https://www.lua.org/about.html> is a good candidate
for the job.

+1 on both counts.

Though I'm not convinced we need a scripting language in parameters yet.

> Separating into 2 profile is not scalable.

Creating or embedding a scripting language is a pretty big feature. You can
assign the same parameter to multiple profiles, so all the parameters the
profiles share can be attached to both and none need to be duplicated.
Arguably, this scenario is exactly why we have the
Parameters-Profiles system.
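
For context, the conditional-evaluation proposal quoted further down could be prototyped without a full scripting language. A rough Python sketch (the `__NAME__` placeholder syntax comes from the quoted proposal; the bracket-drop semantics are an assumption):

```python
import re

def render(value, server_vars):
    """Substitute __NAME__ placeholders from per-server variables;
    drop an optional [...] segment whose placeholders are all unset."""
    def optional(m):
        inner = m.group(1)
        names = re.findall(r"__(\w+)__", inner)
        if names and all(not server_vars.get(n) for n in names):
            return ""                     # variable unset: omit the segment
        return "[" + inner + "]"
    value = re.sub(r"\[([^\]]*)\]", optional, value)
    # Plain substitution for the remaining placeholders.
    return re.sub(r"__(\w+)__", lambda m: server_vars.get(m.group(1), ""), value).strip()

render("__CACHE_IPv4__ [__CACHE_IPv6__]",
       {"CACHE_IPv4": "10.0.0.1", "CACHE_IPv6": "fc00::1"})  # -> "10.0.0.1 [fc00::1]"
render("__CACHE_IPv4__ [__CACHE_IPv6__]",
       {"CACHE_IPv4": "10.0.0.1"})                           # -> "10.0.0.1"
```

This handles the IPv6-unset case from the proposal without inventing new condition syntax, at the cost of only supporting "omit when unset" rather than general expressions.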

On Thu, Jan 26, 2017 at 5:10 AM, Oren Shemesh <or...@qwilt.com> wrote:

> I think that we should not attempt to invent a scripting language for this
> purpose.
>
> My guess is that Lua <https://www.lua.org/about.html> is a good candidate
> for the job.
> "Lua is a powerful, efficient, lightweight, embeddable scripting
language".
> It can be embedded in all popular languages, specifically in perl
> <http://search.cpan.org/~vparseval/Inline-Lua-0.03/lib/Inline/Lua.pm>and
> (More relevant, I think) in go
> <https://www.google.co.il/webhp?sourceid=chrome-instant;
> rlz=1C1LENP_enIL506IL506=1=2=UTF-8#q=calling+lua+from+go>
> .
>
>
>
> On Thu, Jan 26, 2017 at 12:51 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi,
> >
> > Working on TC-121 <https://issues.apache.org/jira/browse/TC-121>,
> allowing
> > variables to be evaluated as part of a traffic-server parameter, made me
> > realize that the simple solution of variable substitution is might not
be
> > strong enough.
> >
> > As an example, lets take traffic server ip bind configuration.
> > Setting :
> > LOCAL proxy.local.incoming_ip_to_bind
> > to be:
> > STRING __CACHE_IPv4__ [__CACHE_IPv6__]
> >
> > If the server's IPv6 address is set, it will work nicely.
> > But if the IPv6 is not set, we will end up with an invalid
configuration.
> >
> > As far as I know, a single profile cannot support both use-cases.
> > Separating into 2 profile is not scalable. Splitting a profile for this
> > purpose may result with many profiles, with small differences between
> them,
> > which are hard to follow and identify.
> >
> > I would like to suggest an improvement that
> >
> >- Allow a parameter to be optional.
> >- Allow some logic in the evaluation of the parameter's value.
> >
> > This can be achieved by using expressions to be evaluated in the
> > parameter's value.
> > The syntax of course, needs to be discussed, but lets say for example
> that:
> > __COND_BEGIN__/__COND_END__ delimits a condition to be evaluated:
> > One may set a value to be:
> > STRING __CACHE_IPv4__ __COND_BEGIN__ __CACHE_IPv6__!="" ?
> [__CACHE_IPv6__]
> > :
> >  ""__COND_END__
> > Having the IPv6 part in the result only if set.
> >
> > Furthermore, a special evaluation result string (e.g. __NA__) may
> identify
> > parameters that should be omitted from the server's configuration.
> >
> >
> > I would appreciate your view on the issue.
> >
> > Thanks,
> > Nir
> >
>
>
>
> --
>
> *Oren Shemesh*
> Qwilt | Work: +972-72-2221637| Mobile: +972-50-2281168 | or...@qwilt.com
> <y...@qwilt.com>
>


Re: Traffic Ops Dev environment - postgresql installation

2017-01-24 Thread Nir Sopher
Great:)
Thank you for the info.
Especially for the development-mode auto-restart on change - it makes
things much more convenient,
Nir
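
Summarizing the environment workflow Dan describes below (commands as given in this thread; paths assumed relative to traffic_ops/app):

```shell
# One database per environment: to_development, to_test, to_integration,
# and traffic_ops for production.
./db/admin.pl --env=development setup   # then ./bin/start.pl (auto-restarts on changes)
./db/admin.pl --env=test setup          # then: prove t
./db/admin.pl --env=integration setup   # then: prove t_integration
./db/admin.pl --env=production setup    # when running from /opt/traffic_ops
```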

On Tue, Jan 24, 2017 at 11:22 PM, Dan Kirkwood <dang...@gmail.com> wrote:

> well done!
>
> admin.pl creates a separate database for each of those environments:
> to_test, to_integration, to_development, and traffic_ops (for
> production).   So yes -- they can all live together in the same
> postgres installation.   To initialize the database for running
> `bin/start.pl`,  you should run `db/admin.pl --env=development`.To
> run from the installed directory (/opt/traffic_ops),  you should run
> with `--env=production`.   And, as you said,  `--env=test` and
> `--env=integration` for running unit tests and integration tests,
> respectively.
>
> The advantage of running development (using `./bin/start.pl`) is that
> it monitors the Perl libraries and automatically restarts the server
> when it detects any changes to them.
>
> I hope that's helpful -- do let us know how you're progressing..
>
> -dan
>
> On Tue, Jan 24, 2017 at 2:04 PM, Nir Sopher <n...@qwilt.com> wrote:
> > Thank you Dan,
> > Indeed, moving to postgres would be the right choice, as I want to test
> the
> > changes on the branch I submit to.
> > I already tested today my dev env with TC 1.8, and I now have some
> > confidence in my its bringup protocol so I can move to a less stable
> branch.
> >
> > I used the command you sent.
> > Additionally I had to add a database with the same name, and adjust
> > "pg_hba.conf".
> > My traffic-ops is now up :)
> >
> > I assume (and tried it out), that:
> >
> >1. In order to run "prove t" I need to run "./db/admin.pl --env=test
> >setup"
> >2. In order to run "prove t_integration" I need to run "./db/admin.pl
> >--env=integration setup"
> >3. In order to launch traffic-ops  I need to run "./db/admin.pl
> >--env=development setup"
> >
> > Am I correct?
> > It looks like these admin.pl injected data can live together in the
> same DB
> > without a conflict. Is it true, or should I drop the DB / replace setup
> > when moving from one env to another?
> >
> > Thanks again.
> > Nir
> >
> >
> > On Tue, Jan 24, 2017 at 5:56 PM, Dan Kirkwood <dang...@gmail.com> wrote:
> >
> >> sorry -- prematurely sent..
> >>
> >> Hi Nir,
> >>
> >> It probably is best to continue with postgres rather than starting with
> >> mysql..
> >>
> >> You need to be running as a user that has superuser privilege on the
> >> postgres db to run the `admin.pl setup`.   Try this:
> >>
> >> sudo su - postgres createuser -s -r -d -E 
> >>
> >> and then try the `db/admin.pl ... setup` command again..
> >>
> >> If you still have problems,  please send the command and output you're
> >> seeing and we'll try to help move you along further..
> >>
> >> -dan
> >>
> >> On Tue, Jan 24, 2017 at 8:50 AM, Dan Kirkwood <dang...@gmail.com>
> wrote:
> >> > Hi Nir,
> >> >
> >> > It probably is best to continue with postgres rather than starting
> with
> >> mysql..
> >> >
> >> > You need to be running as a user that has superuser privilege on the
> >> > postgres db to run the `admin.pl setup`.   Try this:
> >> >
> >> > sudo su postgres createuser -s
> >> >
> >> > On Tue, Jan 24, 2017 at 8:27 AM, David Neuman <
> david.neuma...@gmail.com>
> >> wrote:
> >> >> First of all, it looks like your documentation is to our old site,
> you
> >> will
> >> >> want to use http://trafficcontrol.apache.org/docs/latest/index.html
> in
> >> the
> >> >> future.
> >> >> If you don't have docker and docker-compose on your VM (it would need
> >> to be
> >> >> centos 7.x or above), we should be able to get it working with a
> >> "normal"
> >> >> postgres install; I would start by taking a look at the scripts that
> >> are in
> >> >> `/traffic_control/traffic_ops/app/db/pg-migration`.  Maybe @dangogh
> is
> >> >> familiar enough with the process that he can provide a quick how-to?
> >> >>
> >> >> On Mon, Jan 23, 2017 at 5:14 PM, Nir Sopher <n...@qwilt.com> wrote:
> >> >>
> >> >>> Thank yo

Traffic Ops - Running Test Cases

2017-01-23 Thread Nir Sopher
Hi,

I am trying to run "prove t/" in order to test my branch after a minor
change I did.
However, the command keeps failing due to missing Perl modules (not
brought in as part of the "carton" command). For example: Mojo::Base,
NetAddr::IP, File::Slurp, String::CamelCase, 

Is there somewhere a list of modules to be installed in order to allow the
UTs to run?

10x,
Nir


Re: Issues with using Traffic-Vault

2017-01-20 Thread Nir Sopher
Hi,

The traffic server is pulling the keys from traffic ops
(reading api/1.2/cdns/name/nirs-tc1-cdn/sslkeys.json).
However, the certificates are not saved in the ssl directory.

The ORT script seems to verify, for each certificate in sslkeys.json,
whether it matches an ssl_key_name in ssl_multicerts.config.
It ends up comparing
ccr.ynet-images.nirs-tc1-cdn.tc-dev.qwilt.com
with
*.ynet-images-3.nirs-tc1-cdn.tc-dev.qwilt.com

The comparison failed and therefore no certificate was written...

I replaced in ORT:
$record->{'hostname'} eq $remap
with
Text::Glob::match_glob($record->{'hostname'}, $remap)

And problem was solved.
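
The fix swaps an exact string compare for a glob match. A Python analogue (hostnames illustrative; the real values come from sslkeys.json and ssl_multicerts.config):

```python
from fnmatch import fnmatch

cert_name = "*.ynet-images.nirs-tc1-cdn.tc-dev.qwilt.com"    # wildcard name on the cert
remap_fqdn = "ccr.ynet-images.nirs-tc1-cdn.tc-dev.qwilt.com" # DS routing name

exact = cert_name == remap_fqdn           # False: 'eq' can never match a wildcard name,
                                          # so ORT skipped writing the certificate
globbed = fnmatch(remap_fqdn, cert_name)  # True: glob matching treats '*' as a wildcard
```

`Text::Glob::match_glob` in the Perl fix plays the role of `fnmatch` here.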

Any idea what is the root of the issue? Any chance I'm encountering an ORT /
Traffic-Ops version compatibility issue?

Thank You & have a nice weekend,
Nir



On Jan 20, 2017 3:34 AM, "Dave Neuman" <neu...@apache.org> wrote:

> So, is ORT getting the certs from traffic vault like it should now?
>
> On Thu, Jan 19, 2017 at 3:16 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Yes, the parameter is set correctly.
> > The ssl_multicert.config file is on the server in the specified
> directory.
> > The /opt/trafficserver/etc/trafficserver/ssl/ directory however is
> > missing.
> > Thanks,
> > Nir
> >
> > On Thu, Jan 19, 2017 at 11:44 PM, Dave Neuman <neu...@apache.org> wrote:
> >
> > > The certificates should be put on the cache by ORT.  Do you have a
> > location
> > > parameter for ssl_multicert.config?  If not, you will need to create
> that
> > > and assign it to your EDGE profile in order for ORT to know to get the
> > > certificates.
> > > Param Name = location
> > > Config File Name = ssl_multicert.config
> > > Value =  /opt/trafficserver/etc/trafficserver
> > >
> > > On Thu, Jan 19, 2017 at 2:19 PM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > > OK!
> > > > Thank you!
> > > >
> > > > After applying the patch, the curl command indeed showed me the
> > > > certificates.
> > > > The traffic-server ort script run "successfully", pulling
> > > > ssl_multicert.config.
> > > >
> > > > However when trying to work with https, I got an SSL error due to a
> > > missing
> > > > certificate on the servers. This was the case for both traffic router
> > and
> > > > traffic-server.
> > > > Furthermore, the traffic router went insane...
> > > >
> > > > I then created a new traffic router, and it apparently pulled the
> > > > certificates. The redirects worked perfectly.
> > > > Still my traffic server was missing the certificates
> themselves.Adding
> > a
> > > > new traffic server did not help. it still had the problem.
> > > >
> > > > I worked around the problem by creating the etc/trafficserver/ssl
> > > directory
> > > > on the traffic-server, and placing there a self signed certificate
> with
> > > the
> > > > proper names.
> > > >
> > > > Any idea why the certificates did not get to the server?
> > > > I did not find any related message in the ort script output. Is it
> the
> > > one
> > > > that should bring the certs?
> > > >
> > > > Thank you again,
> > > > Nir
> > > >
> > > >
> > > > However, the certificates
> > > >
> > > > On Thu, Jan 19, 2017 at 5:02 PM, Dave Neuman <neu...@apache.org>
> > wrote:
> > > >
> > > > > Can you try curl -kvs "https://admin:password@riakURL
> > > > > :8088/search/query/sslkeys?wt=json&q=cdn:nirs-tc1-cdn" and let me
> > know
> > > > > what
> > > > > that returns?
> > > > > It should return to you the ssl certs for your delivery service. If
> > it
> > > > does
> > > > > not can you try to go into the “Paste Keys” screen in traffic ops,
> > > press
> > > > > the save button to save the SSL certs again, and then re-run the
> > curl?
> > > > > If they are still not showing up after that you may have hit a bug
> we
> > > > found
> > > > > earlier that is now fixed in master where the content-type isn’t
> set
> > > > > correctly on the PUT to Riak. The workaround is to change line 104
> of
> > > > > traffic_ops/app/lib/Connection/RiakAdapter.pm from return
> $ua->put(
> > > > $fqdn,
> > > > &

Re: Issues with using Traffic-Vault

2017-01-19 Thread Nir Sopher
OK!
Thank you!

After applying the patch, the curl command indeed showed me the
certificates.
The traffic-server ORT script ran "successfully", pulling
ssl_multicert.config.

However when trying to work with https, I got an SSL error due to a missing
certificate on the servers. This was the case for both traffic router and
traffic-server.
Furthermore, the traffic router went insane...

I then created a new traffic router, and it apparently pulled the
certificates. The redirects worked perfectly.
Still, my traffic server was missing the certificates themselves. Adding a
new traffic server did not help; it still had the problem.

I worked around the problem by creating the etc/trafficserver/ssl directory
on the traffic server, and placing a self-signed certificate with the
proper names there.

Any idea why the certificates did not get to the server?
I did not find any related message in the ort script output. Is it the one
that should bring the certs?

Thank you again,
Nir


On Thu, Jan 19, 2017 at 5:02 PM, Dave Neuman <neu...@apache.org> wrote:

> Can you try curl -kvs "https://admin:password@riakURL
> :8088/search/query/sslkeys?wt=json&q=cdn:nirs-tc1-cdn" and let me know
> what
> that returns?
> It should return to you the ssl certs for your delivery service. If it does
> not can you try to go into the “Paste Keys” screen in traffic ops, press
> the save button to save the SSL certs again, and then re-run the curl?
> If they are still not showing up after that you may have hit a bug we found
> earlier that is now fixed in master where the content-type isn’t set
> correctly on the PUT to Riak. The workaround is to change line 104 of
> traffic_ops/app/lib/Connection/RiakAdapter.pm from return $ua->put( $fqdn,
> Content => $value ); to return $ua->put( $fqdn, Content => $value,
> 'Content-Type'=> $content_type ); and restart traffic_ops. After you
> restart Traffic Ops go into the paste keys screen, save your keys again,
> and run the curl again.
> Let me know how it goes.
>
> Thanks,
> Dave
> ​
>
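The workaround Dave describes boils down to sending an explicit Content-Type header on the PUT to Riak. A minimal sketch of the idea in Python (illustrative only — the real code is Perl in RiakAdapter.pm, and the function name and URL here are made up):

```python
def build_put_request(url, value, content_type="application/json"):
    """Sketch of the fixed PUT: without an explicit Content-Type,
    Riak stored the keys but its Solr index never picked them up."""
    return {
        "method": "PUT",
        "url": url,
        "body": value,
        # This header is the whole fix described above.
        "headers": {"Content-Type": content_type},
    }

req = build_put_request("https://riak.example:8088/riak/ssl/ds-latest",
                        '{"certificate": "..."}')
print(req["headers"]["Content-Type"])
```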
> On Thu, Jan 19, 2017 at 7:46 AM, Steve Malenfant <smalenf...@gmail.com>
> wrote:
>
> > I'm probably not the one who can explain that to you, but I believe there
> > are additional settings in riak for TC >1.7. I've heard of enabling riak
> > search and new security parameters...
> >
> > On Thu, Jan 19, 2017 at 8:35 AM Nir Sopher <n...@qwilt.com> wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > After a reboot, key generation indeed works. Thank you:)
> > >
> > > However, the traffic server still encounters the issue:
> > >
> > > ERROR result for http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/
> > >
> > > name/nirs-tc1-cdn/sslkeys.json is: ...{"message":"No SSL certificates
> > > found
> > >
> > > for nirs-tc1-cdn"}...
> > >
> > > FATAL http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/
> > >
> > > name/nirs-tc1-cdn/sslkeys.json returned HTTP 404!
> > >
> > >
> > >
> > > Can it be that something is badly configured in my delivery-service? Or
> > >
> > > maybe in my traffic ops configuration?
> > >
> > > Maybe an RPM missing?
> > >
> > >
> > >
> > > Thank you both again.
> > >
> > > Nir
> > >
> > >
> > >
> > > On Thu, Jan 19, 2017 at 3:12 PM, Steve Malenfant <smalenf...@gmail.com
> >
> > >
> > > wrote:
> > >
> > >
> > >
> > > > Have you tried to simply restart Traffic Ops? We've seen ours (1.6)
> not
> > >
> > > > being able to create Certificates after a while.
> > >
> > > >
> > >
> > > > On Wed, Jan 18, 2017 at 11:10 PM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > >
> > >
> > > > > ERROR result for
> > > http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/name/
> > >
> > > > > nirs-tc1-cdn/sslkeys.json is: ...{"message":"No SSL certificates
> > found
> > >
> > > > for
> > >
> > > > > nirs-tc1-cdn"}...
> > >
> > > > > FATAL http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/name/
> > >
> > > > > nirs-tc1-cdn/sslkeys.json returned HTTP 404!
> > >
> > > > >
> > >
> > > > >
> > >
> > > > > On Thu, Jan 19, 2017 at 12:43 AM, Dave Neuman <neu...@apache.org>
> > > wrote:
> > >
&

Re: Issues with using Traffic-Vault

2017-01-18 Thread Nir Sopher
t;:"5","hostname":"*.
ynet-images.nirs-tc1-cdn.tc-dev.qwilt.com","key":"ynet-images"}

On Wed, Jan 18, 2017 at 8:01 PM, Dave Neuman <neu...@apache.org> wrote:

> The second curl would be: curl -k "
> https://admin:admin...@vault-int.nirs-tc1.tc-dev.qwilt.com:8
> 088/riak/ssl/ynet-images-latest
> "
>
> If that works from your traffic_ops host then it should also work when you
> go into the paste keys screen.
>
> Turning on Debug logging might also help. You can set log4perl.rootLogger =
> ERROR, SCREEN, FILE in traffic_ops/app/conf/production/log4perl.conf
>
> Try that out and send me what, if anything, you see in the log.
>
> Thanks,
>
> Dave
> ​
>
> On Wed, Jan 18, 2017 at 9:14 AM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Thanks Dave,
> > I am pasting the keys through the Manage SSL Keys -> Paste Existing Keys
> > screen.
> >
> > Below is the output of the curl commands:
> >
> > $ curl -k "https://admin:admin...@vault-int.nirs-tc1.tc-dev.qwilt.com:
> > 8088/buckets/ssl/keys?keys=true"
> > {"keys":["ynet-images-5","ynet-images-latest","ynet-
> > images-4","ynet-images-3"]}
> >
> > $ curl -k "https://admin:admin...@vault-int.nirs-tc1.tc-dev.qwilt.com:
> > 8088/riak/ssl/xmlid-latest"
> > not found
> >
> > Nir
> >
> > On Wed, Jan 18, 2017 at 4:56 PM, Dave Neuman <neu...@apache.org> wrote:
> >
> > > That sucks that it still doesn't work :(
> > >
> > > Let's start with the config.  You said you had to set `
> > > listener.https.internal= 0.0.0.0:8088`, we have that configured with
> the
> > > IP
> > > of the riak server, but if you can successfully make curl requests from
> > the
> > > traffic_ops server, then I guess that is ok.
> > >
> > > As for the error you are getting...that error is basically saying that
> > Riak
> > > cannot find the SSL Keys that you are looking for.
> > >
> > > Which endpoint are you using when you get that error?  Are you going
> > > through the Manage SSL Keys -> Paste Existing Keys screen?  Or are you
> > > hitting an API?
> > >
> > > You should be able to see if the keys exist by running  `curl -k
> > > "https://admin:password@riakURL:8088/buckets/ssl/keys?keys=true"` and
> > > looking for XMLID-latest in the list of keys; you could also run `curl
> -k
> > > "https://admin:password@riakURL:8088/riak/ssl/xmlid-latest"`
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Tue, Jan 17, 2017 at 1:57 PM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > > Thank you Dave:)
> > > >
> > > > Indeed I was using Riak 2.2 with TC 1.7.
> > > > I moved now to Riak 2.1.3 (same traffic ops, just replaced the
> vault).
> > > > I see the same issues. The only change is the added log messages in
> > > traffic
> > > > ops log during certificate generation:
> > > >
> > > > [2017-01-17 20:29:58,119] [ERROR] Active Server Severe Error: 404 -
> > > > vault-int.nirs-tc1.tc-dev.qwilt.com:8088 - not found
> > > >
> > > > Nir
> > > >
> > > > On Tue, Jan 17, 2017 at 6:56 PM, Dave Neuman <neu...@apache.org>
> > wrote:
> > > >
> > > > > Hey Nir,
> > > > > I think I can help here.  First of all, what version of Traffic
> > Control
> > > > are
> > > > > you running and which version of Riak are you running?  We have
> seen
> > > > issues
> > > > > using newer versions of Riak with Traffic Control 1.7 and 1.8.
> Those
> > > > > issues should be resolved in the next release.  For now we
> recommend
> > > you
> > > > > use Riak 2.1.x and not 2.2.x
> > > > >
> > > > > Once I know that we can start digging deeper.
> > > > >
> > > > > Thanks,
> > > > > Dave
> > > > >
> > > > > On Tue, Jan 17, 2017 at 9:44 AM, Nir Sopher <n...@qwilt.com>
> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to launch a traffic vault and connect it to my
> > > traffic-ops
> > > > > > server.
> > > > > > I followed the instructions in the admin guide
> > > > > > <htt

Re: Issues with using Traffic-Vault

2017-01-18 Thread Nir Sopher
Thanks Dave,
I am pasting the keys through the Manage SSL Keys -> Paste Existing Keys
screen.

Below is the output of the curl commands:

$ curl -k "https://admin:admin...@vault-int.nirs-tc1.tc-dev.qwilt.com:
8088/buckets/ssl/keys?keys=true"
{"keys":["ynet-images-5","ynet-images-latest","ynet-
images-4","ynet-images-3"]}

$ curl -k "https://admin:admin...@vault-int.nirs-tc1.tc-dev.qwilt.com:
8088/riak/ssl/xmlid-latest"
not found

Nir
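Note that in Dave's curl, `xmlid` is a placeholder for the delivery service's xml_id — the bucket listing above shows the real key is `ynet-images-latest`, which is why the literal `xmlid-latest` lookup returns "not found". A sketch of the key naming as implied by that listing:

```python
def latest_ssl_key(xml_id: str) -> str:
    # Riak appears to store the current cert for a delivery service
    # under "<xml_id>-latest" in the "ssl" bucket (see the key listing
    # above: ynet-images-5, ynet-images-latest, ...).
    return f"{xml_id}-latest"

print(latest_ssl_key("ynet-images"))
```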

On Wed, Jan 18, 2017 at 4:56 PM, Dave Neuman <neu...@apache.org> wrote:

> That sucks that it still doesn't work :(
>
> Let's start with the config.  You said you had to set `
> listener.https.internal= 0.0.0.0:8088`, we have that configured with the
> IP
> of the riak server, but if you can successfully make curl requests from the
> traffic_ops server, then I guess that is ok.
>
> As for the error you are getting...that error is basically saying that Riak
> cannot find the SSL Keys that you are looking for.
>
> Which endpoint are you using when you get that error?  Are you going
> through the Manage SSL Keys -> Paste Existing Keys screen?  Or are you
> hitting an API?
>
> You should be able to see if the keys exist by running  `curl -k
> "https://admin:password@riakURL:8088/buckets/ssl/keys?keys=true"` and
> looking for XMLID-latest in the list of keys; you could also run `curl -k
> "https://admin:password@riakURL:8088/riak/ssl/xmlid-latest"`
>
> Thanks,
> Dave
>
> On Tue, Jan 17, 2017 at 1:57 PM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Thank you Dave:)
> >
> > Indeed I was using Riak 2.2 with TC 1.7.
> > I moved now to Riak 2.1.3 (same traffic ops, just replaced the vault).
> > I see the same issues. The only change is the added log messages in
> traffic
> > ops log during certificate generation:
> >
> > [2017-01-17 20:29:58,119] [ERROR] Active Server Severe Error: 404 -
> > vault-int.nirs-tc1.tc-dev.qwilt.com:8088 - not found
> >
> > Nir
> >
> > On Tue, Jan 17, 2017 at 6:56 PM, Dave Neuman <neu...@apache.org> wrote:
> >
> > > Hey Nir,
> > > I think I can help here.  First of all, what version of Traffic Control
> > are
> > > you running and which version of Riak are you running?  We have seen
> > issues
> > > using newer versions of Riak with Traffic Control 1.7 and 1.8.  Those
> > > issues should be resolved in the next release.  For now we recommend
> you
> > > use Riak 2.1.x and not 2.2.x
> > >
> > > Once I know that we can start digging deeper.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Tue, Jan 17, 2017 at 9:44 AM, Nir Sopher <n...@qwilt.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to launch a traffic vault and connect it to my
> traffic-ops
> > > > server.
> > > > I followed the instructions in the admin guide
> > > > <http://traffic-control-cdn.net/docs/latest/admin/traffic_vault.html
> >,
> > > > installing riak 2.2.0-1
> > > > <http://s3.amazonaws.com/downloads.basho.com/riak/2.2/
> > > > 2.2.0/rhel/6/riak-2.2.0-1.el6.x86_64.rpm>
> > > > working with a self signed certificate (created via the instructions
> in
> > > > this
> > > > <http://www.akadia.com/services/ssh_test_certificate.html> link)
> > > >
> > > > I had to deviate from the document in a few places in order to
> > progress:
> > > >
> > > >- Replacing the host part in the riak listener configuration with
> > > >0.0.0.0. Using a real hostname made riak fail, e.g.
> > > > listener.https.internal
> > > >= 0.0.0.0:8088
> > > >- Setting ssl.cacertfile to point at the server.crt (as this is a
> > self
> > > >signed certificate): ssl.cacertfile = /etc/riak/certs/server.crt
> > Note
> > > >that I assume that this certificate is only used for "traffic
> vault
> > > > https"
> > > >connections.
> > > >- In traffic ops, I initially set the "tcp port" to "8098" and
> > "https
> > > >port" to "8088". When traffic ops tried to connect the vault it
> did
> > it
> > > > via
> > > >port "8098", so I changed the "tcp port" to "8088" in order for
> > https
> > > > to be
> > > >used.
> > > >
>

Re: Issues with using Traffic-Vault

2017-01-17 Thread Nir Sopher
Thank you Dave:)

Indeed I was using Riak 2.2 with TC 1.7.
I moved now to Riak 2.1.3 (same traffic ops, just replaced the vault).
I see the same issues. The only change is the added log messages in traffic
ops log during certificate generation:

[2017-01-17 20:29:58,119] [ERROR] Active Server Severe Error: 404 -
vault-int.nirs-tc1.tc-dev.qwilt.com:8088 - not found

Nir

On Tue, Jan 17, 2017 at 6:56 PM, Dave Neuman <neu...@apache.org> wrote:

> Hey Nir,
> I think I can help here.  First of all, what version of Traffic Control are
> you running and which version of Riak are you running?  We have seen issues
> using newer versions of Riak with Traffic Control 1.7 and 1.8.  Those
> issues should be resolved in the next release.  For now we recommend you
> use Riak 2.1.x and not 2.2.x
>
> Once I know that we can start digging deeper.
>
> Thanks,
> Dave
>
> On Tue, Jan 17, 2017 at 9:44 AM, Nir Sopher <n...@qwilt.com> wrote:
>
> > Hi,
> >
> > I am trying to launch a traffic vault and connect it to my traffic-ops
> > server.
> > I followed the instructions in the admin guide
> > <http://traffic-control-cdn.net/docs/latest/admin/traffic_vault.html>,
> > installing riak 2.2.0-1
> > <http://s3.amazonaws.com/downloads.basho.com/riak/2.2/
> > 2.2.0/rhel/6/riak-2.2.0-1.el6.x86_64.rpm>
> > working with a self signed certificate (created via the instructions in
> > this
> > <http://www.akadia.com/services/ssh_test_certificate.html> link)
> >
> > I had to deviate from the document in a few places in order to progress:
> >
> >- Replacing the host part in the riak listener configuration with
> >0.0.0.0. Using a real hostname made riak fail, e.g.
> > listener.https.internal
> >= 0.0.0.0:8088
> >- Setting ssl.cacertfile to point at the server.crt (as this is a self
> >signed certificate): ssl.cacertfile = /etc/riak/certs/server.crt Note
> >that I assume that this certificate is only used for "traffic vault
> > https"
> >connections.
> >- In traffic ops, I initially set the "tcp port" to "8098" and "https
> >port" to "8088". When traffic ops tried to connect the vault it did it
> > via
> >port "8098", so I changed the "tcp port" to "8088" in order for https
> > to be
> >used.
> >
> >
> > Validating the installation using curl -kvs "https://admin
> > :password@riakserver:8088/search/query/sslkeys?wt=json&q=cdn:mycdn"
> > Produced the below output:
> > < HTTP/1.1 200 OK
> > < Server: MochiWeb/1.1 WebMachine/1.10.9 (cafe not found)
> > < Date: Wed, 11 Jan 2017 12:26:07 GMT
> > < Content-Type: application/json; charset=UTF-8
> > < Content-Length: 571
> > <
> > {"responseHeader":{"status":0,"QTime":176,"params":{"shards":"
> > vault-int.nirs-tc1.tc-dev.qwilt.com:8093/internal_solr/sslkeys
> > ","q":"cdn:nirs-tc1-cdn","wt":"json","
> > vault-int.nirs-tc1.tc-dev.qwilt.com:8093":"(_yz_pn:62 AND (_yz_fpn:62))
> OR
> > _yz_pn:61 OR _yz_pn:58 OR _yz_pn:55 OR _yz_pn:52 OR _yz_pn:49 OR
> _yz_pn:46
> > OR _yz_pn:43 OR _yz_pn:40 OR _yz_pn:37 OR _yz_pn:34 OR _yz_pn:31 OR
> > _yz_pn:28 OR _yz_pn:25 OR _yz_pn:22 OR _yz_pn:19 OR _yz_pn:16 OR
> _yz_pn:13
> > OR _yz_pn:10 OR _yz_pn:7 OR _yz_pn:4 OR _yz_pn:1"}},"response":{"numFo
> > und":0,"start":0,"maxScore":0.0,"docs":[]}}
> > * Connection #0 to host vault-int.nirs-tc1.tc-dev.qwilt.com left intact
> > * Closing connection #
> >
> > However, when I created a delivery-service and tried to "generate" a
> > certificate via traffic-ops, I got the below message:
> > SSL keys for  could not be created.  Response was: Error creating key
> > and csr. Result is -1
> > No log message was found in the traffic_ops log or in the riak log to
> > explain the issue.
> >
> > When pasting a certificate (self signed, including the "" headers and
> > footers), the operation succeeded. However, when the traffic servers tried
> to
> > pull this configuration, I got the below message:
> > ERROR result for
> > http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/name/
> > nirs-tc1-cdn/sslkeys.json
> > is: ...{"message":"No SSL certificates found for nirs-tc1-cdn"}...
> > FATAL
> > http://ops.nirs-tc1.tc-dev.qwilt.com/api/1.2/cdns/name/
> > nirs-tc1-cdn/sslkeys.json
> > returned HTTP 404!
> >
> > Any idea what may cause these issues?
> > Any experience in debugging similar issues?
> >
> > Thanks,
> > Nir
> >
>
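For reference, the Solr response quoted above is easy to check programmatically; the field that matters is `response.numFound` — 0 means the query reached the index (HTTP 200) but no sslkeys documents exist yet for that CDN. A sketch, with the response trimmed to the inspected fields:

```python
import json

# Trimmed version of the Solr response shown in the message above;
# only the fields inspected here are kept.
raw = ('{"responseHeader": {"status": 0}, '
       '"response": {"numFound": 0, "start": 0, "docs": []}}')

resp = json.loads(raw)
num_found = resp["response"]["numFound"]
# 0 documents: the index answers, but holds no sslkeys for this CDN yet.
print(num_found)
```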


Re: Question regarding TC integration

2017-01-12 Thread Nir Sopher
Thank you Eric,
Indeed, it seems that these 2 fields are derived from the following astats
fields: "system.proc.loadavg", "system.proc.net.dev",
"system.inf.speed".
Nir
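A minimal sketch of what an astats-style document carrying just these fields might look like — the field names come from the messages in this thread, but the exact shape and values are assumptions, not the authoritative astats schema:

```python
import json

# Hypothetical minimal astats-like payload for a non-ATS cache.
# Field names are taken from this thread; the structure is assumed.
minimal_astats = {
    "system": {
        "inf.speed": 10000,  # interface speed (assumed Mbps)
        "proc.loadavg": "0.10 0.05 0.01 1/200 12345",
        "proc.net.dev": ("eth0: 123456 789 0 0 0 0 0 0 "
                         "654321 987 0 0 0 0 0 0"),
    },
}

# The three fields the thread converges on, serialized for inspection.
print(json.dumps(sorted(minimal_astats["system"])))
```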



On Thu, Jan 12, 2017 at 2:49 PM, Eric Friedrich (efriedri) <
efrie...@cisco.com> wrote:

> Hey Nir, Adan-
>   I don’t think #2 configuration pull is actually required. The cache must
> be configured in Traffic Ops, so Traffic Monitor can learn its hostname and
> capacity. Other than that it should just be meeting the astats criteria.
>
> I think you might need interface speed and actual used bandwidth in the
> astats (rather than available bandwidth in Kbps). I can’t remember if the
> availableBandwidth is computed in astats or in TM.
>
> —Eric
>
>
> > On Jan 12, 2017, at 4:20 AM, Nir Sopher <n...@qwilt.com> wrote:
> >
> > Hello,
> >
> >
> >
> > One of our team members (Adan Alper, CCed) is currently trying to
> connect a
> > non ATS cache to Traffic Control.
> >
> > In order to be able to do this, we tried to figure out what is the bare
> > minimum information that needs to be sent by the cache to TC in order for
> > it to consider the cache as active.
> >
> > Currently we found the following:
> >
> > 1.   The monitoring file (astats) is read by the traffic monitor once
> > every few seconds. In this file we must provide the following data:
> >
> > a.   availableBandwidthInKbps
> >
> > b.  loadavg
> >
> > 2.   Configuration pull – The cache should pull configuration from its
> > queue in Traffic Ops once every 15 minutes and is supposed to signal
> > Traffic Ops once it has been read and applied.
> >
> >
> >
> > Our questions are:
> >
> > 1.   Are these the only pieces of information (the bare minimum) that
> > we need to provide, or are there any more?
> >
> > 2.   Regarding the configuration-read signaling: what exactly does the
> > cache need to send to Traffic Ops for it to acknowledge that the
> > configuration was read and applied? Is this a must for basic
> > functionality?
> >
> >
> >
> > Thanks,
> >
> > Nir
>
>