Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-10-03 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------------------+----------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  closed
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:  fixed
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+----------------------
Changes (by irl):

 * status:  accepted => closed
 * resolution:   => fixed


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-05-18 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------------------+----------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  accepted
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+----------------------

Comment (by irl):

 I had a go at deploying this on AWS. I had heard about the new Lightsail
 service and deployed it there. This was a mistake and ended up being a
 waste of time, for the reasons given below.

 Instead, this will need to run on EC2 (which Lightsail is based on anyway)
 so that we have better control over the firewall (ICMP is blocked on
 Lightsail) and can use the instance metadata service for AWS credentials
 (which will allow us to use SNS for alerting).
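
 A minimal sketch of how a Nagios notification could be handed to SNS,
 assuming Python and a boto3-style client. The function names, the JSON
 message layout and the topic ARN are illustrative assumptions, not the
 deployed code. The client is passed in so that the message-building logic
 can be exercised without AWS access.

```python
import json
from typing import Any, Dict, Mapping


def build_alert(host: str, service: str, state: str, output: str) -> Dict[str, str]:
    """Build the Subject/Message arguments for one alert (hypothetical layout)."""
    subject = f"{state}: {service} on {host}"
    return {
        # SNS subjects are limited to 100 characters, so truncate defensively.
        "Subject": subject[:100],
        "Message": json.dumps(
            {"host": host, "service": service, "state": state, "output": output},
            sort_keys=True,
        ),
    }


def send_alert(client: Any, topic_arn: str, **alert: str) -> Mapping[str, Any]:
    """Publish one alert through any object exposing a boto3-style publish()."""
    return client.publish(TopicArn=topic_arn, **build_alert(**alert))
```

 On an EC2 instance with an IAM role attached, `boto3.client("sns",
 region_name=...)` should pick up its credentials from the instance
 metadata service without any keys stored on disk, and could be passed in
 as `client`.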

 The good news is that the Ansible playbook works well for deploying the
 software and configuration.

 As this is going to involve an EC2 instance, a couple of SNS topics, an
 IAM role and some glue, I would like to see if I can put together a
 CloudFormation template for it, so that we don't have AWS resources left
 scattered, forgotten and still billed for when we change this in the
 future.
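
 As a rough illustration of what such a template might contain, the sketch
 below builds a CloudFormation document as a Python dictionary. All logical
 IDs (`MonitorRole`, `MonitorProfile`, `MonitorInstance`), the topic names,
 the `ImageId` parameter and the instance type are assumptions made for the
 example; the instance profile stands in for the "glue" between the role
 and the instance.

```python
import json


def build_template(topic_names=("Critical", "Warning")) -> dict:
    """Return a CloudFormation template dict (hypothetical resource names)."""
    topics = {f"{name}Topic": {"Type": "AWS::SNS::Topic"} for name in topic_names}
    # Ref on an SNS topic resolves to its ARN, which is what IAM needs.
    topic_refs = [{"Ref": logical_id} for logical_id in topics]
    resources = {
        **topics,
        "MonitorRole": {
            "Type": "AWS::IAM::Role",
            "Properties": {
                "AssumeRolePolicyDocument": {
                    "Version": "2012-10-17",
                    "Statement": [{
                        "Effect": "Allow",
                        "Principal": {"Service": "ec2.amazonaws.com"},
                        "Action": "sts:AssumeRole",
                    }],
                },
                "Policies": [{
                    "PolicyName": "PublishAlerts",
                    "PolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [{
                            "Effect": "Allow",
                            "Action": "sns:Publish",
                            "Resource": topic_refs,
                        }],
                    },
                }],
            },
        },
        "MonitorProfile": {
            "Type": "AWS::IAM::InstanceProfile",
            "Properties": {"Roles": [{"Ref": "MonitorRole"}]},
        },
        "MonitorInstance": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Ref": "ImageId"},
                "InstanceType": "t3.micro",  # assumed size, adjust as needed
                "IamInstanceProfile": {"Ref": "MonitorProfile"},
            },
        },
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Monitoring host with SNS alert topics (sketch)",
        "Parameters": {"ImageId": {"Type": "AWS::EC2::Image::Id"}},
        "Resources": resources,
    }
```

 Writing `json.dumps(build_template(), indent=2)` to a file would give a
 template that can be deployed as one stack, so deleting the stack removes
 every resource it created.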


Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-05-06 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------------------+----------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  accepted
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+----------------------

Comment (by irl):

 [[Image(Screen Shot 2019-05-06 at 09.35.48.png)]]


Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-05-06 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------------------+----------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  accepted
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+----------------------
Changes (by irl):

 * Attachment "Screen Shot 2019-05-06 at 09.35.48.png" added.



Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-05-06 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------------------+----------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  project                  |         Status:  accepted
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  10
 Reviewer:                           |        Sponsor:
-------------------------------------+----------------------
Changes (by irl):

 * owner:  metrics-team => irl
 * status:  new => accepted
 * points:  5 => 10


Comment:

 Status update:

 * I think we're going to end up running our own Nagios instance, which is
 OK if it helps us move forward here.
 * I've got a testing environment running in Vagrant+Ansible and am now
 looking at adding checks.
 * I'm using bushel's library code to fetch and parse Tor-specific
 documents.
 * I'm going to create a new repo, "tor-metrics-nagios-checks", that builds
 a Debian package with all the checks in it.
 * I'm going to continue expanding the fetching and parsing logic in
 bushel so that it's reusable elsewhere.
 * Once I've worked out secret handling in Ansible, we can also publish the
 git repo that stands up the testing environment.
 * bushel will need a Debian package if we plan to deploy on a TPA machine.
 I'm thinking, though, that we could instead deploy to an AWS/GCP/Azure VM
 (I have yet to decide which of these I like best; we might want to do more
 cloud-native things in the future).

 Current tests:

 * Check that the latest index generated on CollecTor is reasonably recent.
 * Check that the latest documents published on CollecTor are reasonably
 recent.
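
 The first of these checks could look roughly like the sketch below. It
 assumes that the CollecTor index is a JSON document with an
 `index_created` field in `YYYY-MM-DD HH:MM` format (UTC); the thresholds
 of one and three hours and the function name are made up for the example
 and are not the values used by the real checks.

```python
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_index_age(index: dict, now: datetime,
                    warn: timedelta = timedelta(hours=1),
                    crit: timedelta = timedelta(hours=3)) -> tuple:
    """Return (exit_code, status_line) for the age of a parsed index.

    `now` must be timezone-aware; the index timestamp is assumed to be UTC.
    """
    try:
        created = datetime.strptime(index["index_created"], "%Y-%m-%d %H:%M")
    except (KeyError, TypeError, ValueError) as exc:
        return UNKNOWN, f"UNKNOWN - cannot read index_created: {exc!r}"
    age = now - created.replace(tzinfo=timezone.utc)
    minutes = int(age.total_seconds() // 60)
    if age >= crit:
        return CRITICAL, f"CRITICAL - index is {minutes} minutes old"
    if age >= warn:
        return WARNING, f"WARNING - index is {minutes} minutes old"
    return OK, f"OK - index is {minutes} minutes old"
```

 A wrapper script would fetch and decode the index, call
 `check_index_age(index, datetime.now(timezone.utc))`, print the status
 line and exit with the returned code. The second check could apply the
 same comparison to the newest per-file timestamp instead.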

 I'm increasing the points on this task to 10, as I think that is roughly
 the amount of time needed to get something working and useful. I'll
 remove the points from this ticket once we have child tickets in place,
 each with specific points. Maybe this estimate will go up, maybe down.


Re: [tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2019-02-11 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
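-------------------------------------+-------------------------------
 Reporter:  karsten                  |          Owner:  metrics-team
     Type:  project                  |         Status:  new
 Priority:  High                     |      Milestone:
Component:  Metrics                  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  5
 Reviewer:                           |        Sponsor:
-------------------------------------+-------------------------------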
Changes (by irl):

 * keywords:   => metrics-roadmap-2019-q2
 * priority:  Medium => High
 * points:   => 5
 * type:  task => project



[tor-bugs] #28322 [Metrics]: Deploy better notification system for operational issues

2018-11-05 Thread Tor Bug Tracker & Wiki
#28322: Deploy better notification system for operational issues
-------------------------+---------------------------
     Reporter:  karsten  |      Owner:  metrics-team
         Type:  task     |     Status:  new
     Priority:  Medium   |  Milestone:
    Component:  Metrics  |    Version:
     Severity:  Normal   |   Keywords:
Actual Points:           |  Parent ID:
       Points:           |   Reviewer:
      Sponsor:           |
-------------------------+---------------------------
 We have been using Nagios to monitor Onionoo for a few years now, and we
 recently extended existing Nagios checks (#28242) and added new ones
 (#28271).

 We should consider adding even more checks. One year ago we
 [https://lists.torproject.org/pipermail/metrics-team/2017-November/000523.html discussed what those checks could be],
 and it seems like this list could still serve as a starting point for
 adding new checks now.
