[Wikidata-bugs] [Maniphest] [Changed Subscribers] T212189: New Service Request: Wikidata Termbox SSR

2019-03-07 Thread WMDE-leszek
WMDE-leszek added a subscriber: sbassett.
WMDE-leszek added a comment.


  @akosiaris thanks for listing the information needed by SRE. This is very 
helpful.
  Before I add these answers to the task description, we'd appreciate you 
having a look at them (especially the SLOs) and advising us in case we are 
somehow off. In particular, given that the service is going to be accessed via 
MediaWiki, its availability depends on the availability of MW. We don't know 
what the uptime of Wikipedia is, so please tell us if we're going overboard here.
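
  To illustrate the dependency on MW: a minimal sketch, assuming that a request only succeeds when both MediaWiki and the service are up and that their failures are independent. The MediaWiki figure below is a made-up placeholder, since the real uptime is not known in this thread.

```python
def serial_availability(*availabilities: float) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


# Hypothetical MediaWiki availability (placeholder, not a measured value).
mediawiki = 0.9995
# Proposed availability target for the service (99.9%).
termbox_ssr = 0.999

# The combined figure is lower than either component on its own.
print(f"{serial_availability(mediawiki, termbox_ssr):.4%}")
```

  Under these assumptions the availability seen through MediaWiki can never exceed that of MediaWiki itself, which is why the MW uptime matters when picking the target.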
  
  Contact details in case the service suffers an outage: Wikimedia Deutschland, 
leszek_wmde at freenode, @WMDE-leszek on phabricator
  
  Person/team to be the service-owner: Wikimedia Deutschland
  
  SLOs of this service
  
  - 0.5 < request latency (seconds) < 1.5
  - error rate (1/n) < 1/1000
  - system throughput (1/second) < 10
  - availability (% of time) > 99.9
  
  SLIs for this service:
  
  - request latency (seconds)
  - error rate (1/n)
  - system throughput (1/second)
  - availability (% of time)
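
  As a rough illustration of how the SLIs relate to the SLO targets, here is a minimal sketch. The `Request` record, the 99th percentile for latency, and the function names are assumptions made for this example and not part of the service. It only checks the upper latency bound (1.5 s, i.e. 1500 ms).

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    latency_ms: float  # time taken to serve the request, in milliseconds
    ok: bool           # False if the request ended in an error


def error_rate(requests: List[Request]) -> float:
    """Fraction of requests that failed."""
    return sum(1 for r in requests if not r.ok) / len(requests)


def throughput(requests: List[Request], window_seconds: float) -> float:
    """Requests per second over the observation window."""
    return len(requests) / window_seconds


def latency_percentile(requests: List[Request], pct: int) -> float:
    """Nearest-rank percentile of request latency, in milliseconds."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_slos(requests: List[Request], window_seconds: float) -> Dict[str, bool]:
    """Compare measured SLIs against the proposed targets."""
    return {
        # Assumption: the latency target applies to the 99th percentile.
        "latency": latency_percentile(requests, 99) < 1500,
        "error_rate": error_rate(requests) < 1 / 1000,
        "throughput": throughput(requests, window_seconds) < 10,
    }


# 2000 requests over 2000 seconds (~1 req/s), one of which failed.
sample = [Request(latency_ms=600, ok=True) for _ in range(1999)]
sample.append(Request(latency_ms=600, ok=False))
print(check_slos(sample, window_seconds=2000))
```

  In production these numbers would come from the metrics backend rather than from an in-memory list; the point is only that each SLO is a threshold on one SLI.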
  
  An estimation of the traffic the service is expected to receive: ~1 req/s
  
  A schedule for when we would like to have this deployed to production:
  
  - As soon as feasible, preferably 2019-03-31 at the latest.
  - Note that a security review of the service code is pending 
(https://phabricator.wikimedia.org/T216419). It is being performed by 
@sbassett, who could report on its status if needed. We don’t know whether the 
service can be deployed but kept inactive/unused until the security review is 
done.

TASK DETAIL
  https://phabricator.wikimedia.org/T212189

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: WMDE-leszek
Cc: sbassett, thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, 
Krinkle, Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, 
Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, 
alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, 
Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, 
QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, 
Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Changed Subscribers] T212189: New Service Request: Wikidata Termbox SSR

2019-03-04 Thread akosiaris
akosiaris added subscribers: Tarrow, thcipriani.
akosiaris added a comment.


  Per some IRC discussions we had in #wikimedia-serviceops, the code should be 
updated to be service-runner compatible, as this will greatly increase 
homogeneity and allow for easy handling of things like logging and metrics, as 
well as potentially rate limiting and DNS cache management. As far as I have 
understood, @Tarrow is already working on that (many thanks!). Following that, 
we should enable the pipeline for the project so that it builds docker images 
for this service. The first part is easy: we will just need a 
`.pipeline/blubber.yaml` and to enable the pipeline. Adding @thcipriani for 
that. Docs are currently under https://wikitech.wikimedia.org/wiki/Blubber. I 
can help with the next step, which is the creation of a helm chart for the 
service. After that (and assuming all other prereqs are done), it's time for 
deployment.
  
  There are also a number of questions to answer, independent of the 
technical points above:
  
  - We will need contact details in case the service suffers an outage
  - We will need a person/team to be the service-owner (that can be the same as 
above)
  - The service owner will have to state what the required availability of 
this service will be (no, it can't be 100%).
    - In order to answer that question in a structured way, another question 
needs to be answered: "What will be the SLO(s) of this service?" (SLO 
stands for Service Level Objective). That in turn implies another question (I 
promise it's the last in this stack): "What are the SLIs for this 
service?" (SLI stands for Service Level Indicator, aka a metric). Assuming a 
service-runner integration, we will easily be able to have metrics (and graphs) 
for requests/sec, latency, and errors. Any of these (or all of them, plus 
whatever else is deemed important to measure) can be chosen as SLIs, and a 
target (aka an SLO) can be set on each. For a better explanation of the terms 
SLI and SLO, for now please have a look at 
https://landing.google.com/sre/sre-book/chapters/service-level-objectives/, as 
we are still building the documentation for all of this.
  - An estimation of the traffic the service is expected to receive: Already 
given, it's ~1 req/s
  - A schedule for when we would like to have this deployed to production as 
SRE will have to reserve some cycles for this.
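
  To make the availability question more concrete, here is a minimal sketch that turns an availability target into an allowed-downtime budget. The function name and the 30-day period are assumptions chosen for illustration.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Maximum downtime allowed within `period` for a given availability target."""
    return period * (1 - availability_pct / 100)


# Assumed evaluation period of 30 days.
month = timedelta(days=30)
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {downtime_budget(target, month)} of downtime allowed")
```

  A 100% target leaves a budget of zero, so any single failed deploy or restart would already violate it; that is the practical reason the target can't be 100%.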

To: akosiaris
Cc: thcipriani, Tarrow, Smalyshev, jijiki, fsero, CDanis, akosiaris, Krinkle, 
Milimetric, daniel, mobrovac, Joe, Matthias_Geisler_WMDE, Jakob_WMDE, 
Pablo-WMDE, Aklapper, Lydia_Pintscher, Lea_WMDE, Addshore, WMDE-leszek, 
alaa_wmde, holger.knust, Legado_Shulgin, Nandana, thifranc, AndyTan, 
Davinaclare77, Qtn1293, Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, 
QZanden, LawExplorer, Zppix, _jensen, rosalieper, Wong128hk, Eevans, Hardikj, 
Wikidata-bugs, aude, faidon, Jdforrester-WMF, Mbch331, Jay8g, fgiunchedi