akosiaris added subscribers: Tarrow, thcipriani.
akosiaris added a comment.
Per some IRC discussions we had in #wikimedia-serviceops, the code should be
updated to be service-runner compatible, as this will greatly increase
homogeneity and allow for easy handling of things like logging and metrics, as
well as potentially rate limiting and DNS cache management. As far as I
understand, @Tarrow is already working on that (many thanks!). Following that,
we should enable the pipeline for the project so that it builds docker images
for this service. The first part is easy: we just need a
`.pipeline/blubber.yaml` and to enable the pipeline. Adding @thcipriani for
that. Docs are currently under https://wikitech.wikimedia.org/wiki/Blubber. I
can help with the next step, which is the creation of a helm chart for the
service. After that (and assuming all other prereqs are done), it's time for
deployment.
There are also a number of questions to answer, independent of the technical
points above:
- We will need contact details in case the service suffers an outage
- We will need a person/team to be the service-owner (that can be the same as
above)
- The service owner will have to state the required availability of
this service (no, it can't be 100%).
- In order to answer that question in a structured way, another question
needs to be answered: "What will be the SLO(s) of this service?" (SLO
stands for Service Level Objective). That in turn implies another question (I
promise it's the last in this stack): "What are the SLIs for this
service?" (SLI stands for Service Level Indicator, aka a metric). Assuming a
service-runner integration, we will easily be able to have metrics (and graphs)
for requests/sec, latency, and errors. Any of these (or all of them, plus
whatever else is deemed important to measure) can be chosen as SLIs, and a
target (aka an SLO) can be set on each. For a better explanation of the terms
SLI and SLO, for now please have a look at
https://landing.google.com/sre/sre-book/chapters/service-level-objectives/, as
we are still building the documentation for all of this.
- An estimate of the traffic the service is expected to receive: already
given, it's ~1 req/s
- A schedule for when we would like to have this deployed to production, as
SRE will have to reserve some cycles for this.
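
To make the SLI/SLO relationship above a bit more concrete, here is a rough sketch of how an availability SLI and the remaining error budget could be computed from request and error counts. Everything in it is an assumption for illustration only: the 30-day window, the 99.9% target, the failure count, and the names `Window`, `availability_sli` and `error_budget` are hypothetical and not part of service-runner or any existing tooling. Only the ~1 req/s figure comes from the estimate above.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts observed over one measurement window (hypothetical type)."""
    total_requests: int
    failed_requests: int


def availability_sli(window: Window) -> float:
    """SLI: fraction of requests in the window that succeeded."""
    if window.total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return 1 - window.failed_requests / window.total_requests


def error_budget(window: Window, slo_target: float) -> int:
    """Failures the SLO still tolerates in this window (negative = SLO missed)."""
    allowed_failures = round(window.total_requests * (1 - slo_target))
    return allowed_failures - window.failed_requests


# ~1 req/s over an assumed 30-day window; the failure count is made up.
month = Window(total_requests=30 * 24 * 3600, failed_requests=1500)

sli = availability_sli(month)
remaining = error_budget(month, slo_target=0.999)  # assumed 99.9% target
print(f"availability SLI: {sli:.5f}, failures left in budget: {remaining}")
```

With a target of 1.0 the allowed number of failures is zero, so a single failed request already misses the SLO, which is one way to see why the required availability can't be 100%.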
TASK DETAIL
https://phabricator.wikimedia.org/T212189