Addshore added a comment.
So the spike reported again above occurred before https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseQualityConstraints/+/464812/ was merged and deployed.
WBQC now respects 429 responses (and their Retry-After header) and will throttle based on them.
I added a new panel to https://grafana.wikimed
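The throttling behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual WBQC implementation; the function name and the fixed fallback value are assumptions:

```python
def retry_after_seconds(status_code, headers):
    """Return how long a client should back off after a response.

    Sketch of the described behavior: when the server answers 429,
    read the Retry-After header (in seconds) and throttle for that
    long; any other status means no throttling is needed.
    """
    if status_code != 429:
        return 0
    value = headers.get("Retry-After", "")
    try:
        return max(0, int(value))
    except ValueError:
        # Retry-After may also be an HTTP date (RFC 7231); a real
        # client would parse it, this sketch just uses a default.
        return 60
```

A caller would sleep for the returned number of seconds before retrying the SPARQL request.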
gerritbot added a comment.
Change 483112 abandoned by Addshore:
Split SPARQL pre and post throttling metrics
Reason:
This can probably be done with the metrics we already have..
https://gerrit.wikimedia.org/r/483112
gerritbot added a comment.
Change 483112 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/WikibaseQualityConstraints@master] Split SPARQL pre and post throttling metrics
https://gerrit.wikimedia.org/r/483112
Addshore added a comment.
In T204267#4652483, @Smalyshev wrote:
Happened again, bumping the priority.
The spike can be seen here
https://grafana.wikimedia.org/d/00344/wikidata-quality?orgId=1&from=1539267066438&to=1539596899898
It doesn't look like there was a big spike in hits to the API.
Smalyshev added a comment.
Also, 1,182,961 events is a lot. What's going on there? Why so many? Is it a legitimate scenario? I also wonder: if most of it is regex matching, shouldn't we make some service to do just that? Using a full-blown SPARQL database to do regex matching is kinda like hammering in
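For comparison, a format-constraint check like REGEX("533892", "^(?:[1-9][0-9]+|)$") could be evaluated locally without a SPARQL round trip. A sketch in Python (note that SPARQL's REGEX uses XPath regular expression syntax, which differs from Python's re module in some corner cases, so this is only an approximation):

```python
import re

def matches_format(value: str, pattern: str) -> bool:
    """Evaluate a format-constraint regex locally instead of via SPARQL.

    Approximation only: SPARQL REGEX follows XPath/XQuery regex
    semantics, which are not identical to Python's re in all cases.
    """
    return re.search(pattern, value) is not None
```

For the example above, matches_format("533892", "^(?:[1-9][0-9]+|)$") is True, while a value with a leading zero such as "0123" is rejected.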
Smalyshev added a comment.
Is there a reason that all mediawiki hosts show as "localhost"?
This is probably coming from Jetty, which takes it from the connection info. Since we have nginx in front of Blazegraph and no X-Client-IP header is supplied, it probably has no way to discover the originating host.
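A possible fix, sketched as an nginx snippet (untested; the location and upstream name are assumptions, only the X-Client-IP header name comes from the comment above), would be to have nginx forward the real client address to Jetty:

```nginx
location /sparql {
    # Pass the real client address through to Jetty so request logs
    # no longer show every MediaWiki host as "localhost".
    proxy_set_header X-Client-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://blazegraph_backend;
}
```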
Addshore added a comment.
It looks like there was another little flood on the 1st of October with requests being banned again:
https://logstash.wikimedia.org/goto/77f3d01f6e7eaf56e5436727b5643ba2
Is there a reason that all mediawiki hosts show as "localhost"?
Addshore added a comment.
In T204267#4584397, @Smalyshev wrote:
All bans are temporary, so as soon as traffic returns to normal the bans will expire. It would be nice if there was a way for wbqc to respect the 429 throttling header, which would avoid bans.
That sounds like a great idea that could
Smalyshev added a comment.
All bans are temporary, so as soon as traffic returns to normal the bans will expire. It would be nice if there was a way for wbqc to respect the 429 throttling header, which would avoid bans.
Tpt added a comment.
@Jonas Thank you for your feedback.
is it necessary for your tool to run the constraint checks in parallel?
No, I am going to switch to a sequential processing. Thanks for the idea!
Using WDQS instead would be a good idea, because then only your tool would get throttled.
It
Jonas added a comment.
@Tpt is it necessary for your tool to run the constraint checks in parallel?
Using WDQS instead would be a good idea, because then only your tool would get throttled.
The problem there is that just a fraction of all violations are in there at the moment.
Tpt added a comment.
Sorry everyone for the troubles. I was experimenting with a tool that tries to find corrections for constraint violations.
I have modified it to send a proper User-Agent for all its requests to the Wikidata API, but had not restarted it.
@Wikidata team: what would be an ok request
Stashbot added a comment.
Mentioned in SAL (#wikimedia-cloud) [2018-09-14T11:22:34Z] T204267 stop the corhist tool (k8s) because is hammering the wikidata API
aborrero added a comment.
In T204267#4583629, @Pintoch wrote:
@aborrero thanks for the ping. I do not recognize the shape of the queries as coming from this tool though. The openrefine-wikidata tool should do relatively few SPARQL queries, whose results are cached in redis. How did you determine that this tool is the source of the problem?
Pintoch added a comment.
@aborrero thanks for the ping. I do not recognize the shape of the queries as coming from this tool though. The openrefine-wikidata tool should do relatively few SPARQL queries, whose results are cached in redis. How did you determine that this tool is the source of the problem?
Stashbot added a comment.
Mentioned in SAL (#wikimedia-cloud) [2018-09-14T10:51:35Z] T204267 stop the openrefine-wikidata tool (k8s) because is hammering the wikidata API
Addshore added a comment.
Looks like it is tools-worker-1021.tools.eqiad.wmflabs
Jonas added a comment.
Seems the #wikibase-quality-constraints extension is the source:
https://grafana.wikimedia.org/dashboard/db/wikidata-quality?refresh=10s&orgId=1&from=now-30d&to=now
https://github.com/wikimedia/mediawiki-extensions-WikibaseQualityConstraints/blob/c3687c1626323134e546ba924f3
Smalyshev added a comment.
Kibana log for banned requests.
Example request:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX prv: <http://www.wikidata.org/prop/reference/value/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wikibase-beta: <http://wikiba.se/ontology-beta#>
SELECT (REGEX("533892", "^(?:[1-9][0-9]+|)$") AS ?matches) {}