Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Stas Malyshev
Hi!

> I'm late to the game, but a quick look into the nginx logs does not show all that much. I see a few connection refused, but that should translate into an HTTP 502 error, not into a partial answer.
>
> I'm really not good at reading VCL, but it seems that we do have some rules in our

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Guillaume Lederrey
I'm late to the game, but a quick look into the nginx logs does not show all that much. I see a few connection refused, but that should translate into an HTTP 502 error, not into a partial answer. I'm really not good at reading VCL, but it seems that we do have some rules in our Varnish config to

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Addshore
Yes, the size reported there will be the compressed size, so the actual bytes sent over the port! Looking at the patch further, it looks like some nginx settings were changed while caching was enabled, and those may also be worth looking at. On 19 April 2016 at 10:42, Markus Krötzsch

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Markus Krötzsch
On 19.04.2016 11:33, Addshore wrote: Also per https://phabricator.wikimedia.org/T126730 and https://gerrit.wikimedia.org/r/#/c/274864/8 requests to the query service are now cached for 60 seconds. I expect this will include error results from timeouts so retrying a request within the same 60

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Markus Krötzsch
On 19.04.2016 11:05, Addshore wrote: In the case we are discussing here the truncated JSON is caused by Blazegraph deciding it has been sending data for too long and then stopping (as I understand). Thus you will only see a spike on the graph for the amount of data actually sent from the

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Addshore
Also per https://phabricator.wikimedia.org/T126730 and https://gerrit.wikimedia.org/r/#/c/274864/8 requests to the query service are now cached for 60 seconds. I expect this will include error results from timeouts, so retrying a request within the same 60 seconds as the first won't even reach the
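To make the retry point concrete, here is a minimal sketch of a client-side helper that works out how long to wait before a retry could get past the cache. The 60-second figure is taken from the message above; the function name and the idea of tracking the first attempt's timestamp are assumptions for illustration, not anything the query service provides.

```python
# Cache lifetime as described in the message; the real Varnish setting may differ.
CACHE_TTL_SECONDS = 60.0


def seconds_until_retry(first_attempt: float, now: float,
                        ttl: float = CACHE_TTL_SECONDS) -> float:
    """Return how many seconds to wait before a retry is no longer served from cache.

    Both timestamps are seconds on the same clock (for example time.time()).
    """
    remaining = first_attempt + ttl - now
    # Never return a negative wait: once the TTL has passed, retry immediately.
    return max(0.0, remaining)
```

A caller would sleep for the returned number of seconds before sending the same query again, so that a cached error result is not simply replayed.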

Re: [Wikidata] SPARQL service timeouts

2016-04-19 Thread Addshore
In the case we are discussing here the truncated JSON is caused by Blazegraph deciding it has been sending data for too long and then stopping (as I understand). Thus you will only see a spike on the graph for the amount of data actually sent from the server, not the size of the result Blazegraph
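Since a cut-off response can arrive without an error status, a client may want to check the body itself. The following is a rough sketch, assuming the response uses the standard SPARQL JSON results layout (`results` containing `bindings`); the function name is hypothetical.

```python
import json


def parse_sparql_json(body: str):
    """Return the list of result bindings, or None if the body is truncated or malformed."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        # A response cut off mid-stream is not valid JSON.
        return None
    try:
        return doc["results"]["bindings"]
    except (KeyError, TypeError):
        # Valid JSON, but not shaped like a SPARQL SELECT result.
        return None
```

Treating `None` as "retry later" rather than as an empty result keeps a truncated answer from being mistaken for a complete one.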

Re: [Wikidata] SPARQL service timeouts

2016-04-18 Thread Markus Kroetzsch
On 18.04.2016 22:21, Markus Kroetzsch wrote: On 18.04.2016 21:56, Markus Kroetzsch wrote: Thanks, the dashboard is interesting. I am trying to run this query: SELECT ?subC ?supC WHERE { ?subC p:P279/ps:P279 ?supC } It is supposed to return a large result set. But I am only running it once

Re: [Wikidata] SPARQL service timeouts

2016-04-18 Thread Markus Kroetzsch
On 18.04.2016 21:56, Markus Kroetzsch wrote: Thanks, the dashboard is interesting. I am trying to run this query: SELECT ?subC ?supC WHERE { ?subC p:P279/ps:P279 ?supC } It is supposed to return a large result set. But I am only running it once per week. It used to work fine, but today I
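For anyone wanting to reproduce this outside the web UI, here is a small sketch that builds a GET request URL for the query quoted above. The endpoint URL and the `format=json` parameter are assumptions that are not stated in the thread, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed public endpoint; the thread itself does not name one.
ENDPOINT = "https://query.wikidata.org/sparql"

# The query as quoted in the message above.
QUERY = "SELECT ?subC ?supC WHERE { ?subC p:P279/ps:P279 ?supC }"


def build_query_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL with the SPARQL query percent-encoded as a parameter."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})
```

The resulting URL can be passed to any HTTP client; nothing here sends a request.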

Re: [Wikidata] SPARQL service timeouts

2016-04-18 Thread Info WorldUniversity
Hi Markus and all, in what ways can we best scale and develop Wikidata analytics to anticipate heavy future SPARQL use, as well as querying in all ~300 Wikipedia languages and possibly in all ~8000 languages and more? Are there even new Wikidata job opportunities here? Scott On Apr 18, 2016 12:31 PM, "Markus

Re: [Wikidata] SPARQL service timeouts

2016-04-18 Thread Stas Malyshev
Hi!

> I have the impression that some not-so-easy SPARQL queries that used to run just below the timeout are now timing out regularly. Has there been a change in the setup that may have caused this, or are we maybe seeing increased query traffic [1]?

We've recently run on a single server

[Wikidata] SPARQL service timeouts

2016-04-18 Thread Markus Kroetzsch
Hi, I have the impression that some not-so-easy SPARQL queries that used to run just below the timeout are now timing out regularly. Has there been a change in the setup that may have caused this, or are we maybe seeing increased query traffic [1]? Cheers, Markus [1] The deadline for the