[Wikidata-bugs] [Maniphest] T283962: Bug: Timeout in Wikidata Query Service #WQDS breaks the JSON response syntax by injecting a raw text Error message with a Java stack dump

2021-06-07 Thread Gehel
Gehel added subscribers: LucasWerkmeister, Gehel.
Gehel closed this task as "Declined".
Gehel added a comment.


  As @LucasWerkmeister said, this is complicated to implement. This would also 
have to be implemented upstream (by Blazegraph) and is unlikely to happen. I'm 
closing this as it is unreasonable to expect a resolution by anyone.

TASK DETAIL
  https://phabricator.wikimedia.org/T283962

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel
Cc: Gehel, LucasWerkmeister, Lucas_Werkmeister_WMDE, Timbl, Aklapper, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org



2021-05-31 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  See also T169666: Render partial results.
  
  I suspect this isn’t possible. Getting a timeout in the middle of a JSON 
response is fairly rare (normally you only get the timeout) – as far as I 
understand, it happens when the query service had enough time to collect all 
the results (or the query was simple enough that it could start sending results 
immediately), but then the timeout happened while it was sending the results. I 
don’t see how it could anticipate this situation at the start of sending the 
results (when the “head” is written).




2021-05-29 Thread Maintenance_bot
Maintenance_bot added a project: Wikidata.




2021-05-29 Thread Timbl
Timbl created this task.
Timbl added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  **List of steps to reproduce**
  
  - To reproduce: request a query which will cause the server to hit its 
processing time limit.
  - Ask for the JSON return payload.
  - For example,
  
SELECT ?subject ?name
 WHERE {
   ?klass wdt:P279*  .
   ?subject wdt:P31 ?klass .
   ?subject rdfs:label ?name.
   FILTER regex(?name, "mass", "i")
 } LIMIT 3000 
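  The request can be sketched programmatically. The snippet below only builds 
the URL and headers a client would send to the public endpoint 
`https://query.wikidata.org/sparql`; note that `wd:Q11423` ("mass") is used 
here purely as a hypothetical stand-in for the class entity that is elided in 
the query above.

```python
from urllib.parse import urlencode

# wd:Q11423 is an illustrative stand-in for the elided class entity.
query = """SELECT ?subject ?name
WHERE {
  ?klass wdt:P279* wd:Q11423 .
  ?subject wdt:P31 ?klass .
  ?subject rdfs:label ?name .
  FILTER regex(?name, "mass", "i")
} LIMIT 3000"""

# URL-encode the query for a GET request against the SPARQL endpoint.
url = "https://query.wikidata.org/sparql?" + urlencode({"query": query})

# Ask explicitly for the JSON result payload.
headers = {"Accept": "application/sparql-results+json"}
```

  Fetching `url` with these headers (e.g. via `urllib.request`) should 
reproduce the truncated response whenever the server hits its time limit.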
  
  **What happens?**:
  
  - Check the returned HTTP response. The JSON starts out well, like:
  
{
 "head" : {
   "vars" : [ "subject", "name" ]
 },
 "results" : {
   "bindings" : [ {
 "subject" : {
   "type" : "uri",
   "value" : "http://www.wikidata.org/entity/Q118786;
 },
 "name" : {
   "xml:lang" : "it",
   "type" : "literal",
   "value" : "abbazia di San Massimino"
 }
   }, {
 "subject" : {
   "type" : "uri",
   "value" : "http://www.wikidata.org/entity/Q3603258;
 },
  
  but after a lot of valid results, it is interrupted like this:
  
 }
   }, {
 "subject" : {
   "type" : "uri",
   "value" : "http://www.wikidata.org/entity/Q268770;
 },
 "name" : {
   "xml:lang" : "es",
   "type" : "literal",
SPARQL-QUERY: queryStr=SELECT ?subject ?name
 WHERE {
   ?klass wdt:P279*  .
   ?subject wdt:P31 ?klass .
   ?subject rdfs:label ?name.
   FILTER regex(?name, "mass", "i")
 } LIMIT 3000 
java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:205)
at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:292)
at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:678)
  
  followed by the rest of the Java stack dump.
  
  **What should have happened instead?**:
  
  The JSON payload should have been preserved as intact, parseable JSON under 
all circumstances.
  
  The result JSON could, for example, contain the bindings in the normal format 
plus a status part, perhaps in the `head`, or as a separate member of the 
`results` section beside the bindings, so that it does not break or detract 
from the `bindings` part.
  Clients could then check that status part for flags like "Did the query hit 
the LIMIT?" and "Did the query hit a timeout?"
  
  Reporting a Java stack dump is probably a good idea when an unknown exception 
is caught, but a timeout is a well-known condition, and a stack dump is 
probably not a good idea there.
  
  I don't know of standards or common practice for this; it would be good to check.
  
  **Motivation**
  
  When you are doing, for example, an autocomplete query, and a lot of results 
are returned, it is valuable to have even a subset. The data is valuable even 
though it may not be complete.
  
  When the timeout has happened, there will in fact be a particularly large 
amount of valuable data: data the user has waited for, that the query engine 
has burned energy producing, and that has already been transferred over the 
net. But the typical client-side JSON parser will throw an exception on the 
syntax error and return no data at all.
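  A client can still salvage the complete bindings from such a truncated body. 
The sketch below is a hypothetical helper (not part of any existing client 
library): it scans the `bindings` array with a brace counter that is aware of 
JSON strings, and keeps everything up to the last fully-closed binding object.

```python
import json

def salvage_bindings(text):
    """Best-effort recovery of complete bindings from a truncated
    SPARQL JSON results body (hypothetical helper, not an existing API)."""
    # Locate the opening bracket of the "bindings" array.
    i = text.find('"bindings"')
    if i == -1:
        return []
    i = text.find('[', i)
    if i == -1:
        return []
    depth = 0             # brace depth inside the array
    in_str = False        # currently inside a JSON string?
    esc = False           # previous character was a backslash?
    last_complete = None  # end offset of the last fully-closed binding
    for j in range(i + 1, len(text)):
        c = text[j]
        if in_str:
            if esc:
                esc = False
            elif c == '\\':
                esc = True
            elif c == '"':
                in_str = False
        elif c == '"':
            in_str = True
        elif c == '{':
            depth += 1
        elif c == '}':
            depth -= 1
            if depth == 0:
                last_complete = j
    if last_complete is None:
        return []
    # Everything up to the last complete object parses as a JSON array;
    # the trailing, partially-written binding is simply dropped.
    return json.loads('[' + text[i + 1:last_complete + 1] + ']')
```

  With a tolerant step like this, the data already transferred is not lost, 
although a proper fix would of course be for the server to emit valid JSON in 
the first place.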
  
  The user will be fine having a bunch of matching things thrown up, and they 
can then narrow the search from there.
  
  Note it **is important** for the response to distinguish between one which 
returned all matching values and one which hit a limit, be it the LIMIT given 
by the client or a timeout imposed by the server. Once the client knows it has 
all the matches, it can subset them locally as the user's search becomes more 
and more specific, without going back to the server.
  
  **Software version (if not a Wikimedia wiki), browser information, 
screenshots, other information, etc**:
  
  The currently running Wikidata Query Service.
  The example above was run on 2021-05-29.
