Thanks Andy

In the Tomcat logs I see

Query Cancelled - results truncated (but 200 already sent)

but the JavaScript web app becomes unresponsive and locks up the browser
(Chrome and Safari tested).

The context to this is that I have modified shiro.ini to prevent non-admin
remote users from modifying or uploading data - they can just use the UI to
create and run SPARQL queries. So I wanted a way of preventing the user from
doing things they shouldn't, such as running queries that might attempt to
return all the data.

I was thinking the easiest way to do this would be to sniff the query and
append a LIMIT onto the end of it if there isn't one already present.

Cheers

Phil

On Mon, Oct 30, 2017 at 2:47 PM, Andy Seaborne <[email protected]> wrote:

> When are you getting server overload, what's happening to the server? When
> I've tried it, yes, the request is doing a lot of work and the network is
> working but the server itself was just a bit sluggish.
>
> There isn't a way ATM and it is messy in HTTP (like timeouts).
> But it would be a good thing to have.
>
> HTTP requires the status code be sent first so if it is "200 OK" the
> contract is that the response will complete properly.
>
> If you or someone wants to put in a contribution, that would be great.
>
> Recorded as JENA-1412.
>
> What is needed is handling in the same way as query timeouts. (The old
> JENA-228 tried by inserting a LIMIT but then you don't know if the query
> overran or not. What is needed is a QueryIterator to wrap the execution
> and throw QueryCancelledException, then check the handling of results.
> Does really need to be done the same as query timeout).
>
> Andy
>
>
> On 30/10/17 13:43, Phil Gooch wrote:
>
>> Hi Andy
>>
>> Thanks, the timeout works fine, it’s just the number of rows returned that
>> I’d like to impose a hard limit on via a configuration file, if possible.
>>
>> Cheers
>>
>> Phil
>>
>>
>> On Mon, 30 Oct 2017 at 13:11, Andy Seaborne <[email protected]> wrote:
>>
>> Phil,
>>>
>>> Another thing to try:
>>>
>>> Adding "?timeout=1" to a query HTTP URL sets a 1 second timeout on the
>>> query.
>>>
>>> Be careful - it is different from context settings - it is in seconds,
>>> not milliseconds, and does not provide "X,Y"
>>>
>>> Andy
>>>
>>> On 30/10/17 12:33, Andy Seaborne wrote:
>>>
>>>> Hi Phil,
>>>>
>>>> That all looks OK to me.
>>>>
>>>> I tried your configuration with timeout of "1000,1000" and a query of:
>>>>
>>>> PREFIX afn: <http://jena.apache.org/ARQ/function#>
>>>>
>>>> ASK{
>>>>   FILTER(afn:wait(1000))
>>>>   FILTER(afn:wait(1000))
>>>>   FILTER(afn:wait(1000))
>>>>   FILTER(afn:wait(1000))
>>>>   FILTER(afn:wait(1000))
>>>> }
>>>>
>>>> and I got back a query timeout (using the latest code - I don't see any
>>>> changes in the codebase).
>>>>
>>>> I tried the standalone server and as a war file.
>>>>
>>>> Could you try the same please?
>>>>
>>>> Andy
>>>>
>>>> On 27/10/17 15:43, Phil Gooch wrote:
>>>>
>>>>> @Dave - thanks for the info about the two value timeout, I'll try that.
>>>>>
>>>>> @Andy - according to the META-INF in the fuseki.war file I'm running
>>>>> 2.6.0
>>>>>
>>>>> #Generated by Maven
>>>>> #Tue May 02 13:43:43 EDT 2017
>>>>> version=2.6.0
>>>>> groupId=org.apache.jena
>>>>> artifactId=jena-fuseki-war
>>>>>
>>>>> The config file for demo.ttl in the configuration directory looks like
>>>>> this
>>>>>
>>>>> @prefix : <http://base/#> .
>>>>> @prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
>>>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>>> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
>>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>>>> @prefix fuseki: <http://jena.apache.org/fuseki#> .
>>>>>
>>>>> :service_tdb_all a fuseki:Service ;
>>>>>     rdfs:label "TDB demo" ;
>>>>>     fuseki:dataset :tdb_dataset_readwrite ;
>>>>>     fuseki:name "demo" ;
>>>>>     fuseki:serviceQuery "query" , "sparql" ;
>>>>>     fuseki:serviceReadGraphStore "get", "post" ;
>>>>>     fuseki:serviceReadWriteGraphStore "data" ;
>>>>>     fuseki:serviceUpdate "update" ;
>>>>>     fuseki:serviceUpload "upload" .
>>>>>
>>>>> :tdb_dataset_readwrite
>>>>>     a tdb:DatasetTDB ;
>>>>>     ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "30000,60000" ] ;
>>>>>     tdb:location "/etc/fuseki/databases/demo" .
>>>>>
>>>>>
>>>>> Cheers
>>>>>
>>>>> Phil
>>>>>
>>>>>
>>>>> On Fri, Oct 27, 2017 at 3:27 PM, Andy Seaborne <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Phil -
>>>>>>
>>>>>> Which version are you running?
>>>>>>
>>>>>> Can you show the configuration file?
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>>
>>>>>> On 27/10/17 08:30, Dave Reynolds wrote:
>>>>>>
>>>>>> On 26/10/17 12:27, Phil Gooch wrote:
>>>>>>>
>>>>>>> Hi there
>>>>>>>>
>>>>>>>> I am running Fuseki2 within Tomcat and I'm looking for a way to
>>>>>>>> configure Fuseki to limit the number of rows returned by a query.
>>>>>>>> For example, to prevent a rogue query such as
>>>>>>>>
>>>>>>>> SELECT * WHERE {?s ?v ?o}
>>>>>>>>
>>>>>>>> from being executed to completion.
>>>>>>>>
>>>>>>>> I've imposed a maximum timeout via
>>>>>>>>
>>>>>>>> ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "60000" ] ;
>>>>>>>>
>>>>>>>> in config.ttl and also in the individual <dataset>.ttl files, but
>>>>>>>> this does not seem to prevent the above query from locking up the
>>>>>>>> server.
>>>>>>>>
>>>>>>>>
>>>>>>> Timeouts do generally work. There used to be problems with sort
>>>>>>> queries but those have been resolved and that's not a sort query.
>>>>>>>
>>>>>>> Might be worth trying the two value version (time to first result and
>>>>>>> time for whole query):
>>>>>>>
>>>>>>> ja:context [ja:cxtName "arq:queryTimeout"; ja:cxtValue "30000,60000" ];
>>>>>>>
>>>>>>>
>>>>>>> I've looked through the documentation at
>>>>>>>>
>>>>>>>> https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
>>>>>>>> https://jena.apache.org/documentation/serving_data/#fuseki-configuration-file
>>>>>>>> https://github.com/apache/jena/tree/master/jena-fuseki2/examples
>>>>>>>>
>>>>>>>> but I've not found the right config option.
>>>>>>>>
>>>>>>>> Is this possible, or will I need to modify the source code to add a
>>>>>>>> LIMIT n if this is not specified in the original query?
>>>>>>>>
>>>>>>>>
>>>>>>> There's no built-in machinery to limit the number of rows so far as I
>>>>>>> know. So if timeouts really don't work for you then indeed you would
>>>>>>> need to inject a LIMIT clause into the queries yourself.
>>>>>>>
>>>>>>> Timeouts are generally better because some queries are really really
>>>>>>> hard but return few results whereas queries like the above stream
>>>>>>> perfectly well and should impose low load, they just go on for a long
>>>>>>> time.
>>>>>>>
>>>>>>> In our case the endpoints we expose are typically APIs where we can
>>>>>>> inject API-specific hard/soft row limits as part of the query
>>>>>>> generation phase. For full sparql endpoints then we rely on timeouts.
>>>>>>>
>>>>>>> Dave
>>>>>>>
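
For illustration, the "sniff the query and append a LIMIT" idea from the top of this thread might look roughly like the sketch below. It is not Fuseki or Jena code: `cap_limit` and `DEFAULT_MAX_ROWS` are hypothetical names, and the check is a naive text match on the end of the query string, so a trailing comment or VALUES block would defeat it. A real implementation would more likely parse the query (for example with Jena's ARQ parser) before rewriting it. As noted above for JENA-228, an injected LIMIT also cannot tell the client whether the result was cut short.

```python
import re

# Hypothetical default cap; this is not a Fuseki configuration setting.
DEFAULT_MAX_ROWS = 1000

# Matches "LIMIT n" at the very end of the query text, optionally
# followed by "OFFSET m". Case-insensitive, as SPARQL keywords are.
_TRAILING_LIMIT = re.compile(
    r"\bLIMIT\s+(\d+)(\s+OFFSET\s+\d+)?\s*$", re.IGNORECASE
)


def cap_limit(query: str, max_rows: int = DEFAULT_MAX_ROWS) -> str:
    """Return the query text with a trailing LIMIT of at most max_rows."""
    text = query.rstrip()
    match = _TRAILING_LIMIT.search(text)
    if match is None:
        # No trailing LIMIT found: append one on its own line.
        return f"{text}\nLIMIT {max_rows}"
    if int(match.group(1)) <= max_rows:
        # The user's own limit is already within the cap.
        return text
    # Replace an oversized limit, keeping any OFFSET that followed it.
    offset = match.group(2) or ""
    return f"{text[:match.start()]}LIMIT {max_rows}{offset}"
```

With this sketch, `cap_limit("SELECT * WHERE {?s ?v ?o}", 100)` returns the original query followed by a new line containing `LIMIT 100`, while a query that already ends in `LIMIT 10` is returned unchanged.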

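
The per-request "?timeout=1" parameter mentioned earlier in the thread can be added when building the query URL. A minimal sketch, assuming a query endpoint at `http://localhost:8080/fuseki/demo/query` (the host, port and path are guesses based on the "demo" service name in the config above, not something confirmed in the thread); `query_url` is a hypothetical helper name. The value is in seconds, unlike the millisecond context setting.

```python
from urllib.parse import urlencode


def query_url(endpoint: str, sparql: str, timeout_seconds: int) -> str:
    """Build a GET URL for a SPARQL query with a per-request timeout."""
    # "timeout" is in seconds here, not milliseconds as in arq:queryTimeout.
    params = urlencode({"query": sparql, "timeout": timeout_seconds})
    return f"{endpoint}?{params}"


# Assumed endpoint; adjust to wherever the dataset is actually served.
url = query_url("http://localhost:8080/fuseki/demo/query", "ASK {}", 1)
print(url)
```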