Re: [Fuseki] Configuring Fuseki2 to impose a maximum limit on the number of rows returned.

Phil Gooch Fri, 27 Oct 2017 07:43:47 -0700

@Dave - thanks for the info about the two-value timeout, I'll try that.

@Andy - according to the META-INF in the fuseki.war file, I'm running 2.6.0:


#Generated by Maven
#Tue May 02 13:43:43 EDT 2017
version=2.6.0
groupId=org.apache.jena
artifactId=jena-fuseki-war

The demo.ttl config file in the configuration directory looks like this:

@prefix :      <http://base/#> .
@prefix tdb:   <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:    <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .

:service_tdb_all  a                   fuseki:Service ;
        rdfs:label                    "TDB demo" ;
        fuseki:dataset                :tdb_dataset_readwrite ;
        fuseki:name                   "demo" ;
        fuseki:serviceQuery           "query" , "sparql" ;
        fuseki:serviceReadGraphStore  "get", "post" ;
        fuseki:serviceReadWriteGraphStore
                "data" ;
        fuseki:serviceUpdate          "update" ;
        fuseki:serviceUpload          "upload" .

:tdb_dataset_readwrite
        a             tdb:DatasetTDB ;
        ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "30000,60000" ] ;
        tdb:location  "/etc/fuseki/databases/demo" .


Cheers

Phil



On Fri, Oct 27, 2017 at 3:27 PM, Andy Seaborne <[email protected]> wrote:

> Phil -
>
> Which version are you running?
>
> Can you show the configuration file?
>
>      Andy
>
>
> On 27/10/17 08:30, Dave Reynolds wrote:
>
>> On 26/10/17 12:27, Phil Gooch wrote:
>>
>>> Hi there
>>>
>>> I am running Fuseki2 within Tomcat and I'm looking for a way to configure
>>> Fuseki to limit the number of rows returned by a query. For example, to
>>> prevent a rogue query such as
>>>
>>> SELECT * WHERE {?s ?v ?o}
>>>
>>> from being executed to completion.
>>>
>>> I've imposed a maximum timeout via
>>>
>>> ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "60000" ] ;
>>>
>>> in config.ttl and also in the individual <dataset>.ttl files, but this does
>>> not seem to prevent the above query from locking up the server.
>>>
>>
>> Timeouts do generally work. There used to be problems with sort queries
>> but those have been resolved and that's not a sort query.
>>
>> Might be worth trying the two-value version (time to first result and
>> time for whole query):
>>
>> ja:context [ja:cxtName "arq:queryTimeout";  ja:cxtValue "30000,60000" ];
>>
>>
>>> I've looked through the documentation at
>>>
>>> https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
>>> https://jena.apache.org/documentation/serving_data/#fuseki-configuration-file
>>> https://github.com/apache/jena/tree/master/jena-fuseki2/examples
>>>
>>> but I've not found the right config option.
>>>
>>> Is this possible, or will I need to modify the source code to add a LIMIT n
>>> if this is not specified in the original query?
>>>
>>
>> There's no built-in machinery to limit the number of rows so far as I
>> know. So if timeouts really don't work for you then indeed you would need
>> to inject a LIMIT clause into the queries yourself.
>>
>> Timeouts are generally better because some queries are very expensive to
>> evaluate but return few results, whereas queries like the above stream
>> perfectly well and should impose low load; they just run for a long time.
>>
>> In our case the endpoints we expose are typically APIs where we can
>> inject API-specific hard/soft row limits as part of the query generation
>> phase. For full SPARQL endpoints we rely on timeouts.
>>
>> Dave
>>
>>
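
For reference, the LIMIT injection Dave describes could be sketched roughly as
below, for example in a small proxy or API layer placed in front of Fuseki.
This is only an illustration under stated assumptions: the names cap_limit and
max_rows are hypothetical and are not part of Fuseki or Jena, and the approach
is purely textual. It inspects only the text after the final '}' of the query,
so it does not cope with queries that end in a VALUES block or with a '}'
inside a trailing comment. A robust version would parse the query properly
rather than pattern-match it.

```python
import re

# Matches a LIMIT clause such as "LIMIT 100" (case-insensitive).
_LIMIT_RE = re.compile(r"\bLIMIT\s+(\d+)", re.IGNORECASE)


def cap_limit(query: str, max_rows: int) -> str:
    """Return `query` with its outermost LIMIT capped at `max_rows`.

    Hypothetical helper, naive and text-based: only the text after the
    final '}' is inspected, so LIMITs inside subqueries are left alone.
    Queries ending in a VALUES block are NOT handled.
    """
    split_at = query.rfind("}") + 1
    body, tail = query[:split_at], query[split_at:]

    match = _LIMIT_RE.search(tail)
    if match is None:
        # No outer LIMIT given: append one after any ORDER BY / OFFSET.
        return body + tail.rstrip() + f"\nLIMIT {max_rows}"

    # An outer LIMIT exists: keep it if small enough, otherwise clamp it.
    capped = min(int(match.group(1)), max_rows)
    return body + tail[:match.start()] + f"LIMIT {capped}" + tail[match.end():]
```

Calling cap_limit("SELECT * WHERE {?s ?v ?o}", 1000) appends "LIMIT 1000" on a
new line, while a query that already asks for fewer rows keeps its own limit.
Since this only bounds the number of rows, it complements the arq:queryTimeout
setting rather than replacing it.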
