I have just had a problem with query timeout overrides when upgrading to
Fuseki 3.12.0. I think the issue is related to JENA-1620 [1] which was
deployed with Jena 3.10.0.
Essentially, JENA-1620 modified the query timeout override functionality
to constrain timeout overrides so that they must be less than the
timeout specified in the Fuseki configuration. We have a production
system that relies on being able to override timeouts with a value greater
than that specified in the Fuseki configuration file.
My question is - what is the best way for us to implement our use case
using Fuseki? We don't have to do it the way we used to do it, but some
guidance on how to approach the problem would be welcome.
We have a single largish (500M triples) dataset. We expose a SPARQL
query endpoint to this dataset to the public on the internet, and
naturally, we specify a timeout.
We also have internal applications that query the same dataset. Their
queries take longer than the public timeout.
Prior to Fuseki 3.10.0 we could do this:
* two services were configured in the Fuseki config.ttl file
    o a public service
    o a private service
* both services shared the same dataset
    o the dataset was configured with a timeout suitable for queries
      from the public internet
* the private service was configured to allow query timeout override
    o which we used to give our internal services more time than
      specified in the configuration file
    o this does not work after JENA-1620
* our proxy configuration ensured that queries from the internet can
  only reach the public service
I have included a simplified version of our config.ttl file below [2].
I've been thinking about ways of achieving the desired effect whilst
respecting the change introduced by JENA-1620. An obvious approach
would be to duplicate the dataset and set different timeouts on the
different datasets. This would mean that the two datasets were
competing for memory and I would rather not do that as it is likely to
have a negative impact on performance.
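To make that concrete, this is roughly what I have in mind. It is an untested sketch: <#ds_public>, <#ds_private>, the second location and the longer timeout values are made up for illustration, and I am assuming a ja:context on a dataset description is honoured in the same way as the server-wide one in [2].

```turtle
# Sketch only: dataset names, the second path and the longer timeout are hypothetical.
<#ds_public> rdf:type tdb:DatasetTDB ;
    tdb:location "/var/lib/fuseki/databases/DS" ;
    tdb:unionDefaultGraph true ;
    # Timeout for queries from the public internet (same value as in [2]).
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "90000,120000" ] ;
    .
<#ds_private> rdf:type tdb:DatasetTDB ;
    # Hypothetical path: a second on-disk copy of the same data.
    tdb:location "/var/lib/fuseki/databases/DS-private" ;
    tdb:unionDefaultGraph true ;
    # Hypothetical longer limit for our internal applications.
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "600000,900000" ] ;
    .
```

The public service would then point at <#ds_public> and the private service at <#ds_private>, with no timeout override needed on either.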
I've been thinking about other approaches also, but I'll spare you those
as there might be a real simple solution I'm unaware of.
Is there a way to configure Fuseki so that different timeouts can be set
for different classes of requestor?
Brian
[1] https://issues.apache.org/jira/browse/JENA-1620
[2]
[] rdf:type fuseki:Server ;
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "90000,120000" ] ;
    fuseki:services (
        <#service_ds>
        <#service_ds_timeout_override>
    ) .
# TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
<#service_ds> rdf:type fuseki:Service ;
    rdfs:label "TDB Service" ;
    fuseki:name "public" ;
    fuseki:serviceQuery "query" ;
    fuseki:dataset <#ds> ;
    .
<#service_ds_timeout_override>
    rdfs:label "TDB Service Query with timeout override" ;
    fuseki:name "private" ;
    fuseki:allowTimeoutOverride true ;
    fuseki:serviceQuery "query" ;
    fuseki:dataset <#ds> ;
    .
<#ds> rdf:type tdb:DatasetTDB ;
    tdb:location "/var/lib/fuseki/databases/DS" ;
    tdb:unionDefaultGraph true ;
    .