On 19/09/17 13:42, George News wrote:
On 2017-09-19 14:24, Dave Reynolds wrote:
On 19/09/17 11:33, George News wrote:
On 2017-09-19 09:57, Dave Reynolds wrote:
On 19/09/17 01:13, Dimov, Stefan wrote:
Hi,
I have a Tomcat setup that receives REST requests, “translates” them
into SPARQL queries, invokes them on the underlying FUSEKI and returns
the results:
USER AGENT
^
REST
v
---------------
TOMCAT
^
REST
v
-------------
FUSEKI
------------
JENA
-----------
TDB
----------
Would I be able to achieve a significant performance improvement if I
use the JENA libraries directly and bypass FUSEKI?
Unlikely. We successfully use the setup you describe for dozens of
services, some quite high load. We have a few which go direct to Jena
for legacy reasons and they show no particular performance benefits.
If your payloads can be large then make sure the way you are driving
Fuseki is streaming and doesn't accidentally store the entire SPARQL
results in your Tomcat app. This also means choosing a streamable media
type for your Fuseki requests.
I'm using Jena to create my own REST service and I'm facing some issues
when SPARQL resultsets are big. Could you please give me a hint on the
streaming stuff from fuseki so I can incorporate that to my REST service?
If you are just doing SELECTs then it should be straightforward. Of the
SPARQL results media types, at least XML and TSV are streaming. We
just use Jena's QueryExecutionFactory.sparqlService in the REST service
to set up the execution. We wrap the ResultSet from execSelect and
process that one row at a time. Our wrapper keeps track of the
underlying QueryExecution so we can close that when finished or in the
event of a problem.
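
The wrapper idea described above can be sketched roughly as follows. This is only an illustration using plain JDK types, not our actual code: ClosingRowIterator and isClosed are hypothetical names. With Jena, the ResultSet would presumably play the part of the iterator and the QueryExecution the part of the closeable resource.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical wrapper: hands out rows one at a time and closes the
 * resource that produced them when the rows run out, when iteration
 * fails, or when the caller closes it explicitly.
 */
class ClosingRowIterator<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> rows;       // e.g. the result set
    private final AutoCloseable source;   // e.g. the query execution
    private boolean closed = false;

    ClosingRowIterator(Iterator<T> rows, AutoCloseable source) {
        this.rows = rows;
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        boolean more;
        try {
            more = rows.hasNext();
        } catch (RuntimeException e) {
            close();                      // release the source on failure
            throw e;
        }
        if (!more) {
            close();                      // exhausted: release the source
        }
        return more;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        try {
            return rows.next();
        } catch (RuntimeException e) {
            close();
            throw e;
        }
    }

    /** Idempotent: the underlying source is closed at most once. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        try {
            source.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close source", e);
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

A caller that stops reading early still has to call close() itself, for example from a try-with-resources block, since the wrapper only closes automatically once the rows are exhausted or an error occurs.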
In my case I'm handling everything using Jena and not Fuseki so I'm not
using sparqlService but execSelect().
However, your comment about streaming led me towards this new approach
of handling the ResultSet without having to store everything in memory. I
think this achieves a similar thing.
QueryExecution qExec = QueryExecutionFactory.create(query, m);
ResultSet rs = qExec.execSelect();
StreamingOutput stream = new StreamingOutput() {
    @Override
    public void write(OutputStream os) throws IOException,
            WebApplicationException {
        ResultSetFormatter.outputAsJSON(os, rs);
        rs.close();
    }
};
return Response.ok(stream).build();
You might want to check if ResultSetFormatter.outputAsJSON is itself
streaming, it may not be. In our case we have custom JSON and CSV
serializers which take care to stream.
As I say, we explicitly track the qExec and ensure that is closed when
the formatting is completed, using try/finally to be sure. I'm not sure
whether ResultSet.close() will ensure that for you. [Not saying it
won't, just not sure.]
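
To illustrate the try/finally point, here is a minimal sketch of a serializer that writes one row at a time and always closes the source, whether or not formatting succeeds. It is not our real serializer: StreamingCsvWriter and its parameters are hypothetical, rows are modelled as plain lists of strings, and the AutoCloseable stands in for whatever owns the query (with Jena, presumably the QueryExecution).

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;

/** Hypothetical streaming CSV serializer; holds only one row in memory at a time. */
final class StreamingCsvWriter {

    static void write(List<String> header,
                      Iterator<List<String>> rows,
                      AutoCloseable source,
                      OutputStream os) throws IOException {
        try {
            BufferedWriter w = new BufferedWriter(
                    new OutputStreamWriter(os, StandardCharsets.UTF_8));
            writeRow(w, header);
            while (rows.hasNext()) {
                writeRow(w, rows.next());   // each row is written, then dropped
            }
            w.flush();                      // flush, but leave os for the caller to close
        } finally {
            // Runs on success and on failure, so the source is never leaked.
            try {
                source.close();
            } catch (Exception e) {
                throw new IOException("failed to close source", e);
            }
        }
    }

    private static void writeRow(BufferedWriter w, List<String> cells) throws IOException {
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) {
                w.write(',');
            }
            w.write(quote(cells.get(i)));
        }
        w.write("\r\n");
    }

    /** Quotes a cell only when it contains a comma, a quote or a line break. */
    private static String quote(String cell) {
        if (cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r")) {
            return "\"" + cell.replace("\"", "\"\"") + "\"";
        }
        return cell;
    }
}
```

Inside a JAX-RS StreamingOutput, a method like this would be called from write(OutputStream), so the close happens once the response body has been produced rather than when the Response object is built.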
Dave