To add to what has already been said: query execution in Apache Jena ARQ is based on lazy evaluation wherever possible. Calling execSelect() simply prepares a ResultSet that is capable of delivering the results; it doesn't actually evaluate the query or produce any results until you call hasNext()/next(). When you call either of these methods, ARQ does the minimum amount of work needed to return the next result (or batch of results), depending on the underlying algebra of the query.
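The lazy behaviour described above can be illustrated with a small toy model. This is only a sketch in plain Java, not Jena code: LazyRowsDemo, LazyRows and workDone() are hypothetical names made up for the illustration, and the "work" is a trivial calculation standing in for whatever the query engine has to do to find the next solution.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Toy model only: this is NOT Jena code. It mimics the pattern described above,
// where creating the result object is cheap and hasNext()/next() do the real work.
class LazyRowsDemo {

    // Hypothetical stand-in for a ResultSet: each row is computed on demand.
    static final class LazyRows implements Iterator<Integer> {
        private final int total;          // number of rows that exist in total
        private int produced = 0;         // rows computed so far
        private int workDone = 0;         // counts how often a row was computed
        private Integer buffered = null;  // row computed by hasNext(), not yet returned

        LazyRows(int total) {
            this.total = total;           // nothing is evaluated here
        }

        int workDone() {
            return workDone;
        }

        @Override
        public boolean hasNext() {
            // Compute the next row only if none is buffered and rows remain.
            if (buffered == null && produced < total) {
                workDone++;
                buffered = produced * produced; // placeholder for expensive matching
                produced++;
            }
            return buffered != null;
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            Integer row = buffered;
            buffered = null;
            return row;
        }
    }

    public static void main(String[] args) {
        LazyRows rows = new LazyRows(3);
        System.out.println("work after construction: " + rows.workDone());
        while (rows.hasNext()) {
            int row = rows.next();
            System.out.println("row=" + row + " work so far=" + rows.workDone());
        }
    }
}
```

In this model the constructor plays the role of execSelect() and returns immediately, while every row is paid for inside hasNext(). By analogy, timing the iteration loop rather than the execSelect() call is what measures the real cost of the query.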
Rob

On 23/02/2020, 18:58, "Steve Vestal" <[email protected]> wrote:

    I'm looking for suggestions on a SPARQL performance issue. My test model has ~800 sentences, and processing of one select query takes about 25 minutes. The query is a basic graph pattern with 9 variables and 20 triples, plus a filter that forces distinct variables to have distinct solutions using pair-wise not-equals constraints. No option clause or anything else fancy.

    I am issuing the query against an inference model. Most of the asserted sentences are in imported models. If I iterate over all the statements in the OntModel, I get ~1500 almost instantly. I experimented with several of the reasoners.

    Below is the basic control flow. The thing I found curious is that the execSelect() method finishes almost instantly. It is the iteration over the ResultSet that is taking all the time, it seems in the call to selectResult.hasNext(). The result has 192 rows, 9 columns. The results are provided in bursts of 8 rows each, with ~1 minute between bursts.

        OntModel ontologyModel = getMyOntModel(); // Tried various reasoners
        String selectQuery = getMySelectQuery();
        QueryExecution selectExec = QueryExecutionFactory.create(selectQuery, ontologyModel);
        ResultSet selectResult = selectExec.execSelect();
        while (selectResult.hasNext()) { // Time seems to be spent in hasNext
            QuerySolution selectSolution = selectResult.next();
            for (String var : getMyVariablesOfInterest()) {
                RDFNode varValue = selectSolution.get(var);
                // process varValue
            }
        }

    Any suggestions would be appreciated.
