Hi, I compared two solutions for creating the algebra (one graph pattern, containing one BGP with 7 triple patterns and a filter with 9 conditions):
a) The algebra is created from scratch (by instantiating around 50 Java objects) each time a quadruple is received.

b) A template with placeholders is used, as you suggested. The template is created once. Placeholders are represented by Node_Var instances with dedicated names (there are 4 variables). Each time a quadruple is received, the template is rewritten using NodeTransformLib#transform with a custom NodeTransform that checks the name of each Node_Var and replaces it with the desired value if that name is one of the placeholders.

I was thinking that the solution you (Andy) suggested, b), would be better than a) because, if I have understood how Transform works, several object instantiations can be avoided. Unfortunately, after running both solutions 10^6 times (with JDK 7), solution b) is 11% slower than a).

Laurent

On Tue, Jan 31, 2012 at 3:54 PM, Laurent Pellegrino <[email protected]> wrote:
> Thanks for the information and advice.
>
>> Which storage layer are you using?
>
> I am using TDB with DatasetGraph and transactions.
>
> Laurent
>
> On Tue, Jan 31, 2012 at 12:15 PM, Andy Seaborne <[email protected]> wrote:
>> On 30/01/12 16:02, Laurent Pellegrino wrote:
>>>
>>> Hi all,
>>>
>>> Some context to understand why I am asking for more information: I have
>>> an application where, each time it is called, a new SPARQL query (as a
>>> String) is created from a template and a quadruple (a Java object).
>>> This means that the interesting values from the quadruple have to be
>>> formatted with a node formatter and substituted into the SPARQL
>>> template. Then the SPARQL query has to be parsed and the result
>>> evaluated against a dataset.
>>>
>>> I was wondering whether I can skip the parsing step to save some time
>>> during the execution of the application. It seems to be possible by
>>> working at the algebra level.
>>>
>>> If Jena performs optimizations on queries, are they done before the
>>> evaluation or after parsing?
>>> I mean, when I give Algebra.exec(...) a query, is it always optimized
>>> via a call to Algebra#optimize?
>>
>>
>> Optimizations are done at the start of execution.
>>
>> They happen at the point when QueryEngineBase calls modifyOp.
>>
>> In QueryEngineBase, modifyOp just returns the op unchanged.
>>
>> In QueryEngineMain, used by ARQ for general evaluation (mainly
>> in-memory), modifyOp is a call to Algebra.optimize.
>>
>> QueryEngineTDB extends QueryEngineMain. It calls super.modifyOp and does
>> some additional work.
>>
>> QueryEngineSDB inherits from QueryEngineBase, so it does nothing there.
>> Its processing is done earlier (for historical reasons) in
>> QueryEngineSDB.init, which calls a couple of optimizations directly. It
>> does not want the join optimizations.
>>
>> You can replace the optimizer, even on a per-execution basis: see
>> ARQConstants.sysOptimizerFactory and Optimize.decideOptimizer. Or turn
>> off individual optimizations by setting symbols in the context. See
>> Optimize.rewrite.
>>
>>
>>> Is there any builder to ease the construction of the algebra?
>>
>>
>> One way might be to construct the algebra using placeholders (well-known
>> nodes), then use a Transform to change it.
>>
>>
>>> I have also seen on a wiki page [1] that it is possible to work at the
>>> syntax level. Do you think it is better to work at the syntax level or
>>> at the algebra level to do what I want?
>>
>>
>> Algebra.
>>
>>
>>>
>>> [1]
>>> http://incubator.apache.org/jena/documentation/query/manipulating_sparql_using_arq.html
>>>
>>> Kind Regards,
>>>
>>> Laurent
>>
>>
>> Which storage layer are you using?
>>
>> Andy
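P.S. A guess as to why b) is not faster: if NodeTransformLib#transform rebuilds the op tree on every rewrite (as transforms over immutable structures generally must), then it allocates roughly the same number of objects as building the algebra from scratch, plus the traversal overhead. The stdlib-only sketch below illustrates that pattern on a toy immutable tree; all names here (Node, transform, the "?ph_*" placeholder labels) are hypothetical stand-ins for illustration, not the Jena API.

```java
import java.util.*;
import java.util.function.UnaryOperator;

public class TemplateRewrite {
    // A minimal immutable "node": a leaf (variable/constant) or an inner node.
    static final class Node {
        final String label;          // e.g. "?ph_s" for a placeholder variable
        final List<Node> children;   // empty for leaves
        Node(String label, List<Node> children) {
            this.label = label;
            this.children = List.copyOf(children);
        }
        static Node leaf(String label) { return new Node(label, List.of()); }
    }

    // Rewrites the tree bottom-up, replacing any leaf whose label the
    // transform maps to a new value -- analogous to a custom NodeTransform
    // that checks Node_Var names against the set of placeholder names.
    static Node transform(Node n, UnaryOperator<String> leafTransform) {
        if (n.children.isEmpty()) {
            String newLabel = leafTransform.apply(n.label);
            // Short-circuit on identity so unchanged leaves are shared.
            return newLabel.equals(n.label) ? n : Node.leaf(newLabel);
        }
        List<Node> newChildren = new ArrayList<>(n.children.size());
        for (Node c : n.children) newChildren.add(transform(c, leafTransform));
        // Inner nodes are always copied: every rewrite allocates a new spine.
        return new Node(n.label, newChildren);
    }

    public static void main(String[] args) {
        // Template built once; placeholder names are illustrative.
        Node template = new Node("bgp",
                List.of(Node.leaf("?ph_s"), Node.leaf("?ph_p"), Node.leaf("?ph_o")));
        Map<String, String> bindings = Map.of(
                "?ph_s", "<http://example.org/s>",
                "?ph_p", "<http://example.org/p>");
        // Per-quadruple rewrite: substitutes bound placeholders, keeps the rest.
        Node rewritten = transform(template,
                label -> bindings.getOrDefault(label, label));
        System.out.println(rewritten.children.get(0).label); // <http://example.org/s>
        System.out.println(rewritten != template);           // true: a new tree was built
    }
}
```

If the per-rewrite allocation is the cost you want to avoid, an alternative worth measuring is keeping the variables as real query variables and supplying their values through an initial binding at execution time, so the op tree is never rewritten at all.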
