This thread got me thinking: couldn't we have reusable, dereferenceable SPARQL optimizations/rewrites expressed as SPARQL CONSTRUCT queries over SPIN RDF? Each would be a mapping from the original syntax tree to the optimized syntax tree.
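
To make the kind of rewrite concrete: the cleanup Andy describes in the quoted thread (return a delete-me marker from a leaf, strip markers from the parent group, and return the marker again when a group ends up empty) is a pure tree-to-tree mapping. Below is a minimal sketch of that idea in plain Java. The types `Node`, `Pattern`, `Group`, `Union` and the `DELETE_ME` marker are hypothetical stand-ins, not Jena's `Element` classes, and collapsing a one-branch union to its branch is my own assumption rather than something proposed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PruneSketch {
    /** Hypothetical stand-in for a syntax element; not Jena's Element. */
    interface Node {}

    /** Marker for "remove me from my parent"; cannot occur in a legal query. */
    static final Node DELETE_ME = new Node() {
        @Override public String toString() { return "DELETE_ME"; }
    };

    /** A triple pattern, kept as a plain string for illustration. */
    static final class Pattern implements Node {
        final String text;
        Pattern(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    /** A syntax group: the things between { and }. */
    static final class Group implements Node {
        final List<Node> members;
        Group(Node... members) { this.members = Arrays.asList(members); }
        Group(List<Node> members) { this.members = members; }
        @Override public String toString() { return "{" + join(members, " . ") + "}"; }
    }

    /** A union of alternative branches. */
    static final class Union implements Node {
        final List<Node> branches;
        Union(Node... branches) { this.branches = Arrays.asList(branches); }
        Union(List<Node> branches) { this.branches = branches; }
        @Override public String toString() { return join(branches, " UNION "); }
    }

    /** Bottom-up rewrite: drop the target pattern, then prune emptied parents. */
    static Node rewrite(Node node, String target) {
        if (node instanceof Pattern) {
            return ((Pattern) node).text.equals(target) ? DELETE_ME : node;
        }
        if (node instanceof Group) {
            List<Node> kept = rewriteAll(((Group) node).members, target);
            // An emptied group asks its own parent to delete it in turn.
            return kept.isEmpty() ? DELETE_ME : new Group(kept);
        }
        if (node instanceof Union) {
            List<Node> kept = rewriteAll(((Union) node).branches, target);
            if (kept.isEmpty()) return DELETE_ME;
            // Assumption: a union with one surviving branch collapses to it.
            return kept.size() == 1 ? kept.get(0) : new Union(kept);
        }
        return node;
    }

    /** Rewrites every child and keeps only those not marked for deletion. */
    static List<Node> rewriteAll(List<Node> children, String target) {
        List<Node> kept = new ArrayList<>();
        for (Node child : children) {
            Node r = rewrite(child, target);
            if (r != DELETE_ME) kept.add(r);
        }
        return kept;
    }

    static String join(List<Node> nodes, String sep) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            if (sb.length() > 0) sb.append(sep);
            sb.append(n);
        }
        return sb.toString();
    }
}
```

The caller still has to handle the top of the tree, as Andy notes: if `rewrite` returns `DELETE_ME` for the root, nothing is left of the WHERE clause. Note also that this is a syntactic cleanup only; dropping an empty group from a union is not guaranteed to preserve the query's results.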
On Mon, Feb 1, 2016 at 8:55 PM, Carlo.Allocca <[email protected]> wrote:
> Dear Andy, Lorenz and All,
>
> Thank you very much for your help and support.
> I would like to share the first achievement based on your suggestions.
>
> I have implemented
>
> 1) public Element transform(ElementPathBlock eltPB)
> 2) public Element transform(ElementGroup arg0, List<Element> arg1)
>
> running it over a number of example queries I obtain the expected results.
> For example, for the input query
>
> String qString6 = "prefix rdfs:<http://www.w3.org/2000/01/rdf-schema#> "
>         + "prefix ex:<http://www.semanticweb.org/dataset1/> "
>         + "prefix rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> "
>         + " SELECT DISTINCT ?ind ?boss ?g where "
>         + "{ "
>         + " {?ind rdf:type ?z. } "
>         + "UNION "
>         + " {"
>         + " { ?boss ex:isBossOf1 ?ind .}"
>         + " UNION {?boss ex:isBossOf ?ind ."
>         + " Filter(?boss=\"mathieu\") }"
>         + "}"
>         + "}";
>
> when removing the triple (?boss ex:isBossOf ?ind .), I get
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   { { ?ind rdf:type ?z }
>     UNION
>     { { ?boss ex:isBossOf1 ?ind }
>       UNION
>       { # Empty BGP
>
>       }
>     }
>   }
>
> which is OK.
> I just need to find out how to remove an ElementGroup which contains only one
> element which is the EMPTY one.
> Of course, I need to do the same for the other cases, e.g. OPTIONAL, subquery,
> etc.
>
> Many thanks for your help once again.
> Best Regards,
> Carlo
>
>
> == Here the code
>
> @Override
> public Element transform(ElementPathBlock eltPB) {
>     if (eltPB.isEmpty()) {
>         System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock IS EMPTY:: " + eltPB.toString());
>         return eltPB;
>     }
>     System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + eltPB.toString());
>     Iterator<TriplePath> l = eltPB.patternElts();
>     while (l.hasNext()) {
>         TriplePath tp = l.next();
>         if (tp.asTriple().matches(this.triple)) {
>             l.remove();
>             System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + tp.toString()+" TRIPLE JUST REMOVED!!!");
>             //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] TRIPLE JUST REMOVED!!! ");
>             System.out.println("");
>             return this.transform(eltPB);//eltPB;
>         }
>     }
>     return eltPB;
> }
>
>
> @Override
> public Element transform(ElementGroup arg0, List<Element> arg1) {
>     //ElementGroup arg0New = new ElementGroup();
>
>     List<Element> elemList = arg0.getElements();
>     Iterator<Element> itr = elemList.iterator();
>     while (itr.hasNext()) {
>         Element elem = itr.next();
>         if (elem instanceof ElementGroup) {
>             // if(((ElementGroup) elem).isEmpty()){
>             System.out.println("[RemoveOpTransform::visit(ElementGroup arg0)] ElementGroup IS EMPTY!!! " +((ElementGroup) elem).toString());
>             System.out.println("[RemoveOpTransform::visit(ElementGroup arg0)] List<Element> arg1 IS EMPTY!!! " +arg1.toString());
>             // }
>         }
>
>         if (elem instanceof ElementFilter) {
>             //... check if this filter is the one that we should remove
>             //...get the variables of the triple pattern that we want to delete
>             Set<Var> tpVars = new HashSet();
>             Node subj = this.triple.getSubject();
>             if (subj.isVariable()) {
>                 tpVars.add((Var) subj);
>             }
>             Node pred = this.triple.getPredicate();
>             if (pred.isVariable()) {
>                 tpVars.add((Var) pred);
>             }
>             Node obj = this.triple.getObject();
>             if (obj.isVariable()) {
>                 tpVars.add((Var) obj);
>             }
>             //...get the variables of the FILTER expression
>             Set<Var> expVars = ((ElementFilter) elem).getExpr().getVarsMentioned();
>             //...check whether the FILTER expression contains any of the triple pattern variable
>             for (Var var : expVars) {
>                 //..if it does then we have to delete the entire FILTER expression
>                 if (tpVars.contains(var)) {
>                     System.out.println("[RemoveOpTransform::visit(ElementGroup arg0)] THE "+((ElementFilter) elem).toString() +"IS GOING TO BE REMOVED!!!");
>                     //Expr e = new NodeValueBoolean(true);
>                     //ElementFilter newFilter = new ElementFilter(e);
>                     itr.remove();
>                     return this.transform(arg0, arg1);
>                 }
>             }
>
>         }
>     }
>     // System.out.println("[RemoveOpTransform::transform(ElementGroup arg0)] arg1 " + arg1.toString());
>     // System.out.println("");
>     //return arg0New;
>     return arg0;
> }
>
>
>> On 31 Jan 2016, at 15:40, Carlo.Allocca <[email protected]> wrote:
>>
>> Dear Andy,
>>
>>
>>> On 31 Jan 2016, at 15:25, Andy Seaborne <[email protected]> wrote:
>>>
>>> Cleaning up is slightly easier in the syntax because BGPs and filters are
>>> inside a group (nothing to do with GROUP - a syntax group is things between
>>> {}.
>>>
>>> So when you want to delete something, have some custom Element class
>>> 'ElementDeleteMe' to return from the ElementFilter or ElementPathBlock.
>>> Something that can not appear in a legal query. it does not need to
>>> implement anything.
>>>
>>> Then in
>>> transform(ElementGroup el, List<Element> members) ;
>>>
>>> rewrite 'members' to remove any ElementDeleteMe.
>>>
>>> if members.size == 0
>>>     return ElementDeleteMe
>>>
>>> so it works recursively.
>>>
>>> Take care at the top of syntax tree.
>>>
>> Thank you for the details.
>> It seems that I should be using org.apache.jena.sparql.syntax rather than
>> org.apache.jena.sparql.algebra.
>> I will try to implement it as suggested above.
>>
>> Many Thanks,
>> Best Regards,
>> Carlo
>>
>>
>>> Andy
>>>
>>> On 31/01/16 15:11, Carlo.Allocca wrote:
>>>> Hello Lorenz,
>>>>
>>>>
>>>> Sure no problem.
>>>> I can share it here. In case you suggest any other place I will do so.
>>>>
>>>> Please, let me know.
>>>> Many Thanks,
>>>>
>>>> Best Regards,
>>>> Carlo
>>>>
>>>>> On 31 Jan 2016, at 15:07, Lorenz Bühmann
>>>>> <[email protected]> wrote:
>>>>>
>>>>> Hello Carlo,
>>>>>
>>>>> I'm sure you'll make it, but maybe you should also share the complete
>>>>> code somehow? Maybe this makes it easier for people to help you.
>>>>>
>>>>> Lorenz
>>>>>
>>>>>> Hello Lorenz,
>>>>>>
>>>>>> Thank you very much for your help.
>>>>>> Some comments follow in line.
>>>>>>
>>>>>>
>>>>>>> On 31 Jan 2016, at 13:36, Lorenz Bühmann
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello Carlo,
>>>>>>>
>>>>>>> there is usually no link from a child node to its parent in the JENA
>>>>>>> data structures.
>>>>>> It is very good to have this confirmed, so I don't spend more time looking for it.
>>>>>> I thought that I was missing some JENA data structure and packages.
>>>>>>
>>>>>>> That means you have to keep track of it when you're processing the
>>>>>>> parents. Why not run some "clean up" after each child has been
>>>>>>> processed? E.g. check if there is no BGP left and thus mark this
>>>>>>> current parent as also "toBeRemoved".
>>>>>> Sure. I have already started doing it by using some boolean variables.
>>>>>> Let's see if I can make it.
>>>>>> Otherwise I have to develop a module that transforms the Jena Query
>>>>>> Expression into a Tree based representation and then apply the triple
>>>>>> remove operation.
>>>>>>
>>>>>> Thank you again for your support and answers.
>>>>>>
>>>>>> Best Regards,
>>>>>> Carlo
>>>>>>
>>>>>>> Lorenz
>>>>>>>
>>>>>>>> and of course, the below is for processing a BGP:
>>>>>>>>
>>>>>>>> @Override
>>>>>>>> public Op transform(OpBGP opBGP) {
>>>>>>>>
>>>>>>>>     System.out.println("[TransformRemoveOp::transform(OpBGP opBGP)] opBGPCounter " + opBGPCounter++);
>>>>>>>>     System.out.println("[TransformRemoveOp::transform(OpBGP opBGP)] " + opBGP.toString());
>>>>>>>>     System.out.println("");
>>>>>>>>     Op newOpBGP = opBGP.copy();
>>>>>>>>     BasicPattern newBP = ((OpBGP) newOpBGP).getPattern();
>>>>>>>>     List<Triple> tripleList = newBP.getList();
>>>>>>>>
>>>>>>>>     Iterator<Triple> itr = tripleList.iterator();
>>>>>>>>     while (itr.hasNext()) {
>>>>>>>>         Triple tp = itr.next();
>>>>>>>>         if (tp.matches(this.triple)) {
>>>>>>>>             itr.remove();
>>>>>>>>             isParent = true;
>>>>>>>>             isStarted = true;
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>>     //...it can be empty
>>>>>>>>     if (((OpBGP) newOpBGP).getPattern().getList().isEmpty()) {
>>>>>>>>         System.out.println("[TransformRemoveOp::transform(OpBGP opBGP)] opBGP is empty " + opBGP.toString());
>>>>>>>>         //return subOp;
>>>>>>>>     }
>>>>>>>>     return newOpBGP;
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> Many Thanks,
>>>>>>>> Carlo.
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 31 Jan 2016, at 04:36, Carlo.Allocca <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Dear Andy,
>>>>>>>>>
>>>>>>>>> Thank you very much one more time.
>>>>>>>>>
>>>>>>>>> Please, some comments follow in line.
>>>>>>>>>
>>>>>>>>> On 30 Jan 2016, at 17:53, Andy Seaborne
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> On 30/01/16 15:52, Carlo.Allocca wrote:
>>>>>>>>> Hello Lorenz,
>>>>>>>>>
>>>>>>>>> Thank you for your help.
>>>>>>>>>
>>>>>>>>> It was clear to me that I have to remove from the parent expression,
>>>>>>>>> and using a pattern-based approach
>>>>>>>>> makes that a bit difficult.
>>>>>>>>> I tried three of them and I got the same results.
>>>>>>>>>
>>>>>>>>> You need to process the parent, not just the ElementFilter.
>>>>>>>>>
>>>>>>>>> How can I process the parent?
>>>>>>>>> My understanding is:
>>>>>>>>>
>>>>>>>>> 1. Building the Abstract Query Tree: Op op = Algebra.compile(q);
>>>>>>>>>
>>>>>>>>> For example, given the query
>>>>>>>>>
>>>>>>>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>>>>>>> PREFIX ex: <http://www.semanticweb.org/dataset1/>
>>>>>>>>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>>>>>>
>>>>>>>>> SELECT DISTINCT ?ind ?boss ?g
>>>>>>>>> WHERE
>>>>>>>>>   { { ?ind rdf:type ?z }
>>>>>>>>>     UNION
>>>>>>>>>     { ?boss ex:isBossOf ?ind
>>>>>>>>>       FILTER ( ?boss = "mathieu" )
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> I obtain
>>>>>>>>>
>>>>>>>>> (distinct
>>>>>>>>>   (project (?ind ?boss ?g)
>>>>>>>>>     (union
>>>>>>>>>       (bgp (triple ?ind <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?z))
>>>>>>>>>       (filter (= ?boss "mathieu")
>>>>>>>>>         (bgp (triple ?boss <http://www.semanticweb.org/dataset1/isBossOf> ?ind))))))
>>>>>>>>>
>>>>>>>>> 2. Traverse the tree to modify its internal structure. In terms of
>>>>>>>>> code, I can write:
>>>>>>>>>
>>>>>>>>> (A) Transform transform = new TransformRemoveOp(q,tp) ;
>>>>>>>>> (B) op = Transformer.transform(transform, op);
>>>>>>>>>
>>>>>>>>> where TransformRemoveOp implements the Transform interface.
>>>>>>>>> It seems that there is no link between children and parent that can
>>>>>>>>> be used to figure out at which level the process is and modify the
>>>>>>>>> Tree structure, accordingly. Please correct me if I am wrong.
>>>>>>>>> In fact running it over the above example, I got the following:
>>>>>>>>>
>>>>>>>>> =========== DURING
>>>>>>>>> [TransformRemoveOp::transform(OpBGP opBGP)] opBGPCounter 1
>>>>>>>>> [TransformRemoveOp::transform(OpBGP opBGP)] (bgp (triple ?ind <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?z))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpBGP opBGP)] opBGPCounter 2
>>>>>>>>> [TransformRemoveOp::transform(OpBGP opBGP)] (bgp (triple ?boss <http://www.semanticweb.org/dataset1/isBossOf> ?ind))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpBGP opBGP)] opBGP is empty (bgp
>>>>>>>>> )
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpFilter opFilter, Op subOp)] opFilter (filter (= ?boss "mathieu")
>>>>>>>>>   (bgp
>>>>>>>>>   ))
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpFilter opFilter, Op subOp)] subOp Name bgp
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpUnion opUnion, Op left, Op right)] left: (bgp (triple ?ind <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?z))
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpUnion opUnion, Op left, Op right)] right: (filter (exprlist (= ?boss "mathieu") true)
>>>>>>>>>   (bgp
>>>>>>>>>   ))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpProject opProject, Op subOp)]
>>>>>>>>> (project (?ind ?boss ?g)
>>>>>>>>>   (union
>>>>>>>>>     (bgp (triple ?ind <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?z))
>>>>>>>>>     (filter (exprlist (= ?boss "mathieu") true)
>>>>>>>>>       (bgp
>>>>>>>>>       ))))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [TransformRemoveOp::transform(OpDistinct opDistinct, Op subOp)]
>>>>>>>>> (distinct
>>>>>>>>>   (project (?ind ?boss ?g)
>>>>>>>>>     (union
>>>>>>>>>       (bgp (triple ?ind <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?z))
>>>>>>>>>       (filter (exprlist (= ?boss "mathieu") true)
>>>>>>>>>         (bgp
>>>>>>>>>         )))))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Moreover, implementing a first version of the OpFilter as reported
>>>>>>>>> below:
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> public Op transform(OpFilter opFilter, Op subOp) {
>>>>>>>>>     System.out.println("[TransformRemoveOp::transform(OpFilter opFilter, Op subOp)] opFilter " + opFilter.toString());
>>>>>>>>>     System.out.println("[TransformRemoveOp::transform(OpFilter opFilter, Op subOp)] subOp Name " + subOp.getName());
>>>>>>>>>     System.out.println("");
>>>>>>>>>
>>>>>>>>>     if (isParent == false) {
>>>>>>>>>         return opFilter;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     //...get the variables of the triple pattern that we want to delete
>>>>>>>>>     Set<Var> tpVars = new HashSet();
>>>>>>>>>     Node subj = this.triple.getSubject();
>>>>>>>>>     if (subj.isVariable()) {
>>>>>>>>>         tpVars.add((Var) subj);
>>>>>>>>>     }
>>>>>>>>>     Node pred = this.triple.getPredicate();
>>>>>>>>>     if (pred.isVariable()) {
>>>>>>>>>         tpVars.add((Var) pred);
>>>>>>>>>     }
>>>>>>>>>     Node obj = this.triple.getObject();
>>>>>>>>>     if (obj.isVariable()) {
>>>>>>>>>         tpVars.add((Var) obj);
>>>>>>>>>     }
>>>>>>>>>     //...get the variables of the FILTER expression
>>>>>>>>>     Op opNew = opFilter.copy(subOp);
>>>>>>>>>     Set<Var> expVars = ((OpFilter) opNew).getExprs().getVarsMentioned();
>>>>>>>>>
>>>>>>>>>     //...check whether the FILTER expression contains any of the triple pattern variable
>>>>>>>>>     boolean isContained = false;
>>>>>>>>>     for (Var var : expVars) {
>>>>>>>>>         //..if it does then we have to delete the entire FILTER expression
>>>>>>>>>         if (tpVars.contains(var)) {
>>>>>>>>>             isContained = true;
>>>>>>>>>             break;
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     //... if the filter contains any variable of the triple that has been removed, then....
>>>>>>>>>     if (isContained) {
>>>>>>>>>         Op newOP;
>>>>>>>>>         Expr e;
>>>>>>>>>         if (subOp instanceof OpBGP) {
>>>>>>>>>             if (((OpBGP) subOp).getPattern().getList().isEmpty()) {
>>>>>>>>>                 e = new NodeValueBoolean(true);
>>>>>>>>>                 newOP = OpFilter.filter(e, opFilter);//filter(e, );
>>>>>>>>>                 return newOP;
>>>>>>>>>             } else {
>>>>>>>>>                 e = new NodeValueBoolean(false);
>>>>>>>>>                 newOP = OpFilter.filter(e, opFilter);//filter(e, );
>>>>>>>>>                 return newOP;
>>>>>>>>>             }
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     return opFilter;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It produces the following SPARQL query:
>>>>>>>>>
>>>>>>>>> SELECT DISTINCT ?ind ?boss ?g
>>>>>>>>> WHERE
>>>>>>>>>   { { ?ind a ?z }
>>>>>>>>>     UNION
>>>>>>>>>     { # Empty BGP
>>>>>>>>>
>>>>>>>>>       FILTER ( ?boss = "mathieu" )
>>>>>>>>>       FILTER ( true )
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> It seems quite challenging: even transforming the expression
>>>>>>>>> ?boss = "mathieu" into "true" (as reported above) does not work as it should.
>>>>>>>>>
>>>>>>>>> If anyone who is more experienced with this kind of task could help
>>>>>>>>> me, I would be very grateful.
>>>>>>>>> I would like to see something concrete, in the sense of lines of code
>>>>>>>>> like the ones I am using.
>>>>>>>>>
>>>>>>>>> Many Thanks,
>>>>>>>>> Best Regards,
>>>>>>>>> Carlo
>>>>>>>>> On 30 Jan 2016, at 17:53, Andy Seaborne
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> element has active parts
>>>>>>>>>
>>>>>>>>> -- The Open University is incorporated by Royal Charter (RC 000391),
>>>>>>>>> an exempt charity in England & Wales and a charity registered in
>>>>>>>>> Scotland (SC 038302). The Open University is authorised and regulated
>>>>>>>>> by the Financial Conduct Authority.
>>>>>>>>
>>>>>>> --
>>>>>>> Lorenz Bühmann
>>>>>>> AKSW group, University of Leipzig
>>>>>>> Group: http://aksw.org - semantic web research center
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> Lorenz Bühmann
>>>>> AKSW group, University of Leipzig
>>>>> Group: http://aksw.org - semantic web research center
>>>>>
>>>>
>>>
>>
>
