On 02/02/16 19:29, Paul Houle wrote:
Carlo, Andy,
I like the Iterator<> interfaces in the Jena framework for getting data
out, but I make a habit of always putting results in a List or Queue or
something before putting them back into the same Jena model because I get
less BS per mile that way in terms of Exceptions and other exceptional
events.
Does Jena have an official policy on being reentrant in that way?
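A minimal sketch of the habit described above, using only java.util collections as a stand-in for a Jena model (the class and method names are illustrative assumptions, not Jena API). The iterator is drained into a list first, and the source is modified only afterwards, so the loop never mutates what it is iterating over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class SnapshotThenWrite {
    // Copy everything the iterator yields into a fresh list.
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Stand-in for "read from the model, then write back into the same model".
    static List<String> addUpperCased(List<String> store) {
        List<String> snapshot = drain(store.iterator());
        for (String s : snapshot) {
            store.add(s.toUpperCase()); // safe: the loop walks the snapshot, not the store
        }
        return store;
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(addUpperCased(store));
    }
}
```

Adding to an ArrayList while walking its own iterator would typically fail with a ConcurrentModificationException; the snapshot avoids that at the cost of one extra copy.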
Carlo's issues have nothing to do with iterator policy.
Carlo - use arg1, otherwise you will not see the changes made so far.
In
public Element transform(ElementGroup arg0, List<Element> arg1)
arg0 is the element from the AST before modification, and
arg1 holds the new elements, already modified by the lower levels of
the bottom-up rewrite.
So if you rewrite an ElementFilter by making a new one, it will appear in
arg1, not in arg0.
Do not modify the Element* arguments in place.
See ElementTransformCopyBase.
The default implementation is:
    @Override
    public Element transform(ElementGroup el, List<Element> elts) {
        if ( el.getElements() == elts )
            return el ;
        ElementGroup el2 = new ElementGroup() ;
        el2.getElements().addAll(elts) ;
        return el2 ;
    }
i.e. if there is any change, detected by elts not being the identical list
object, copy the structure; otherwise return the original. This saves
object churn.
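To illustrate the copy-on-change idea on its own, here is a small sketch with hypothetical stand-in classes (Node, Leaf and Group are not Jena classes, and the "changed" flag only plays the role of the list identity check in the default implementation). Children are rewritten first; the parent is rebuilt from the rewritten children and the original is never modified in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a syntax tree; these are NOT Jena classes.
final class CopyOnChangeSketch {
    interface Node {}

    static final class Leaf implements Node {
        final String text;
        Leaf(String text) { this.text = text; }
    }

    static final class Group implements Node {
        final List<Node> children;
        Group(List<Node> children) { this.children = children; }
    }

    // Leaf-level rewrite: returns the same object when nothing changes.
    static Node rewriteLeaf(Leaf leaf, String from, String to) {
        return leaf.text.equals(from) ? new Leaf(to) : leaf;
    }

    // Bottom-up rewrite: children first, then decide whether the group must be copied.
    static Node rewrite(Node node, String from, String to) {
        if (node instanceof Leaf) {
            return rewriteLeaf((Leaf) node, from, to);
        }
        Group original = (Group) node;
        List<Node> rewritten = new ArrayList<>();
        boolean changed = false;
        for (Node child : original.children) {
            Node child2 = rewrite(child, from, to);
            changed |= (child2 != child);
            rewritten.add(child2);
        }
        // Plays the role of transform(original, rewritten): the result is built
        // from the rewritten list, never by mutating the original group.
        return changed ? new Group(rewritten) : original;
    }

    public static void main(String[] args) {
        List<Node> kids = new ArrayList<>();
        kids.add(new Leaf("a"));
        Group g = new Group(kids);
        System.out.println(rewrite(g, "a", "b") != g); // a child changed, so a copy comes back
        System.out.println(rewrite(g, "x", "y") == g); // nothing changed, the original is reused
    }
}
```

Reading from the original group here, instead of from the rewritten list, would silently discard every change made lower down, which is the arg0 versus arg1 difference.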
Andy
On Tue, Feb 2, 2016 at 2:13 PM, Carlo.Allocca <[email protected]>
wrote:
Dear Andy and All,
while extending and testing the code that I have written so far for
removing a triple from a given SPARQL query,
I realised that I get different outputs depending on how I start the
implementation of public Element transform(ElementGroup arg0,
List<Element> arg1).
In particular, if I start with (1) I obtain some results, and if I start with
(2) I obtain something different (see the details below).
I have also used ElementTransformCleanGroupsOfOne for the case where an
ElementGroup is empty:
ElementTransform transform = new ElementTransformCleanGroupsOfOne();
Element el2 = ElementTransformer.transform(eg, transform);
return el2;
but it made no difference to the results. I am sure I am doing something wrong.
My questions are: what is the main difference between the two
approaches? And when should I use ElementGroup arg0 and when List<Element>
arg1?
(1) public Element transform(ElementGroup arg0, List<Element> arg1) {
List<Element> elemList = arg0.getElements();
Iterator<Element> itr = elemList.iterator();
while (itr.hasNext()) {
}
…
…
}
(2) public Element transform(ElementGroup arg0, List<Element> arg1) {
Iterator<Element> itr = arg1.iterator();
while (itr.hasNext()) {
}
…
…
}
I know that it may be down to my limited knowledge of Jena.
Many thanks in advance for any clarification on the above.
Best Regards,
Carlo
=======
Below I report the two scenarios with their test cases and results, and
the code used (at the very bottom). In practice, you can see that:
==== TESTING:
Scenario A:
public Element transform(ElementGroup arg0, List<Element> arg1) {
List<Element> elemList = arg0.getElements();
Iterator<Element> itr = elemList.iterator();
while (itr.hasNext()) {
}
…
…
}
Test 1:
The triple to remove is (?x foaf:mbox ?mbox ) using the below query Q1:
=========== BEFORE Q1
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name ?mbox
WHERE
{ ?x foaf:name ?name
OPTIONAL
{ ?x foaf:mbox ?mbox }
}
============= AFTER Q1
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name ?mbox
WHERE
{ ?x foaf:name ?name }
Test 2:
The triple to remove is (?boss1 ex:isBossOf1 ?ind ) using the below
query Q2:
=========== BEFORE Q2
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z
OPTIONAL
{ ?boss1 ex:isBossOf1 ?ind }
}
UNION
{ { ?boss ex:isBossOf1 ?ind }
UNION
{ ?boss ex:isBossOf ?ind
FILTER ( ?boss = "mathieu" )
}
}
}
============= AFTER Q2: it does not remove the triple.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z }
UNION
{ { ?boss ex:isBossOf1 ?ind }
UNION
{ ?boss ex:isBossOf ?ind
FILTER ( ?boss = "mathieu" )
}
}
}
Test 3: The triple to remove is (?ind rdf:type ?z) using the below query
Q3:
=========== BEFORE Q3:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ ?ind rdf:type ?z
FILTER ( ?ind = "mathieu" )
}
============= AFTER Q3: There is still an empty BGP present.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ # Empty BGP
}
Scenario B:
public Element transform(ElementGroup arg0, List<Element> arg1) {
Iterator<Element> itr = arg1.iterator();
while (itr.hasNext()) {
}
…
…
}
Test 1:
The triple to remove is (?x foaf:mbox ?mbox ) using the below query Q1:
=========== BEFORE Q1
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name ?mbox
WHERE
{ ?x foaf:name ?name
OPTIONAL
{ ?x foaf:mbox ?mbox }
}
============= AFTER Q1: the OPTIONAL clause is still there (with an empty
ElementGroup).
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?name ?mbox
WHERE
{ ?x foaf:name ?name
OPTIONAL
{ # Empty BGP
}
}
Test 2:
The triple to remove is (?boss1 ex:isBossOf1 ?ind ) using the below
query Q2:
=========== BEFORE Q2
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z
OPTIONAL
{ ?boss1 ex:isBossOf1 ?ind }
}
UNION
{ { ?boss ex:isBossOf1 ?ind }
UNION
{ ?boss ex:isBossOf ?ind
FILTER ( ?boss = "mathieu" )
}
}
}
============= AFTER Q2: it does not remove the OPTIONAL and it leaves an
empty BGP.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z
OPTIONAL
{ # Empty BGP
}
}
UNION
{ { ?boss ex:isBossOf1 ?ind }
UNION
{ ?boss ex:isBossOf ?ind
FILTER ( ?boss = "mathieu" )
}
}
}
Test 3: The triple to remove is (?ind rdf:type ?z) using the below query
Q3:
=========== BEFORE Q3
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ ?ind rdf:type ?z
FILTER ( ?ind = "mathieu" )
}
============= AFTER Q3: It does not remove the FILTER, but just the triple.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://www.semanticweb.org/dataset1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z }
UNION
{ # Empty BGP
FILTER ( ?boss = "mathieu" )
}
}
=== FULL CODE used with public Element transform(ElementPathBlock eltPB)
@Override
public Element transform(ElementPathBlock eltPB) {
    if (eltPB.isEmpty()) {
        //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock IS EMPTY:: " + eltPB.toString());
        return eltPB;
    }
    System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + eltPB.toString());
    Iterator<TriplePath> l = eltPB.patternElts();
    while (l.hasNext()) {
        TriplePath tp = l.next();
        if (tp.asTriple().matches(this.triple)) {
            l.remove();
            System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + tp.toString() + " TRIPLE JUST REMOVED!!!");
            //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] TRIPLE JUST REMOVED!!! ");
            System.out.println("");
            return this.transform(eltPB);//eltPB;
        }
    }
    return eltPB;
}
=== FULL CODE used with public Element transform(ElementGroup arg0, List<Element> arg1)
@Override
public Element transform(ElementGroup arg0, List<Element> arg1) {
    List<Element> elemList = arg0.getElements();
    Iterator<Element> itr = elemList.iterator();
    //Iterator<Element> itr = arg1.iterator();
    while (itr.hasNext()) {
        Element elem = itr.next();
        if (elem instanceof ElementOptional) {
            boolean isElementOptionalEmpty = isElementOptionalEmpty((ElementOptional) elem);
            if (isElementOptionalEmpty) {
                itr.remove();
            }
        }
        else if (elem instanceof ElementGroup) {
            boolean isElementGroupEmpty = isElementGroupEmpty((ElementGroup) elem);
            if (isElementGroupEmpty) {
                itr.remove();
            }
        }
        else if (elem instanceof ElementFilter) {
            //... check if this filter is the one that we should remove
            //...get the variables of the triple pattern that we want to delete
            Set<Var> tpVars = new HashSet<>();
            Node subj = this.triple.getSubject();
            if (subj.isVariable()) {
                tpVars.add((Var) subj);
            }
            Node pred = this.triple.getPredicate();
            if (pred.isVariable()) {
                tpVars.add((Var) pred);
            }
            Node obj = this.triple.getObject();
            if (obj.isVariable()) {
                tpVars.add((Var) obj);
            }
            //...get the variables of the FILTER expression
            Set<Var> expVars = ((ElementFilter) elem).getExpr().getVarsMentioned();
            //...check whether the FILTER expression contains any of the triple pattern variables
            for (Var var : expVars) {
                //..if it does then we have to delete the entire FILTER expression
                if (tpVars.contains(var)) {
                    itr.remove();
                }
            }
        }
        else if (elem instanceof ElementUnion) {
            boolean isUnionBothSidesEmpty = isUnionBothSidesEmpty1((ElementUnion) elem);
            if (isUnionBothSidesEmpty) {
                itr.remove();
            }
        }
    }
    return arg0;
}
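One detail in the FILTER branch above: itr.remove() is reached once per matching variable, and java.util.Iterator.remove() may only be called once per call to next(), so a filter mentioning two of the triple's variables would be expected to throw IllegalStateException. A hedged sketch of a single-decision overlap test follows, with plain strings standing in for variables (the Filter class and the method name are hypothetical, not Jena API). It returns a new list instead of editing the input, in line with the advice not to modify arguments in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FilterOverlapSketch {
    // Hypothetical stand-in for a FILTER: just the variable names its expression mentions.
    static final class Filter {
        final Set<String> vars;
        Filter(String... vars) { this.vars = new HashSet<>(Arrays.asList(vars)); }
    }

    // Return a new list without the filters that mention any variable of the removed triple.
    static List<Filter> dropFiltersMentioning(List<Filter> filters, Set<String> tripleVars) {
        List<Filter> kept = new ArrayList<>();
        for (Filter f : filters) {
            // disjoint(...) is true when the two sets share no element,
            // so each filter is kept or dropped exactly once.
            if (Collections.disjoint(f.vars, tripleVars)) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Filter> filters = Arrays.asList(new Filter("ind", "z"), new Filter("boss"));
        Set<String> tripleVars = new HashSet<>(Arrays.asList("ind", "z"));
        System.out.println(dropFiltersMentioning(filters, tripleVars).size());
    }
}
```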
On 2 Feb 2016, at 10:54, Carlo.Allocca <[email protected]> wrote:
Dear Andy,
Thank you for your time. Much appreciated.
Some comments follow inline.
On 2 Feb 2016, at 09:36, Andy Seaborne <[email protected]> wrote:
when removing the triple (?boss ex:isBossOf ?ind .), I get
SELECT DISTINCT ?ind ?boss ?g
WHERE
{ { ?ind rdf:type ?z }
UNION
{ { ?boss ex:isBossOf1 ?ind }
UNION
{ # Empty BGP
}
}
}
which is OK.
I just need to find out how to remove an ElementGroup whose only element
is the empty one.
Of course, I need to do the same for the other cases, e.g. OPTIONAL,
subquery, etc.
Do note that evaluating {} (empty syntax group) yields one row of zero
columns - it contributes to the overall results (it's the join identity).
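The "join identity" remark can be checked with a toy model of solution sequences. This is only a sketch under simplifying assumptions: rows are plain maps from variable name to value, and the class and method names are made up for illustration (none of this is Jena API).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JoinIdentitySketch {
    // Join two solution sequences: merge every compatible pair of rows.
    static List<Map<String, String>> join(List<Map<String, String>> left,
                                          List<Map<String, String>> right) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> l : left) {
            for (Map<String, String> r : right) {
                if (compatible(l, r)) {
                    Map<String, String> merged = new HashMap<>(l);
                    merged.putAll(r);
                    out.add(merged);
                }
            }
        }
        return out;
    }

    // Rows are compatible when every shared variable has the same value.
    static boolean compatible(Map<String, String> a, Map<String, String> b) {
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other != null && !other.equals(e.getValue())) {
                return false;
            }
        }
        return true;
    }

    // What evaluating {} yields in this model: one row with zero columns.
    static List<Map<String, String>> emptyGroup() {
        return Collections.singletonList(Collections.<String, String>emptyMap());
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("x", "1");
        List<Map<String, String>> rows = Collections.singletonList(row);
        System.out.println(join(rows, emptyGroup()).equals(rows));
    }
}
```

Joining with the one-row, zero-column result leaves the other side unchanged, whereas joining with a result of no rows at all yields nothing. That is why an emptied group is not the same thing as a pattern that matches nothing.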
I see. To avoid this I am going to apply an
ElementTransformCleanGroupsOfOne as you suggested.
Now you have to look at all the elements that have a group in
ElementUnion, ElementOptional, ElementMinus, …
Yes, I need to cover all the SPARQL language from the “public Element
transform(ElementGroup arg0, List<Element> arg1)” call.
At least this is my understanding so far.
That is what ElementTransformCleanGroupsOfOne does, except it looks for
"groups of one"
.. UNION { { stuff } }
and isn't too fussy about finding them all (it's an optimization, more a
tidying of the tree, not a change in the effect of a query which is what
removing triple patterns is).
And of course changes from the bottom could cause changes all
the way up to the top of the syntax tree.
Also: there may be original, legal empty groups in the tree.
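A sketch of how a removal at the bottom can ripple upward while original empty groups are left alone. The Node class is a hypothetical stand-in, not a Jena class, and whether dropping an emptied group is the right rewrite for a particular SPARQL construct (UNION, OPTIONAL, MINUS, ...) is an assumption of this sketch, not something the thread settles.

```java
import java.util.ArrayList;
import java.util.List;

final class PruneUpwardSketch {
    // Hypothetical tree node: a leaf carries a label, a group carries children.
    static final class Node {
        final String label;        // non-null for leaves
        final List<Node> children; // non-null for groups
        Node(String label) { this.label = label; this.children = null; }
        Node(List<Node> children) { this.label = null; this.children = children; }
        boolean isGroup() { return children != null; }
    }

    // Returns the rewritten node, or null when the node should disappear.
    static Node remove(Node node, String target) {
        if (!node.isGroup()) {
            return node.label.equals(target) ? null : node;
        }
        if (node.children.isEmpty()) {
            return node; // an original, legal empty group: keep it untouched
        }
        List<Node> kept = new ArrayList<>();
        for (Node child : node.children) {
            Node child2 = remove(child, target);
            if (child2 != null) {
                kept.add(child2);
            }
        }
        // A group emptied by the removal vanishes too, so the change
        // can propagate to its parent, and from there further up.
        return kept.isEmpty() ? null : new Node(kept);
    }

    public static void main(String[] args) {
        List<Node> kids = new ArrayList<>();
        kids.add(new Node("a"));
        Node root = new Node(kids);
        System.out.println(remove(root, "a") == null); // the whole tree was emptied
    }
}
```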
Thanks for the detailed clarifications. I will certainly take them into account.
Many Thanks,
Best Regards,
Carlo
Andy