Paul, could you try the query below? It mimics the effect of placing the ?var4 part of the filter with the first basic graph pattern, and will help determine whether or not this is a filter placement issue.

The difference is that the first basic graph pattern is inside a {} together with the relevant part of the filter expression.



PREFIX  :     <http://example/>

SELECT  *
WHERE
  { FILTER ( ( ?var3 = "str1" ) || ( ?var3 = "str2" ) )
    { ?var2  :p1  ?var4 ;
             :p2  ?var3
      FILTER ( ! ( ( ( ?var4 = "" ) ||
               ( ?var4 = "str3" ) ) ||
               regex(?var4, "pat1") ) )
    }
    {   { ?var1  :p3  ?var4 }
      UNION
        { ?var1  :p4  ?var4 }
    }
  }


        Andy


On 14/09/16 13:15, Paul Tyson wrote:
On Wed, 2016-09-14 at 10:57 +0100, Andy Seaborne wrote:
Hi Paul,

It's difficult to tell what's going on from your report. Plain strings
are not quite identical in RDF 1.0 and RDF 1.1 so I hope you have
reloaded the data for running Jena 3.x.

I admit I have not studied the subtleties around string literals with
and without datatype tags. None of my data loadfiles have tagged string
literals, nor do my queries. Are you saying they should?


On less data, does either case produce the wrong answers?


I'll produce a smaller dataset to test.

The regex is not being pushed inwards in the same way which may be an
issue - it "all depends" on the data.

A smaller query exhibiting a timing difference would be very helpful.
Are all parts of the FILTER necessary for the effect?

Yes, they eliminate spurious matches.
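
One way to narrow this down, sketched here as an untested assumption rather than a known reproduction: keep only the ?var3 disjunction in the FILTER, drop the ?var4 tests, and compare timings and qparse output between the two Jena versions. The prefix is a placeholder, as in the other queries in this thread. Because the ?var4 tests are gone, this will return the spurious matches mentioned above, so it is only useful for diagnosis.

```sparql
PREFIX : <http://example/>

SELECT *
WHERE {
  # Only the ?var3 disjunction is kept; the ?var4 tests are removed.
  FILTER ( ?var3 = "str1" || ?var3 = "str2" )
  ?var2 :p1 ?var4 ;
        :p2 ?var3 .
  { { ?var1 :p3 ?var4 } UNION { ?var1 :p4 ?var4 } }
}
```

If this reduced form is fast in both versions, the ?var4 part of the filter is worth looking at next.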


        Andy

Unrelated:

{
?var1 :p3 ?var4 .
} UNION {
?var1 :p4 ?var4 .
}

can be written

?var1 (:p3|:p4) ?var4
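
For reference, a sketch of how the whole pattern might look with the path form in place of the UNION. The prefix is a placeholder, and this has not been run against the data in question; it only shows where the alternative path fits in the query.

```sparql
PREFIX : <http://example/>

SELECT *
WHERE {
  ?var2 :p1 ?var4 ;
        :p2 ?var3 .
  # Alternative path: matches :p3 or :p4, like the UNION form.
  ?var1 (:p3|:p4) ?var4 .
  FILTER ( ( ?var3 = "str1" || ?var3 = "str2" )
           && !( ?var4 = "" || ?var4 = "str3" || regex(?var4, "pat1") ) )
}
```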


Yes, but I generate these queries from RIF source, and UNION is easier
for the general RIF statement "Or(x,y)". The surface syntax doesn't make
any difference in the algebra, does it?

Regards,
--Paul

On 14/09/16 02:01, Paul Tyson wrote:
I have some queries that worked fine in jena-2.13.0 but not in
jena-3.1.0, using the same data.

For a long time I've been running a couple dozen queries regularly over
a large (900M triples) TDB, using jena-2.13.0. When I recently upgraded
to jena-3.1.0, I found that 5 of these queries would not return (ran
forever). qparse revealed that the sparql algebra is quite different in
2.13.0 and 3.1.0 (or apparently any 3.n.n version).

Here is a sample query that worked in 2.13.0 but not in 3.1.0, along
with the algebra given by qparse --explain for 2.13.0 and 3.1.0:

prefix : <http://example.org>
CONSTRUCT {
?var1 <http://www.w3.org/2004/02/skos/core#exactMatch> ?var2 .
}
WHERE {
FILTER (((?var3 = "str1" || ?var3 = "str2") &&
         !(?var4 = "" || ?var4 = "str3" || regex(?var4,"pat1"))))
?var2 :p1 ?var4 ; :p2 ?var3 .
{{
?var1 :p3 ?var4 .
} UNION {
?var1 :p4 ?var4 .
}}
}

Jena-2.13.0 produces algebra:
(prefix ((: <http://example.org>))
  (sequence
    (filter (|| (= ?var3 "str1") (= ?var3 "str2"))
      (sequence
        (filter (! (|| (|| (= ?var4 "") (= ?var4 "str3")) (regex ?var4 "pat1")))
          (bgp (triple ?var2 :p1 ?var4)))
        (bgp (triple ?var2 :p2 ?var3))))
    (union
      (bgp (triple ?var1 :p3 ?var4))
      (bgp (triple ?var1 :p4 ?var4)))))

Jena-3.1.0 produces algebra:
(prefix ((: <http://example.org>))
  (filter (! (|| (|| (= ?var4 "") (= ?var4 "str3")) (regex ?var4 "pat1")))
    (disjunction
      (assign ((?var3 "str1"))
        (sequence
          (bgp
            (triple ?var2 :p1 ?var4)
            (triple ?var2 :p2 "str1")
          )
          (union
            (bgp (triple ?var1 :p3 ?var4))
            (bgp (triple ?var1 :p4 ?var4)))))
      (assign ((?var3 "str2"))
        (sequence
          (bgp
            (triple ?var2 :p1 ?var4)
            (triple ?var2 :p2 "str2")
          )
          (union
            (bgp (triple ?var1 :p3 ?var4))
            (bgp (triple ?var1 :p4 ?var4))))))))

Thanks for any insight or assistance into this problem.

Regards,
--Paul


