On 10/07/12 06:05, Neubert Joachim wrote:
In the following query

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
   BIND (<http://d-nb.info/gnd/10244669> AS ?uri1)
   BIND (<http://d-nb.info/gnd/1024466-9> AS ?uri2)
   { {
       SELECT (?uri1 AS ?uri)
       WHERE {
         ?uri1 a gnd:CorporateBody .
       }
     } UNION {
       SELECT (?uri2 AS ?uri)
       WHERE {
         ?uri2 a gnd:CorporateBody .
       }
   } }
}

I'd expect that the ?uri1 and ?uri2 variables are bound in the
subqueries, and as a result to get zero, one or two values for ?uri.
However, I get every possible gnd:CorporateBody (more than a million).

It would be nice if somebody could point out why this happens, and how I
could work around it. (Duplicating the BIND part and moving it into the
subquery works, but since it involves a query to a remote service and
some function calls, I'd prefer not to).

Cheers, Joachim


Evaluation is bottom-up - subparts are evaluated then combined.

SELECT (?uri1 AS ?uri) exposes only ?uri; any mention of ?uri1 inside the SELECT is hidden. Strictly it's the same name, but it will never meet the ?uri1 from the outer BIND.

So the only thing coming out of SELECT (?uri1 AS ?uri) is a result row of one variable, ?uri. There is no ?uri1 outside the projection.
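
One way to see this: the inner ?uri1 can be renamed without changing the result. As a sketch (?x is just an arbitrary name chosen for illustration), the first subquery behaves the same as

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT (?x AS ?uri)
WHERE {
  # ?x is local to this subquery; nothing outside can constrain it
  ?x a gnd:CorporateBody .
}

which, evaluated on its own, matches every gnd:CorporateBody.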

You have the structure:

BIND ... ?uri1
BIND ... ?uri2
{
SELECT ... ?uri
   union
SELECT ... ?uri
}
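
If the values really have to come from BIND (for example because they are computed), one possible restructuring is to bind ?uri directly in each UNION branch and keep the triple pattern in the same group, so no projection sits between them. This is an untested sketch; the constant URIs stand in for whatever expressions you actually compute:

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
  # Each branch binds ?uri itself, so there is no subquery projection in the way.
  { BIND (<http://d-nb.info/gnd/10244669> AS ?uri) }
  UNION
  { BIND (<http://d-nb.info/gnd/1024466-9> AS ?uri) }
  # Same group, same variable: this pattern is joined with the bindings above.
  ?uri a gnd:CorporateBody .
}

Whether this ends up as two probes depends on the optimizer, so it's worth checking the plan.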


This query

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
  ?uri a gnd:CorporateBody .
  FILTER ( <http://d-nb.info/gnd/10244669> = ?uri ||
           <http://d-nb.info/gnd/1024466-9> = ?uri )
}


finds the ?uri that are one of the two URIs.

Or

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
  ?uri a gnd:CorporateBody .
  FILTER ( ?uri IN (<http://d-nb.info/gnd/10244669>,
                    <http://d-nb.info/gnd/1024466-9> ))
}

which gets to the same execution plan, and it's optimized as well:

(project (?uri)
  (disjunction
    (assign ((?uri <http://d-nb.info/gnd/10244669>))
      (bgp (triple <http://d-nb.info/gnd/10244669>
                   rdf:type gnd:CorporateBody)))
    (assign ((?uri <http://d-nb.info/gnd/1024466-9>))
      (bgp (triple <http://d-nb.info/gnd/1024466-9>
                   rdf:type gnd:CorporateBody)))))

i.e. it tries one case

{ <http://d-nb.info/gnd/10244669> rdf:type gnd:CorporateBody }

then tries the other

{ <http://d-nb.info/gnd/1024466-9> rdf:type gnd:CorporateBody }

which is two probes of the database, not filtering a million items.
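
If the engine supports the SPARQL 1.1 VALUES clause (an assumption; check your version), the same restriction can also be written as inline data. A sketch, for constant URIs only:

PREFIX gnd:     <http://d-nb.info/standards/elementset/gnd#>

SELECT ?uri
WHERE {
  # Inline data: ?uri takes each listed URI in turn.
  VALUES ?uri { <http://d-nb.info/gnd/10244669>
                <http://d-nb.info/gnd/1024466-9> }
  ?uri a gnd:CorporateBody .
}

The plan this produces may differ from the one above.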

        Andy
