Re: Passing distinct subquery solutions to aggregate outer query

Andy Seaborne Fri, 25 Jan 2013 00:32:50 -0800


On 24/01/13 16:54, Paul Tyson wrote:


Ok, just to close the loop on this:

In the subquery were clauses like:

filter not exists {?s :p "str"}

Replacing these with a different negation form eliminated the problem.

I don't know if this is per spec or a quirk of the Jena query library.


Unclear without a complete, minimal example.

If you actually had

{ filter not exists {?s :p "str"} }

somewhere in the subquery then the extra {} makes a difference.

The use of {} means it is evaluated on its own, the other elements in the group are evaluated, and then the results are combined by join.

This differs from the form without {} around the filter, where the FILTER applies after all other group elements are evaluated.


When
{ filter not exists {?s :p "str"} }

is evaluated, ?s has not been bound. The outer {} change the scope of ?s: you could use any variable name with no difference. Any ??? :p "str" triple will be looked for. If there is one, then NOT EXISTS is false.


Then {filter...} yields no rows.
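
A rough sketch of the two forms side by side, assuming a hypothetical dataset in which subjects carry a :q property. The example.org prefix, the :q predicate, and the ?form label are made up for illustration; only :p "str" comes from the discussion above.

```sparql
PREFIX : <http://example.org/>

SELECT ?form ?s
WHERE {
  {
    # FILTER sits directly in the group: it applies after the rest of the
    # group is evaluated, so ?s is bound when NOT EXISTS is tested.
    BIND("filter in group" AS ?form)
    ?s :q ?o .
    FILTER NOT EXISTS { ?s :p "str" }
  }
  UNION
  {
    # FILTER sits in its own {}: that inner group is evaluated separately,
    # with ?s unbound, and its result is then joined with the rest.
    BIND("filter in own {}" AS ?form)
    ?s :q ?o .
    { FILTER NOT EXISTS { ?s :p "str" } }
  }
}
```

If the dataset contains any triple at all with :p "str", the second branch should contribute no rows, whereas the first branch only drops the subjects that have that triple themselves.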

(and of course there may be quirks in Jena)

        Andy


Regards,
--Paul


On Jan 24, 2013, at 9:21, Paul Tyson <[email protected]> wrote:





On Jan 24, 2013, at 8:45, Andy Seaborne <[email protected]> wrote:

Joining the evaluation of {} with X yields X, not a single solution of null 
bindings.
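
For instance, a small self-contained sketch with an explicit empty group next to the subquery. The inline VALUES rows are invented and stand in for the real union query; the subquery projects ?var2, as intended in the original skeleton.

```sparql
SELECT ?var1 ?var2 (SUM(?var3) AS ?var3_total)
WHERE {
  {}   # empty group: one solution with no bindings
  { SELECT DISTINCT ?var1 ?var2 ?var3
    WHERE {
      # made-up data with a deliberate duplicate row
      VALUES (?var1 ?var2 ?var3) {
        ("a" "x" 1)
        ("a" "x" 1)
        ("a" "x" 2)
        ("b" "y" 5)
      }
    }
  }
}
GROUP BY ?var1 ?var2
```

Under the SPARQL 1.1 algebra this should give two groups, ("a", "x") with a total of 3 and ("b", "y") with a total of 5, the same as without the {}.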


Yes, I just learned that reading further in the spec. Now it is more of a 
puzzle. I will try to work up a simple example to replicate the problem.


Is it a typo that:

[[
select distinct ?var1 ?var ?var3
]]


Yes, typo. The select variables are identical in inner and outer queries.

Thanks,
--Paul


has ?var not ?var2

   Andy


On 24/01/13 14:31, Paul Tyson wrote:


Lee,


On Jan 24, 2013, at 0:36, Lee Feigenbaum <[email protected]> wrote:

Hi Paul,

Why would the outer query need any graph patterns other than the
subquery? You ought to be able to do exactly what you have below
without anything in the "what goes here" spot.

That's what I thought at first, but it returns a single solution with no
bindings. After studying the spec (SPARQL 1.1 section 12) I see this is
probably as specified, because it joins the solution projected from the
inner query to the solution from the outer query. The empty outer graph
pattern returns a single solution of null bindings (per spec 5.2.1).

Regards,
--Paul

Lee

On 1/23/2013 3:56 PM, Paul Tyson wrote:

Hi all,

I'm wondering if there is a simple solution to this problem.

I have a rather complicated query, consisting of several union
clauses, which by its nature will return duplicates. I need to get a
unique solution set so I can group them and sum a couple of fields.

Simply wrapping the union query in a nested SELECT DISTINCT doesn't
work, because the outer query has no graph pattern to match the
variables projected from the subquery.

I tried adding a series of BIND statements to simply rename the
subquery variables for use by the aggregate outer query, but that
didn't work (with jena, at least).

The source dataset is nearly 500M triples. I'm using Jena 2.7.3. The
subquery will return anywhere from a few dozen to a few hundred
solutions, and by itself runs very quickly.

Here's a skeleton view of the query. Is there something to fill "what
goes here" that will pass the subquery results up to the grouping
function?

select ?var1 ?var2 (sum(?var3) as ?var3_total)
where {
{ ??? what goes here ??? }
{select distinct ?var1 ?var ?var3
where { ... complicated union query ... }}
}
group by ?var1 ?var2

Or any other suggestions on how to tackle this problem?

Thanks,
--Paul
