Sigh.  One more time:

PREFIX example: <http://example.org/>

SELECT ?subj1 ?subj2
WHERE
{
  ?subj1 example:pred ?obj1 .
  ?subj2 example:pred ?obj1 .
  FILTER (?subj1 != ?subj2)

  MINUS
  {
    {
      SELECT ?obj1 (COUNT(?obj1) as ?objOccurrences)
      WHERE
      {
        ?s example:pred ?obj1 .
      }
      GROUP BY ?obj1
    }
    FILTER (?objOccurrences > 100)
  }
}
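
If the nested subquery plus outer FILTER feels heavy, the same exclusion
can probably be written with HAVING, which filters the groups directly.
This is an untested sketch, assuming the same example:pred data and the
same threshold of 100. COUNT(?s) counts the triples per object, which
should give the same number as COUNT(?obj1) here:

```sparql
PREFIX example: <http://example.org/>

SELECT ?subj1 ?subj2
WHERE
{
  ?subj1 example:pred ?obj1 .
  ?subj2 example:pred ?obj1 .
  FILTER (?subj1 != ?subj2)

  # Remove every solution whose ?obj1 is a "very popular" object.
  # ?obj1 is projected by the subquery, so MINUS has a shared variable.
  MINUS
  {
    SELECT ?obj1
    WHERE
    {
      ?s example:pred ?obj1 .
    }
    GROUP BY ?obj1
    HAVING (COUNT(?s) > 100)
  }
}
```

Either way, the key point is that the subquery must project ?obj1 so
that MINUS has something to match against the outer pattern.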


On Thu, Sep 6, 2012 at 6:03 PM, Stephen Allen wrote:
> Oops, typo in the query I gave you.  You need to share the variable!
> Corrected query:
>
> PREFIX example: <http://example.org/>
>
> SELECT ?subj1 ?subj2
> WHERE
> {
>   ?subj1 example:pred ?obj1 .
>   ?subj2 example:pred ?obj1 .
>   FILTER (?subj1 != ?subj2)
>
>   MINUS
>   {
>     {
>       SELECT ?obj1 (COUNT(?obj1) as ?objOccurrences)
>       WHERE
>       {
>         ?s example:pred ?obj1 .
>       }
>       GROUP BY ?obj1
>     }
>     FILTER (?objOccurrences > 100)
>   }
> }
>
>
>
> On Thu, Sep 6, 2012 at 5:58 PM, Stephen Allen wrote:
>> On Thu, Sep 6, 2012 at 3:21 PM, Rob Stewart wrote:
>>> Hi,
>>>
>>> Firstly, I'm having trouble finding any *full* examples of SPARQL 1.1
>>> queries that FILTER on "NOT IN". I also cannot find any documentation
>>> on the ARQ engine support for "NOT IN", or indeed the fuseki support
>>> for "NOT IN". Could someone point me to various canonical examples of
>>> such "NOT IN" queries that fuseki supports?
>>>
>>> I've come up with my own for now. Would people mind commenting on
>>> whether they believe that fuseki would support the query? It doesn't
>>> seem to be negating the commonly occurring objects. I'm using Fuseki
>>> 0.2.4 and the tdbloader from "apache-jena-2.7.4-SNAPSHOT". The
>>> intention is to find two distinct subjects that share the same objects
>>> for a given predicate, negating the most common objects. I deem
>>> "common" to be more than 100 occurrences in the TDB store.
>>>
>>> -----
>>>
>>> SELECT ?subj1 ?subj2
>>> WHERE
>>>  {
>>>
>>>  ?subj1 example:pred ?obj1 .
>>>  ?subj2 example:pred ?obj1 .
>>>  FILTER (?subj1 != ?subj2)
>>>
>>>  {
>>>   SELECT ?veryPopularObj
>>>   WHERE
>>>    {
>>>      {
>>>      SELECT ?veryPopularObj (COUNT(?veryPopularObj) as ?objOccurrences)
>>>      WHERE
>>>       {
>>>       ?s example:pred ?veryPopularObj .
>>>       }
>>>       GROUP BY ?veryPopularObj
>>>      }
>>>    FILTER (?objOccurrences > 100)
>>>   }
>>>  }
>>>
>>>  FILTER ( ?obj1 NOT IN (?veryPopularObj) )
>>>
>>> }
>>
>>
>> Rob,
>>
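>> IN and NOT IN are expression operators: they test a value against a
>> list of expressions, one solution row at a time.  In your query, the
>> subquery shares no variables with the outer pattern, so the join is a
>> cross product between the bindings (?subj1, ?subj2, ?obj1) and the
>> bindings (?veryPopularObj).  The NOT IN filter then passes every row
>> in which ?obj1 differs from that row's ?veryPopularObj, so a popular
>> ?obj1 is still returned whenever there is more than one popular object.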
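>>
>> For comparison, NOT IN works as intended when the right-hand side is
>> an explicit list of expressions.  A minimal sketch, where example:objA
>> and example:objB are hypothetical IRIs standing in for objects you
>> already know you want to exclude:
>>
>> ```sparql
>> PREFIX example: <http://example.org/>
>>
>> SELECT ?subj1 ?subj2
>> WHERE
>> {
>>   ?subj1 example:pred ?obj1 .
>>   ?subj2 example:pred ?obj1 .
>>   FILTER (?subj1 != ?subj2)
>>
>>   # example:objA and example:objB are placeholders, not real data
>>   FILTER (?obj1 NOT IN (example:objA, example:objB))
>> }
>> ```
>>
>> That only helps when the excluded values can be written out by hand;
>> it does not exclude a set computed by another pattern.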
>>
>> Instead, you should use SPARQL's negation feature [1].  Here is your
>> query rewritten to use MINUS:
>>
>> PREFIX example: <http://example.org/>
>>
>> SELECT ?subj1 ?subj2
>> WHERE
>> {
>>   ?subj1 example:pred ?obj1 .
>>   ?subj2 example:pred ?obj1 .
>>   FILTER (?subj1 != ?subj2)
>>
>>   MINUS
>>   {
>>     {
>>       SELECT ?veryPopularObj (COUNT(?veryPopularObj) as ?objOccurrences)
>>       WHERE
>>       {
>>         ?s example:pred ?veryPopularObj .
>>       }
>>       GROUP BY ?veryPopularObj
>>     }
>>     FILTER (?objOccurrences > 100)
>>   }
>> }
>>
>> -Stephen
>>
>> [1] http://www.w3.org/TR/sparql11-query/#negation
