On Thu, Sep 6, 2012 at 3:21 PM, Rob Stewart <[email protected]> wrote:
> Hi,
>
> Firstly, I'm having trouble finding any *full* examples of SPARQL 1.1
> queries that FILTER on "NOT IN". I also cannot find any documentation
> on the ARQ engine support for "NOT IN", or indeed the fuseki support
> for "NOT IN". Could someone point me to various canonical examples of
> such "NOT IN" queries that fuseki supports?
>
> I've come up with my own for now. Would people mind commenting on
> whether they believe that fuseki would support the query? It doesn't
> seem to be negating the commonly occurring objects. I'm using Fuseki
> 0.2.4 and the tdbloader from "apache-jena-2.7.4-SNAPSHOT". The
> intention is to find two distinct subjects that share the same objects
> for a given predicate, negating the most common objects. I deem
> "common" to be more than 100 occurrences in the TDB store.
>
> -----
>
> SELECT ?subj1 subj2
> WHERE
> {
>
>   ?subj1 example:pred ?obj1 .
>   ?subj2 example:pred ?obj1 .
>   FILTER (?subj1 != ?subj2)
>
>   {
>     SELECT ?veryPopularObj
>     WHERE
>     {
>       {
>         SELECT ?veryPopularObj (COUNT(?veryPopularObj) as ?objOccurrences)
>         WHERE
>         {
>           ?s example:pred ?veryPopularObj .
>         }
>         GROUP BY ?veryPopularObj
>       }
>       FILTER (?objOccurrences > 100)
>     }
>   }
>
>   FILTER ( ?obj1 NOT IN (?veryPopularObj) )
>
> }

Rob,

IN and NOT IN evaluate expressions. In your query, you are performing a
cross product between the bindings (?subj1, ?subj2, ?obj1) and the
bindings (?veryPopularObj). This occurs because the two patterns share no
variables. Your NOT IN filter will then pass for most rows.

Instead, you should use SPARQL's negation feature [1]. Here is your query
rewritten to use MINUS. Note that the subquery binds ?obj1 rather than
?veryPopularObj: MINUS only removes a solution when it shares at least one
variable with the pattern on its right-hand side.

PREFIX example: <http://example.org/>

SELECT ?subj1 ?subj2
WHERE
{
  ?subj1 example:pred ?obj1 .
  ?subj2 example:pred ?obj1 .
  FILTER (?subj1 != ?subj2)

  MINUS
  {
    {
      SELECT ?obj1 (COUNT(?obj1) AS ?objOccurrences)
      WHERE
      {
        ?s example:pred ?obj1 .
      }
      GROUP BY ?obj1
    }
    FILTER (?objOccurrences > 100)
  }
}

-Stephen

[1] http://www.w3.org/TR/sparql11-query/#negation
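
To see why the cross product defeats the NOT IN filter, it may help to model the solution rows outside of SPARQL. The following is only a rough Python sketch, not how ARQ evaluates queries. The sample data, the threshold of 2 (standing in for 100), and all function names are made up for illustration. It pairs every (?subj1, ?subj2, ?obj1) row with every popular object, applies the inequality test that a single-element NOT IN amounts to, and compares the result with a removal keyed on the shared ?obj1 value.

```python
from collections import Counter
from itertools import product

# Hypothetical (subject, object) pairs for example:pred.
triples = [
    ("s1", "common"), ("s2", "common"), ("s3", "common"),
    ("s1", "rare"), ("s2", "rare"),
    ("s3", "other"), ("s4", "other"), ("s5", "other"),
]
THRESHOLD = 2  # assumed stand-in for the 100 used in the thread


def subject_pairs(data):
    """Rows for: ?subj1 p ?obj1 . ?subj2 p ?obj1 . FILTER(?subj1 != ?subj2)"""
    return [
        {"subj1": s1, "subj2": s2, "obj1": o1}
        for (s1, o1), (s2, o2) in product(data, repeat=2)
        if o1 == o2 and s1 != s2
    ]


def popular_objects(data, threshold):
    """Objects that occur more than `threshold` times."""
    counts = Counter(obj for _, obj in data)
    return [obj for obj, n in counts.items() if n > threshold]


def not_in_after_cross_product(rows, popular):
    """No shared variables: each row is paired with each popular object,
    then FILTER(?obj1 NOT IN (?veryPopularObj)) is applied per pairing."""
    joined = [dict(row, veryPopularObj=p) for row, p in product(rows, popular)]
    return [row for row in joined if row["obj1"] != row["veryPopularObj"]]


def minus_on_shared_variable(rows, popular):
    """Drop every row whose ?obj1 matches some popular object."""
    popular_set = set(popular)
    return [row for row in rows if row["obj1"] not in popular_set]


rows = subject_pairs(triples)
popular = popular_objects(triples, THRESHOLD)

# Objects that survive each approach.
kept_by_not_in = {row["obj1"] for row in not_in_after_cross_product(rows, popular)}
kept_by_minus = {row["obj1"] for row in minus_on_shared_variable(rows, popular)}
print(sorted(kept_by_not_in))
print(sorted(kept_by_minus))
```

In this toy data set there are two popular objects, so a row whose ?obj1 is one of them still survives the filter when it is paired with the other one. Only the removal keyed on the shared variable leaves just the uncommon object.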
