Oops, typo in the query I gave you. The MINUS pattern needs to share a variable (?obj1) with the outer pattern! Corrected query:

PREFIX example: <http://example.org/>

SELECT ?subj1 ?subj2
WHERE
{
  ?subj1 example:pred ?obj1 .
  ?subj2 example:pred ?obj1 .
  FILTER (?subj1 != ?subj2)

  MINUS
  {
    {
      SELECT ?obj1 (COUNT(?obj1) as ?objOccurrences)
      WHERE
      {
        ?s example:pred ?obj1 .
      }
      GROUP BY ?obj1
    }
    FILTER (?objOccurrences > 100)
  }
}

On Thu, Sep 6, 2012 at 5:58 PM, Stephen Allen <[email protected]> wrote:
> On Thu, Sep 6, 2012 at 3:21 PM, Rob Stewart <[email protected]> wrote:
>> Hi,
>>
>> Firstly, I'm having trouble finding any *full* examples of SPARQL 1.1
>> queries that FILTER on "NOT IN". I also cannot find any documentation
>> on the ARQ engine support for "NOT IN", or indeed the fuseki support
>> for "NOT IN". Could someone point me to various canonical examples of
>> such "NOT IN" queries that fuseki supports?
>>
>> I've come up with my own for now. Would people mind commenting on
>> whether they believe that fuseki would support the query? It doesn't
>> seem to be negating the commonly occurring objects. I'm using Fuseki
>> 0.2.4 and the tdbloader from "apache-jena-2.7.4-SNAPSHOT". The
>> intention is to find two distinct subjects that share the same objects
>> for a given predicate, negating the most common objects. I deem
>> "common" to be more than 100 occurrences in the TDB store.
>>
>> -----
>>
>> SELECT ?subj1 subj2
>> WHERE
>> {
>>
>>   ?subj1 example:pred ?obj1 .
>>   ?subj2 example:pred ?obj1 .
>>   FILTER (?subj1 != ?subj2)
>>
>>   {
>>     SELECT ?veryPopularObj
>>     WHERE
>>     {
>>       {
>>         SELECT ?veryPopularObj (COUNT(?veryPopularObj) as ?objOccurrences)
>>         WHERE
>>         {
>>           ?s example:pred ?veryPopularObj .
>>         }
>>         GROUP BY ?veryPopularObj
>>       }
>>       FILTER (?objOccurrences > 100)
>>     }
>>   }
>>
>>   FILTER ( ?obj1 NOT IN (?veryPopularObj) )
>>
>> }
>
>
> Rob,
>
> IN and NOT IN evaluate expressions. In your query, you are performing
> a cross product between the binding (?subj1, ?subj2, ?obj1) and the
> binding (?veryPopularObj). This occurs because there are no shared
> variables. Your NOT IN filter will then pass for most rows.
>
> Instead, you should use SPARQL's negation feature [1]. Here is your
> query rewritten to use MINUS:
>
> PREFIX example: <http://example.org/>
>
> SELECT ?subj1 ?subj2
> WHERE
> {
>   ?subj1 example:pred ?obj1 .
>   ?subj2 example:pred ?obj1 .
>   FILTER (?subj1 != ?subj2)
>
>   MINUS
>   {
>     {
>       SELECT ?veryPopularObj (COUNT(?veryPopularObj) as ?objOccurrences)
>       WHERE
>       {
>         ?s example:pred ?veryPopularObj .
>       }
>       GROUP BY ?veryPopularObj
>     }
>     FILTER (?objOccurrences > 100)
>   }
> }
>
> -Stephen
>
> [1] http://www.w3.org/TR/sparql11-query/#negation
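
A rough sketch of why the shared variable matters. Under the SPARQL 1.1 definition, MINUS drops a left-hand solution only when some right-hand solution is compatible with it and has at least one variable in common with it. The toy Python model below is not ARQ code: the helper names `compatible` and `minus` and the sample rows are made up for illustration, and solutions are modelled as plain dicts.

```python
# Toy model of SPARQL 1.1 MINUS over solution mappings (dicts: variable -> value).
# Illustrative only; `compatible` and `minus` are hypothetical helpers, not Jena/ARQ API.

def compatible(mu1, mu2):
    """Two solutions are compatible if they agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())

def minus(left, right):
    """Keep a left solution unless some right solution shares at least one
    variable with it and is compatible with it."""
    return [
        mu for mu in left
        if not any((mu.keys() & other.keys()) and compatible(mu, other)
                   for other in right)
    ]

# Made-up results of the outer pattern: subject pairs sharing an object.
outer = [
    {"subj1": "a", "subj2": "b", "obj1": "common"},
    {"subj1": "a", "subj2": "c", "obj1": "rare"},
]

# Right-hand side that binds ?veryPopularObj: no variable in common with `outer`.
disjoint = [{"veryPopularObj": "common", "objOccurrences": 150}]

# Right-hand side that binds ?obj1: shares ?obj1 with `outer`.
shared = [{"obj1": "common", "objOccurrences": 150}]

kept_disjoint = minus(outer, disjoint)  # nothing can be removed
kept_shared = minus(outer, shared)      # the row with obj1 == "common" is removed
```

With the disjoint variable names both rows survive, which is the behaviour the ?veryPopularObj version of the query would show; with ?obj1 on both sides only the "rare" row remains.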
