Re: Question on VALUES clauses, "whitelists" and "blacklists"

Mark Feblowitz Fri, 11 Sep 2015 07:49:43 -0700

Good - 

I have 3 options now:


FILTER ( ?x NOT IN ( a, b, c, d))
FILTER NOT EXISTS { ?x ?q ?v . VALUES ?x { a b c d } }
MINUS { ?x ?q ?v . VALUES ?x { a b c d } }

I like the first two because they read like proper declarative formal 
expressions.

As I’m using a few UNIONed blocks and any one could contribute an objectionable 
?x, I’m leaning toward the first.
 
        Any ?x not in S

But if I end up distributing the FILTER to be applied within the UNIONed 
blocks, I’d much prefer the NOT EXISTS form:

        All ?x ?q ?v where ?x not in S
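
To make the distributed form concrete, here is a minimal sketch. The ex: prefix and the ex:knows / ex:worksWith predicates are made up for illustration, and ex:a through ex:d stand in for the real blacklist entries. Note that the list has to be repeated in every branch:

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?x
WHERE {
  {
    ?x ex:knows ?y .               # hypothetical first block
    # Drop this branch's solutions when ?x is on the blacklist
    FILTER NOT EXISTS { ?x ?q ?v . VALUES ?x { ex:a ex:b ex:c ex:d } }
  }
  UNION
  {
    ?x ex:worksWith ?z .           # hypothetical second block
    # Same blacklist, replicated for this branch
    FILTER NOT EXISTS { ?x ?q ?v . VALUES ?x { ex:a ex:b ex:c ex:d } }
  }
}
```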

Either way, my notion of consolidating potentially many replicated sets of 
values into a single, easily-maintained declaration S seems doomed, and likely 
also to have some relatively unpleasant side effects, based on how the queries 
are processed/optimized. 

Perhaps the right way of thinking about this is to not make multiple uses 
of a single VALUES set S but to instead think about the application scope for a 
single use (apply the test once, to a larger block of statements).
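
A rough sketch of that "apply once" idea, again with a made-up ex: prefix and made-up predicates: the UNION sits inside one group and a single FILTER at the end of that group tests every solution, whichever branch produced it. I have not measured how this is optimized, so treat it as a shape to test rather than a recommendation:

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace

SELECT ?x
WHERE {
  { ?x ex:knows ?y . }             # hypothetical first block
  UNION
  { ?x ex:worksWith ?z . }         # hypothetical second block
  # One blacklist declaration, applied to the solutions of the whole group
  FILTER NOT EXISTS { ?x ?q ?v . VALUES ?x { ex:a ex:b ex:c ex:d } }
}
```

Since a FILTER applies to the whole group it appears in, its position within the braces should not change the result.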

As I don’t know what optimizations are being applied, it’s not clear to me what 
the performance tradeoffs might be. Guess I’ll just have to see/test.

Thanks,

Mark




> On Sep 11, 2015, at 8:38 AM, Andy Seaborne wrote:
> 
> On 10/09/15 22:32, Mark Feblowitz wrote:
>> I’m trying to understand the semantics and implementation of the VALUES 
>> clause, specifically with respect to testing a “blacklist” of values.
>> 
>> I have used VALUES for whitelists, whereby I can match a bound variable to 
>> one of the values:
>> 
>> VALUES ?WL { a b c d }
>> FILTER ( ?V = ?WL)
>> 
>> This is nice, in that I can refer to ?WL in several graphs in a query that 
>> UNIONs various graphs
>> 
>> {  stmt1 .
>>    stmt2.
>>  FILTER (?V = ?WL)
>> }
>> UNION
>> { ….
>>  FILTER (?W = ?WL)
>> }
>> ….
>> 
>> What I’d like to do is to use a similar approach for blacklisting, where if 
>> the value is in the VALUES list  ?BL, the filter expression is false.
>> 
>> The obvious ( FILTER (?X != ?BL)) doesn’t work, as the semantics are wrong - 
>> that is, if there’s just one item that’s not equal to ?X, the expression is 
>> true.
>> 
>> Use of "NOT IN" doesn’t work, as there is just one expression (?BL) to compare 
>> against, and this reduces to the ?X != ?BL problem.
>> 
>> The only thing that’s worked for me is:
>> 
>>     FILTER ( ?X NOT IN ( e, f, g, h, i))
>> 
>> It’s not as good, since the list has to be replicated in each context it’s 
>> being used.
>> 
>> Is there something I’m missing ( usually is :-S ) or have I found the least 
>> bad option?
> 
> There are two other negation features in SPARQL
> 
> FILTER NOT EXISTS { pattern }
> 
> and MINUS { pattern }
> 
> In each case, pattern is a graph pattern.  It can involve VALUES.  The 
> VALUES must be inside the {}
> 
> 1/ FILTER NOT EXISTS
> 
> Keep only the solutions for which the graph pattern does not match.
> 
> Something like:
> # Pattern involving ?x
> ?x ?p ?o .
> FILTER NOT EXISTS { ?x ?q ?v . VALUES ?x { a b c d } }
> 
> 2/ MINUS
> 
> # Pattern involving ?x
> ?x ?p ?o .
> MINUS { ?x ?q ?v . VALUES ?x { a b c d } }
> 
> They are similar but not identical.  Details matter.  Personally, I find 
> FILTER NOT EXISTS easier to think about.
> 
>       Andy
> 
>> 
>> 
>> 
>
