Re: [rdflib-dev] doing AND on leaves of a tree

Chimezie Ogbuji Thu, 09 Aug 2007 08:58:16 -0700

On 8/6/07, Niklas Lindström <[EMAIL PROTECTED]> wrote:
> Hello!
>
> I did some digging and discovered that the test will work (using
> latest trunk, 1236 right now) if you change DAWG_DATASET_COMPLIANCE in
> "rdflib/sparql/Algebra.py" to False. Not sure whether or not this
> signifies a much bigger problem; my example code below does it by
> "monkey patch" (and I will not change this in the file for now, as I
> feel it may be a deeper problem).


Some explanation of this switch (DAWG_DATASET_COMPLIANCE) might shed
some light.  If you follow the SPARQL specification verbatim, any
pattern outside a GRAPH clause is matched against the 'default' graph
only.  The 'default' graph is a graph without an identifier, or at
least one whose identifier cannot be matched.  This is at odds with
RDFLib, where *all* Graphs have an identifier which can be matched
(note the Graph.quads method).  In the absence of a default graph
specified explicitly as the dataset for the query (via FROM <..>), the
default graph is an empty graph, so any pattern without a GRAPH
directive will always match nothing! In order to comply with SPARQL,
the assumption had to be made (when the compliance flag is set to
True) that the default context for a ConjunctiveGraph is *the* default
graph (as defined by SPARQL).

Take a look at the graph tests for an example of what is the
'expected' verbatim behavior here - and how it might be problematic
for RDFLib without these caveats:

http://www.w3.org/2001/sw/DataAccess/tests/data-r2/graph/

This switch was a first attempt to allow a break from this verbatim
interpretation, which is problematic in practice when you have an RDF
dataset whose graphs all have well-defined identifiers and you want
SPARQL patterns without GRAPH operators to search within all the named
graphs (which is the expected behavior of ConjunctiveGraph.quads(.)).
The default is currently True.  Perhaps it should be False by default
instead?
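
To make the two behaviours concrete, here is a rough sketch in plain
Python.  It is only a toy model, not RDFLib code: the dataset is a dict
of graph identifier to triples, and the names DEFAULT, match and
dataset_compliance are made up for the illustration (the last one
merely stands in for what DAWG_DATASET_COMPLIANCE toggles).

```python
# Toy model only: a dataset maps a graph identifier to a set of
# (subject, predicate, object) triples. None of these names are RDFLib API.
DEFAULT = "default"  # hypothetical identifier for the default context


def match(dataset, pattern, dataset_compliance=True):
    """Match one triple pattern that is NOT wrapped in a GRAPH clause.

    None in the pattern acts as a wildcard.
    """
    if dataset_compliance:
        # Verbatim SPARQL reading: only the default graph is searched.
        graphs = [dataset.get(DEFAULT, set())]
    else:
        # ConjunctiveGraph-style reading: every graph in the dataset is searched.
        graphs = list(dataset.values())
    results = set()
    for triples in graphs:
        for triple in triples:
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                results.add(triple)
    return results


dataset = {
    DEFAULT: set(),  # nothing was loaded into the default graph
    "http://example.org/g1": {("ex:a", "ex:knows", "ex:b")},
    "http://example.org/g2": {("ex:b", "ex:knows", "ex:c")},
}
pattern = (None, "ex:knows", None)

strict = match(dataset, pattern, dataset_compliance=True)
relaxed = match(dataset, pattern, dataset_compliance=False)
print(len(strict), len(relaxed))
```

With the flag on, the pattern finds nothing because the default graph
is empty; with it off, both named graphs are searched.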

This is not the only area where RDFLib's SPARQL capabilities break
from the verbatim spec interpretation.  SPARQL does not allow direct
matching of BNodes in persistence; RDFLib does.  So SPARQL patterns
with BNode identifiers evaluated within RDFLib will match the BNodes
in persistence that have the same name.  It is a well-known frustration
not to be able to carry BNode identifiers from one query session to
another (especially against RDF graphs which make heavy use of
BNodes).
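
The difference can be sketched with another toy matcher in plain
Python.  Again, none of this is RDFLib API: the helper names and the
bnodes_as_variables switch are invented for the example, and it only
handles a single triple pattern (it does not track that the same BNode
label must bind consistently across several patterns).

```python
def is_bnode(term):
    """Toy convention: blank nodes are strings starting with '_:'."""
    return term.startswith("_:")


def match_term(pattern_term, data_term, bnodes_as_variables):
    if pattern_term.startswith("?"):
        return True  # an ordinary variable matches anything
    if is_bnode(pattern_term) and bnodes_as_variables:
        return True  # spec reading: a BNode in a pattern acts like a variable
    return pattern_term == data_term  # otherwise match by exact name


def match(triples, pattern, bnodes_as_variables):
    return sorted(
        t for t in triples
        if all(match_term(p, d, bnodes_as_variables) for p, d in zip(pattern, t))
    )


triples = [
    ("_:b1", "ex:name", "Alice"),
    ("_:b2", "ex:name", "Bob"),
]
pattern = ("_:b1", "ex:name", "?name")

spec_reading = match(triples, pattern, bnodes_as_variables=True)
by_label = match(triples, pattern, bnodes_as_variables=False)
print(spec_reading)
print(by_label)
```

Under the spec reading the label _:b1 in the query says nothing about
which stored BNode is meant, so both triples match; matching by label
returns only the triple whose subject is stored as _:b1.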


> This is the "test" code I used. (Please notice my use of FILTER - I'm
> not sure whether that is needed in SPARQL or if the spec says that
> SPARQL does no such "no uniqueness assumption" for bindings by
> default.)

It doesn't; the filter is necessary.  At least, the sparql-p
evaluation algorithm will happily bind two variables to the same term,
regardless of the names of the variables.
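
A small plain-Python sketch of why the filter matters.  The data and
the two-variable pattern are made up for the illustration, and this is
not the sparql-p algorithm itself, just the set of bindings such a
pattern admits.

```python
from itertools import product

# Hypothetical data: one root node with two leaves.
triples = {
    ("ex:root", "ex:child", "ex:leaf1"),
    ("ex:root", "ex:child", "ex:leaf2"),
}

# Pattern:  ex:root ex:child ?a .  ex:root ex:child ?b .
children = sorted(o for s, p, o in triples if s == "ex:root" and p == "ex:child")

# Without a filter, ?a and ?b are bound independently, so they may coincide.
unfiltered = list(product(children, repeat=2))

# Equivalent of adding FILTER(?a != ?b) to the query.
filtered = [(a, b) for a, b in unfiltered if a != b]

print(len(unfiltered), len(filtered))
```

The unfiltered result includes the pairs where both variables point at
the same leaf; only the filtered one restricts ?a and ?b to distinct
leaves.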

_______________________________________________
Dev mailing list
Dev@rdflib.net
http://rdflib.net/mailman/listinfo/dev
