Re: [rdflib-dev] doing AND on leaves of a tree

Chimezie Ogbuji Wed, 19 Sep 2007 14:32:01 -0700

On 9/19/07, whit <[EMAIL PROTECTED]> wrote:
>
>
> > This should only be used against a CG since it will cause it to load
> > the default graph ( "context" ).  If you are dispatching SPARQL, you
> > should either do it against a CG or use a GRAPH name / variable in the
> > expression (perhaps with a topLevel binding of the graphs identifier:
> > g.query(..query.., initBindings={Variable('graphName'): g.identifier})).


> I think saying graph.query(somequery) explicitly says "I want to query this
> graph", and no ids should be necessary

True.  In retrospect, I guess it would simply be an API shortcut for
"construct an RDF dataset consisting of an empty set of URI labeled
graphs and a default graph composed of the source graph and evaluate
the query".

That accommodates both SPARQL and the RDFLib API
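
As a minimal sketch of that shortcut (plain Python, not RDFLib code: the
Dataset class and dataset_for_graph helper are hypothetical names, and a
graph is modeled as a bare set of triples), the idea is roughly:

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for an RDF dataset (not an RDFLib class)."""
    default_graph: frozenset
    named_graphs: dict = field(default_factory=dict)  # IRI -> set of triples


def dataset_for_graph(graph):
    """Wrap one graph as a dataset: it becomes the default graph,
    and the set of named graphs stays empty."""
    return Dataset(default_graph=frozenset(graph))


# A graph is modeled here as a set of (subject, predicate, object) tuples.
g = {("ex:a", "ex:p", "ex:b")}
ds = dataset_for_graph(g)
```

Under this reading, graph.query(somequery) would evaluate the query
against ds without the caller ever supplying a graph identifier.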

> (note... this is what seems to be
> going on) so though I accept that passing in the initBinding would work, why
> are you making me do that?

It was only a first attempt to reconcile the expected behavior for
matching RDF datasets with the RDFLib API.  This is not a trivial
consideration.  See: "No way to specify an RDF dataset of all the
known named graphs", for instance:

http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2007Apr/0001.html

> And why should CG and Graph act differently here
> (and why should I have to care?)

See above.  I can add the above behavior.

> > SPARQL queries are dispatched against RDF datasets, which are
> > composed of 'multiple' RDF graphs.
> hmmm... I picked that up from the crazy conditional blocks in
> rdflib.sparql.Algebra.TopEvaluate (speaking of which, shouldn't we be
> following http://www.python.org/dev/peps/pep-0008/?)

Sure.  I've been a little more preoccupied with implementing full
SPARQL behavior than formatting printable code =).

> would the determination of dataset be something that should happen outside
> of the graph API?

In the absence of a dataset identified in the protocol request or in
the query itself, the dataset used to match against a SPARQL query is
application-specific.  So, other than the graph API there isn't any
other immediately obvious place to identify the dataset.

> the fact that a query can call in other resources beyond the graph makes me
> sort of feel like a more explicit api would be something like this:

This is built into the SPARQL language.  Any use of FROM <..>
*requires* that the 'default graph' is loaded with the RDF resolved
from the URI.

> from rdflib import sparql
>
> ... set up a list of graph and resource, etc
>
> sparql.query(myquery, data=list_of_graphs_or_resources,
> default_graphs='someid')

The data argument wouldn't make sense there since the dataset is
either determined already (if specified at the protocol), specified in
the SPARQL query itself, or it is application-specific (in which case
if it is evaluated against a CG, this is the RDF dataset, otherwise
it's a bit wonky and the compromise seems to be to assume an RDF
dataset with no named graphs and a single default graph consisting of
the object of the 'query' method).

The default_graph keyword would really only apply if the query is
called against a CG and you wanted to explicitly identify the BNode to
use as the identifier of the default graph (instead of using the
'first' one).  i.e:

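class ConjunctiveGraph(Graph):
  def query(self, query, default_graph=None, **kwargs):
    # A default argument cannot reference self, so None stands in
    # for self.default_context and is resolved at call time.
    if default_graph is None:
      default_graph = self.default_context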

> Otherwise, it feels like we are conflating graphs and datasets or at least
> hiding the actual relationship (and leading to nasty surprises).

There is no conflation happening with graphs and datasets.  The
confusion is between the RDFLib API, the 'formal' definition of an RDF
dataset, and the way for RDFLib to set up an 'application-specific'
RDF dataset (in the absence of any indication of one).

> If my graph is ided as "http://myns/dataname", I expect FROM to use that if
> my graph is ided "http://myns/dataname" as it, not attempt to download some
> web resource (what currently happens).

The FROM operation is supposed to behave that way by definition:

"The FROM and FROM NAMED keywords allow a query to specify an RDF
dataset by reference; they indicate that the dataset should include
graphs that are obtained from representations of the resources
identified by the given IRIs"

By "representations of the resources .." they mean an HTTP dereference
- at least with FROM <...>.  FROM NAMED is different.
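
One way to picture what each clause contributes to the dataset, going by
the SPARQL definition of the dataset (FROM graphs are merged into the
default graph, each FROM NAMED adds an IRI/graph pair), is the sketch
below.  It is plain Python, not RDFLib code: build_dataset and the fetch
callable are hypothetical, and how fetch obtains a graph (HTTP
dereference or otherwise) is deliberately left open.

```python
def build_dataset(from_iris, from_named_iris, fetch):
    """Return (default_graph, named_graphs) for a query's dataset clauses.

    fetch is a hypothetical callable mapping an IRI to a set of triples.
    A plain set union is used for the merge, which ignores the blank node
    renaming that a real RDF merge would need.
    """
    default_graph = set()
    for iri in from_iris:
        # Every FROM graph is merged into the single default graph.
        default_graph |= fetch(iri)
    # Every FROM NAMED graph stays separate, keyed by its IRI.
    named_graphs = {iri: fetch(iri) for iri in from_named_iris}
    return default_graph, named_graphs


# Illustrative data only: an in-memory dict stands in for retrieval.
store = {
    "http://example.org/a": {("ex:s1", "ex:p", "ex:o1")},
    "http://example.org/b": {("ex:s2", "ex:p", "ex:o2")},
}
default, named = build_dataset(
    ["http://example.org/a", "http://example.org/b"],
    ["http://example.org/b"],
    store.__getitem__,
)
```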

> Sparql seems to default sensibly the
> graph the query dispatched from, but not recognize the id of that default
> graph.

Here is the chain:

1. No FROM / FROM NAMED in SPARQL query
2. Does the protocol specify a URI which resolves to an RDF
representation to use as the default graph?  No...
3. Use the application-specific default graph.  Either:
3a. The default_context of the CG the query was dispatched against
3b. The Graph in CG with the given BNode as its identifier
3c. The *first* Graph in CG with a BNode identifier (or an empty graph
if there is none)
3d. Use the graph the query was dispatched against
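
The chain above can be sketched in plain Python.  Everything here is
illustrative rather than RDFLib API: the Graph and ConjunctiveGraph
classes are toy stand-ins, pick_default_graph is a hypothetical name, and
the order in which 3a-3c are tried is an assumption, since the list
presents them as alternatives rather than a fixed sequence.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Graph:
    """Toy stand-in for a graph: only its identifier matters here."""
    identifier: str
    is_bnode: bool = False


@dataclass
class ConjunctiveGraph:
    """Toy stand-in for a CG: a list of graphs plus an optional default."""
    graphs: list = field(default_factory=list)
    default_context: Optional[Graph] = None


def pick_default_graph(target, query_default=None, protocol_default=None,
                       default_graph_id=None):
    # Steps 1 and 2: a dataset named in the query or the protocol wins.
    if query_default is not None:
        return query_default
    if protocol_default is not None:
        return protocol_default
    # Step 3: application-specific fallback.
    if isinstance(target, ConjunctiveGraph):
        if default_graph_id is not None:               # 3b (assumed first)
            for g in target.graphs:
                if g.is_bnode and g.identifier == default_graph_id:
                    return g
        if target.default_context is not None:         # 3a
            return target.default_context
        for g in target.graphs:                        # 3c
            if g.is_bnode:
                return g
        return Graph("_:empty", is_bnode=True)         # 3c, no BNode graph
    return target                                      # 3d


a = Graph("http://myns/dataname")
b = Graph("_:b1", is_bnode=True)
cg = ConjunctiveGraph(graphs=[a, b])
```

With these toy objects, pick_default_graph(a) falls through to 3d and
returns a itself, while pick_default_graph(cg) lands on 3c and returns b.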
