I'm wondering how to best issue SPARQL queries when the data is
structured as follows.

I'm seeing some RDF data sets where some of the properties name other
RDF models that contain other data.  For example, the OSLC property
oslc_cm:cmServiceProviders has as its object an rdf:resource that is
interpreted as a URL to fetch another model.  The PubChemRDF property
http://rdf.ncbi.nlm.nih.gov/pubchem/descriptor/CID2244_Canonical_SMILES
has as its object an rdf:resource that is interpreted as a URL to fetch
another RDF model that has the molecular structure of the substance
(which has a text string formula rather than a true RDF graph, but you
get the idea).

My understanding is that a FROM clause is used to list multiple models
that are collectively subjected to a single SPARQL query -- correct? 
But what if I don't know them all in advance?  All I can think of is to
do a query to get the list, then have code generate a new query, but
there may be a whole lot of those.  As an example, PubChem has millions
of substances, each of which would have to be fetched in order to get
the list of URLs for all the molecule structure RDF models.  I am
concerned that this would perform worse than issuing a single query
against a single dataset containing all the structures, for example to
find a handful of molecules with some rare substructure.
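
To make the two-step idea concrete, here is a rough sketch of the second
step as I imagine it: take the URLs returned by the first query and
generate follow-up queries with one FROM clause each, in batches so that
no single query gets too long.  The example.org URLs, the batch size and
the helper names (batched, build_query) are made up for illustration;
this only builds the query text and assumes the SPARQL engine is willing
to load whatever the FROM IRIs point at, which I believe depends on the
implementation.

```python
from typing import Iterable, Iterator, List

# Characters that may not appear inside a SPARQL IRI reference <...>.
_FORBIDDEN = set('<>"{}|^`\\ ')


def batched(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` URLs."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_query(graph_urls: List[str], where_body: str) -> str:
    """Build a SELECT query with one FROM clause per graph URL."""
    for url in graph_urls:
        if any(ch in _FORBIDDEN for ch in url):
            raise ValueError(f"not usable as an IRI: {url!r}")
    from_clauses = "\n".join(f"FROM <{url}>" for url in graph_urls)
    return f"SELECT *\n{from_clauses}\nWHERE {{\n{where_body}\n}}"


# Hypothetical URLs standing in for the results of the first query.
urls = [f"http://example.org/model/{i}" for i in range(5)]
queries = [build_query(b, "  ?s ?p ?o .") for b in batched(urls, 2)]
print(queries[0])
```

With millions of URLs this still means a very large number of generated
queries and fetches, which is exactly the cost I am worried about.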

Is there a cleaner way to handle this sort of thing?  Any thoughts or
suggestions would be welcome.

