On 9 March 2011 08:24, Joachim Baran <[email protected]> wrote:
> Hi!
>
> On 11-03-08 4:35 PM, "Peter Ansell" <[email protected]> wrote:
>>Did a SPARQL interface (and the configuration to map Biomarts to RDF
>>and URIs) make it into the 0.8 release candidates?
> The release of RC5 is about to happen very very soon and by then we will
> not have a finished SPARQL-interface ready.
>
> However, there is rudimentary SPARQL-support built into RC5. You will be
> able to run all queries over attributes with the SPARQL-queries that are
> generated by the SPARQL-button. Depending on how your filters are named,
> you can also restrict the result-set the same way as you can by using
> filters in the web-interface.
>
> There is a temporary playground at http://dcc-dev.res.oicr.on.ca:9005
> where you can experiment with SPARQL a bit. For example, I set "Limit to
> results:" to "...with Disease ID" and picked the attributes "Pathway ID",
> "Pathway Name", "Disease ID" and "Disease Name". The SPARQL-button
> generates the following query:
>
> BASE <datasets:kegg>
> PREFIX biomart: <http://www.biomart.org/ontology#>
> SELECT ?a0 ?a1 ?a2 ?a3 WHERE {
> ?f0 biomart:pathway__pathway_dis__disease_bool_104_has_value "only" .
> ?a0 a biomart:pathway__entry_101 .
> ?a1 a biomart:pathway__name_101 .
> ?a2 a biomart:pathway__pathway_dis__disease_104 .
> ?a3 a biomart:pathway__pathway_dis__disease_1_104
> }

This doesn't look like it follows the typical SPARQL methodology. The
main issue is that it seems to encode values and filters into predicate
names instead of relying on the FILTER() construct. Typically, RDF
documents keep values as objects of statements, where their meaning is
given by the predicate and not by a type. In the generated query, this
results in no direct links between ?f0 and the ?a* variables.

Also, BASE <datasets:kegg> should be GRAPH <datasets:kegg> inside of the
query, to allow for multiple datasets in the same query, separated by
different GRAPH names.

It may be better if this query could be expressed as:

PREFIX biomart: <http://www.biomart.org/ontology#>
PREFIX kegg_pathway: <http://kegg.org/kegg_pathway:>
SELECT ?a0 ?a1 ?a2 ?a3 WHERE {
  GRAPH <datasets:kegg> {
    ?f1 a biomart:pathway__pathway_dis__disease .
    ?f1 biomart:key kegg_pathway:101 .
    ?f1 biomart:pathway__pathway_disease_name ?disease .
    OPTIONAL { ?f1 biomart:pathway__entry ?a0 . }
    OPTIONAL { ?f1 biomart:pathway__name ?a1 . }
    OPTIONAL { ?f1 biomart:pathway__pathway_dis__disease ?a2 . }
    OPTIONAL { ?f1 biomart:pathway__pathway_dis__disease_1 ?a3 . }
  }
}

In this form there are links between ?f1 and the ?a* variables, and it
doesn't look like values or filters are encoded into predicates or
types. I may have misunderstood the point of the original query, but
there are usually graph-like links between the filtered variable (in
this case ?f1) and the attributes that need to be selected. All of the
properties that may not be required are enclosed in OPTIONAL, so a
result is still returned when some of them have no match.

> This query can be executed via a RESTful-interface (use
> http://meyerweb.com/eric/tools/dencoder/ to encode the plain-text to an
> HTTP suitable format):
>
> http://dcc-dev.res.oicr.on.ca:9005/rest/sparql/RDF/pathway_config_1/get?query=BASE%20<datasets[and so on]
>
> How does SPARQL-know what to query? Well, that is defined by the
> "rdf"-property of attributes in MartConfigurator.
> The "rdf"-property of "Pathway Name" in the KEGG Mart is set to:
>
> ^^http://www.biomart.org/ontology#pathway__name_101|
> http://www.biomart.org/ontology#pathway__name_101_has_value;pathway__name_101|
> http://www.biomart.org/ontology#pathway__name_101_of;pathway__table_key_101
>
> This property means:
> - the attribute has the RDF-type
> "http://www.biomart.org/ontology#pathway__name_101" (the ^^-entry)
> - the attribute is subject of the predicate
> "http://www.biomart.org/ontology#pathway__name_101_has_value" with
> attribute "pathway__name_101" that acts as object (that is actually the
> attribute itself)
> - the attribute is subject of the predicate
> "http://www.biomart.org/ontology#pathway__name_101_of" with the attribute
> "pathway__table_key_101" that stands in for the object (the main attribute
> of the mart; the primary key used to create the mart)
>
> Now, I am sure this is all confusing, especially since the
> "rdf"-property was generated automatically. If you create a mart yourself,
> you will be able to play around with the property a bit. In the example
> above, you see that only the RDF-type definition is used to get hold of
> the attributes in the query, but you are also able to use something like:

Typically the rdf:type statements are not used to identify attributes in
SPARQL or RDF, as predicates are used to identify attributes, although
the values can be objects if necessary.

> SELECT ?name WHERE { ?name biomart:pathway__name_101_of ?key } LIMIT 5

Typically this query would be executed in reverse in SPARQL. In RDF,
literals cannot be the subject of statements, and ?name is likely to be
a literal unless I have misunderstood this query.

SELECT DISTINCT ?key ?name WHERE {
  ?key biomart:pathway__name_101 ?name .
} LIMIT 5

> Returning meta-data of a mart (i.e. information about what you can
> actually query by SPARQL) is not implemented yet and proper SPARQL-support
> will be included in RC6, which will be released in April.

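On the RESTful interface mentioned above: the percent-encoding step can
also be done locally rather than through the online encoder. The
following is only a minimal Python sketch. It assumes the endpoint and
the "query" parameter are as they appear in the example URL above;
build_query_url is a hypothetical helper name, and I have not tested
whether the server accepts angle brackets in encoded form (the example
URL leaves "<" unencoded).

```python
from urllib.parse import quote, unquote

# Endpoint copied from the example URL in the quoted message.
ENDPOINT = "http://dcc-dev.res.oicr.on.ca:9005/rest/sparql/RDF/pathway_config_1/get"


def build_query_url(sparql: str, endpoint: str = ENDPOINT) -> str:
    """Percent-encode a SPARQL query and append it as the 'query' parameter.

    Hypothetical helper: the parameter name is an assumption read off the
    example URL, not taken from any BioMart documentation.
    """
    # safe="" encodes every reserved character, including '/', ':' and '#'.
    return endpoint + "?query=" + quote(sparql.strip(), safe="")


example = """
PREFIX biomart: <http://www.biomart.org/ontology#>
SELECT DISTINCT ?key ?name WHERE {
  ?key biomart:pathway__name_101 ?name .
} LIMIT 5
"""

url = build_query_url(example)

# Sanity check: decoding the parameter gives back the original query text.
assert unquote(url.split("?query=", 1)[1]) == example.strip()

print(url)
```

The resulting URL can be pasted into a browser or fetched with any HTTP
client.
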
Thanks for giving us a preview of what you have done so far.

Cheers,

Peter
