Hello Sören,

First of all, embedding an SQL query in SPARQL is needed less often than it might seem. Virtuoso's optimizer does not treat a subquery as an instruction to preserve the order. As a result,
    PREFIX : <http://people.example/>
    SELECT ?y ?name ?age WHERE {
      :alice :knows ?y .
      { SQL SELECT y, name, age FROM people }
    }

and

    select q1."y", q2.name, q2.age
    from (sparql PREFIX : <http://people.example/>
          SELECT ?y WHERE { :alice :knows ?y } ) as q1,
         people as q2
    where q2.y = q1."y"

will probably produce equivalent execution plans.

However, native SQL could be very useful (and sometimes unavoidable) in nested scalar subqueries, e.g. in FILTER(bif:exists(...)) or in

    SPARQL select (select ...) as ?calculated-value1 (select ...) as ?calculated-value-2 where ...

The problem is that the SPARQL compiler may copy the content of the SQL subquery into many places of the resulting big SQL query. If the subquery refers to some aliases, then multiple copies of these aliases may cause weird SQL compilation errors. Even worse, the subquery may omit aliases, which makes things even more obscure. Finally, the subquery will probably refer to variables bound in the surrounding SPARQL, so those variables have to be recognized in the text of the query and prefixed with the appropriate aliases.

A dirty hack is to write an RDF View that creates triples using the tables, joins and filter conditions from the SQL query in question. That will make the optimizer happy and provide the best possible SQL code, but it is next to unusable if there are many different subqueries.

A better option is to write an RDF View that creates triples using the tables and maybe some of the joins from the SQL query in question, but to place the filters and the remaining joins into a SPARQL query over that view. That is more flexible, and the quality of the generated code will stay good.

However, an RDF View is a bad choice if an SQL view should be used as a source, and not applicable at all if the SQL view is actually a procedure view.

So I should think about what could be done. Right now the SPARQL compiler is a preprocessor at the front of the SQL compiler.
To handle arbitrary SQL subqueries, the SQL processor would have to be divided into parts: an SQL+SPARQL preprocessor, then the SPARQL processor, then the core of the SQL compiler. That is not a small change. Alternatively, the SQL inside SPARQL could contain special, easily recognizable syntax extensions to refer to variables of the surrounding SPARQL (and to variables of the SQL code around the surrounding SPARQL).

I will discuss the issue with others and return to this topic after the release that will contain the SPARQL 1.1 extensions.

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

On Wed, 2010-07-14 at 13:55 +0200, Sören Auer wrote:
> Hi,
>
> For some experiments we plan to run it would be very useful to embed an
> SQL query as a subquery inside a SPARQL query. We want to combine
> relational with RDF data for example in the following way:
>
> Let's assume we have foaf profiles in the triple store and a relational
> table with information about people (e.g. from a CRM system). A query
> similar to the following would be really useful in that case:
>
> PREFIX : <http://people.example/>
> SELECT ?y ?name ?age WHERE {
>   :alice :knows ?y .
>   {
>     SQL SELECT y, name, age FROM people
>   }
> }
>
> Instead of joining with the triple table a join with the SQL subquery
> would occur and SPARQL variables would be matched against columns of the
> SQL result set with the same name.
>
> Does this make sense? Are there already plans to implement something
> along these lines in Virtuoso? I think this functionality would
> dramatically simplify a number of data integration tasks and be a
> wonderful USP for Virtuoso.
>
> Best,
>
> Sören
