Kelly, The answer is to use an RDF query language (SPARQL is the obvious choice), and a tool that automatically translates the RDF queries to SQL.
There are two kinds of these tools: First, those that present an RDF view or RDF dumps of an existing database schema (like the first table from your example). The D2R Server/D2RQ suite mentioned by Thomas is a popular choice (better links: [1], [2]). SquirrelRDF [3] is another, and OpenLink Virtuoso seems to have some support for this too [4]. Eric Prud'hommeaux did some work on extending MySQL with SPARQL capabilities [5]. Second, there are tools that use a schema optimized for storing RDF, often similar to the resource-predicate-value table in your second example. Many RDF libraries and frameworks such as Jena [6], Sesame [7] and RAP [8] come with such a "triple store". 3store [9] is another example optimized for MySQL. Oracle has built-in support for this (though they don't do SPARQL but invented their own query language) [10]. There's been a lot of work on these issues in the RDF and database communities. A random link dump can be found at [11]. Yours, Richard [1] http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/ [2] http://sites.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/ [3] http://jena.sourceforge.net/SquirrelRDF/ [4] http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF [5] http://xtech06.usefulinc.com/schedule/paper/156 [6] http://jena.sourceforge.net/ [7] http://www.openrdf.org/ [8] http://sites.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/ [9] http://www.aktors.org/technologies/3store/ [10] http://www.oracle.com/technology/tech/semantic_technologies/ [11] http://del.icio.us/tag/sparql2sql On 1 Feb 2007, at 03:13, Kelly Jones wrote: > I have some data in a regular MySQL table called usplaces: > > city |state|country|latitude |longitude |population|comments > -------------+-----+-------+---------+-----------+----------+-------- > New York |NY |USA |40.704234| -73.917927|8008278 |Big Apple > Chicago |IL |USA |41.840675|-87.679365 |2896016 |Windy > City > San Francisco|CA |USA |37.759881|-122.437392|776733 |City > by the Bay > [pretend there's lots more rows here] > > I can now make queries like this: > > "All information about cities between 35 and 40 degrees latitude, -90 > and -70 longitude with population over 10K": > > SELECT * FROM usplaces WHERE (latitude BETWEEN 35 AND 40) AND > (longitude between -90 AND -70) AND (population>10000); > > "All states that have at least one city larger than 1M": > > SELECT DISTINCT state FROM usplaces WHERE population>1000000; > > I now convert the data to RDF format (a different MySQL table called > usplaces_rdf) as follows (I'm cheating slightly: "Chicago" in the > first column means "the URI representing the city of Chicago", and > "IL" in the third column means "the URI representing the state of > Illinois"): > > key |relation |value > -------+----------+----- > Chicago|state |IL > Chicago|country |USA > Chicago|geo:lat |41.840675 > Chicago|geo:lon |-87.679365 > Chicago|population|2896016 > Chicago|comments |Windy City > [lots more rows here, 6 rows for each row in the original table] > > [There are many other much more efficient ways to convert to RDF, this > is just an example] > > I can now run the first query above ("All information about cities > between 35 and 40 degrees latitude, -90 and -70 longitude with > population over 10K") as something like: > > SELECT us1.* FROM usplaces_rdf us1 > LEFT JOIN usplaces_rdf us2 ON (us1.key=us2.key AND > relation='geo:lat') > LEFT JOIN usplaces_rdf us3 ON (us2.key=us3.key AND > relation='geo:lon') > LEFT JOIN usplaces_rdf us4 ON (us3.key=us4.key AND > relation='population') > WHERE > (us2.value BETWEEN 35 AND 40) AND > (us3.value BETWEEN -90 AND -70) AND > (us4.value > 10000); > > [untested, but should be fairly close]. > > I'm sure I could get my 2nd result ("All states that have at least one > city larger than 1M") with an even more complex query. > > RDF seems to have many advantages (eg, multiple values for a given > field and the ability to store "objects" in a column, not just > values), but the queries seem long and tedious. They're probably > efficient, but hard are hard to write. > > Has anyone written a simpler query language for RDF data stored in an > SQL table? Are there better ways of storing/searching RDF data? > > [I sense there's a deep connection between SQL and RDF, but haven't > figured it out yet] > > -- > We're just a Bunch Of Regular Guys, a collective group that's trying > to understand and assimilate technology. We feel that resistance to > new ideas and technology is unwise and ultimately futile. > _______________________________________________ > General mailing list > [email protected] > http://simile.mit.edu/mailman/listinfo/general > _______________________________________________ General mailing list [email protected] http://simile.mit.edu/mailman/listinfo/general
