Getting Freebase onto the Semantic Web

Chris Bizer Mon, 31 Mar 2008 03:34:05 -0700


Hi John,


John Giannandrea wrote:

Chris Bizer wrote:
See: http://blog.freebase.com/2008/03/28/full-data-dumps-are-now-available/

I think that it would really be exciting to turn these dumps into RDF, publish them on the Web as Linked Data and interlink them with datasets from the LOD cloud. For instance, interlinking them with DBpedia should be very easy as both datasets contain Wikipedia article identifiers.

We would be happy to help support this effort to make our data more LOD friendly.


This would be great.

Getting the data out on the Semantic Web as Linked Data also doesn't have to be a big effort, as you already have everything that is needed in place.

One reason we did not yet emit simple RDF ourselves was potential confusion about mapping specific freebase properties to the larger range of possible ontologies. It would be simple to declare a new set of URIs for our schema, much harder to pick and choose from the large array of available ontologies for the range of our data.

I think for the first iteration it is completely OK if you define a new set of URIs for your schema. As a second iteration you could replace terms from your schema with terms from well-known vocabularies like FOAF or SKOS.

From the LOD perspective a lot would already be won if:

1. there were a URI for each topic in Freebase, and dereferencing this URI over the Web returned an RDF description of the concept using a Freebase-specific schema.

2. this URI were interlinked with other data sources in the LOD cloud, so that people could use Semantic Web browsers to navigate from these data sources into the Freebase data and so that Semantic Web crawlers could find and index the data.

So, a minimal-effort approach to getting Freebase onto the Semantic Web could look like this:

1. Define URIs for all your concepts, something like http://www.freebase.com/rdf/resource/9202a8c04000641f800000000016a1a7

2. Deploy a Linked Data wrapper around your API that returns an RDF description of (in the example above) the film when somebody dereferences the URI above. A very easy way to implement such a wrapper would be to just tweak the PHP script that we are using for the RDF Book Mashup. The script is found at http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/index.html

3. Interlink this RDF version of Freebase with other data sources. The simplest option here would be to interlink Freebase with DBpedia, as both datasets contain Wikipedia article IDs. So what you would do is add an RDF link, stating that a specific concept in Freebase is the same as a concept in DBpedia, to the RDF you return when one of your URIs gets dereferenced. For instance:

http://www.freebase.com/rdf/resource/9202a8c04000641f800000000016a1a7 owl:sameAs http://dbpedia.org/resource/2046_(film)

4. You would send us an RDF file containing these RDF links for all Freebase concepts and we would load it into DBpedia and also serve these links.
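
To make the four steps a bit more concrete, here is a rough sketch in Python (standard library only). It is not the Book Mashup script and not Freebase's API. The schema namespace, the property name "name", and the idea that you already have a mapping from topic GUID to Wikipedia article title are all assumptions for illustration, and the URI escaping rule is a guess that happens to reproduce the DBpedia URI in the example above.

```python
from urllib.parse import quote

# The resource prefix follows the example URI in step 1.
FREEBASE_RESOURCE = "http://www.freebase.com/rdf/resource/"
# Hypothetical namespace for a Freebase-specific schema (an assumption).
FREEBASE_SCHEMA = "http://www.freebase.com/rdf/schema/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def freebase_uri(guid):
    """Step 1: mint a URI for a Freebase topic from its GUID."""
    return FREEBASE_RESOURCE + guid


def dbpedia_uri(wikipedia_title):
    """Derive a DBpedia URI from a Wikipedia article title.

    Spaces become underscores; which other characters stay unescaped
    is an assumption made for this sketch.
    """
    return DBPEDIA_RESOURCE + quote(wikipedia_title.replace(" ", "_"), safe="_(),'")


def literal(value):
    """Serialize a plain string as an N-Triples literal."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return '"' + escaped + '"'


def triple(subject, predicate, obj):
    """Format one N-Triples line; obj must already be serialized."""
    return "<%s> <%s> %s ." % (subject, predicate, obj)


def describe_topic(guid, properties, wikipedia_title=None):
    """Steps 2 and 3: the RDF a wrapper could return for one topic URI."""
    subject = freebase_uri(guid)
    lines = [triple(subject, FREEBASE_SCHEMA + name, literal(value))
             for name, value in sorted(properties.items())]
    if wikipedia_title is not None:
        # The RDF link into DBpedia, as in the owl:sameAs example above.
        target = "<%s>" % dbpedia_uri(wikipedia_title)
        lines.append(triple(subject, OWL_SAME_AS, target))
    return "\n".join(lines) + "\n"


def same_as_links(guid_to_title):
    """Step 4: one owl:sameAs triple per topic, as a single N-Triples file."""
    return "".join(
        triple(freebase_uri(guid), OWL_SAME_AS, "<%s>" % dbpedia_uri(title)) + "\n"
        for guid, title in sorted(guid_to_title.items())
    )


if __name__ == "__main__":
    guid = "9202a8c04000641f800000000016a1a7"
    print(describe_topic(guid, {"name": "2046"}, "2046 (film)"))
    print(same_as_links({guid: "2046 (film)"}))
```

A real wrapper would fetch the properties from the Freebase API when a URI is dereferenced instead of taking them as an argument; the functions here only show what the returned RDF and the links file could look like.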

I think all this could be done within three days of work and would allow Linked Data browsers, like the ones listed at http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemWebClients, to access and navigate between both datasets, and would allow crawlers, like the ones listed at http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemanticWebSearchEngines, to index both datasets.


What do you think?

Technical background information about the whole process is found in http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

After this, one could start thinking about also providing RDF dumps, so that people could load Freebase and DBpedia together into an RDF store and do whatever they want with the data. Or think about using well-known terms from other vocabularies and ontologies.

We have been experimenting with using freebase itself to help catalog compatible ontologies for specific freebase properties. For example http://www.freebase.com/view/user/jamie/web_ontology/property_mapping
If folks want to help with this, then it should be possible to use our open API to generate RDF of whatever 'flavor' you happen to be working with, by specifying a preferred set of ontologies at query time.

Using terms from well-known vocabularies and serving the data using different vocabularies are both important, but in my opinion something for the second step. First step: publish Linked Data. See what people do with it.
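
For that second step, choosing a vocabulary at query time could be as simple as a lookup table. The following Python sketch is only an illustration: the Freebase schema namespace and the two mapped properties are made up, and a real table would come from a catalog like the property_mapping page mentioned above. The FOAF and SKOS terms themselves are real.

```python
# Hypothetical namespace for a Freebase-specific schema (an assumption).
FREEBASE_SCHEMA = "http://www.freebase.com/rdf/schema/"
FOAF = "http://xmlns.com/foaf/0.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Illustrative mappings only; the Freebase property names are invented.
PROPERTY_MAPPINGS = {
    FREEBASE_SCHEMA + "name": {"foaf": FOAF + "name", "skos": SKOS + "prefLabel"},
    FREEBASE_SCHEMA + "homepage": {"foaf": FOAF + "homepage"},
}


def translate_property(prop, preferred):
    """Return the term from the first preferred vocabulary that maps prop.

    Falls back to the Freebase-specific property when nothing matches.
    """
    options = PROPERTY_MAPPINGS.get(prop, {})
    for vocabulary in preferred:
        if vocabulary in options:
            return options[vocabulary]
    return prop


def translate_triples(triples, preferred):
    """Rewrite the predicate of each (subject, predicate, object) tuple."""
    return [(s, translate_property(p, preferred), o) for s, p, o in triples]
```

Calling translate_triples(triples, ["skos", "foaf"]) would prefer SKOS terms, use FOAF where SKOS has no mapping, and leave unmapped properties in the Freebase schema, so the first-step data stays valid while the vocabularies are added on top.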


Cheers

Chris

-jg
