All,
We are now nearing complete stability with respect to uploads, deletes,
and data cleansing activity on the Virtuoso instance hosting the LOD
Cloud [1].
We are still awaiting fresh data sets from Freebase and Bio2RDF (both
communities are prepping new RDF data sets). Once received, we will
replace the current datasets accordingly.
At the current time we have loaded all of the very large data sets
from the LOD Cloud [2]. Thus, I would really like owners of RDF data
sets depicted in the cloud who cannot locate their data to notify me
(via this mailing list) ASAP. You can use the LOD instance "Search
& Find" or "URI Lookup" or SPARQL endpoint [3] to verify existence of
your data (note: we are preserving original data provider URIs).
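For anyone who would rather script the check against the SPARQL endpoint
[3] than use the web forms, here is a minimal Python sketch. It assumes the
endpoint accepts a standard SPARQL protocol GET request with a "query"
parameter and returns JSON results when asked via the Accept header. The
function names and the example.org URI are hypothetical, for illustration
only; substitute one of your own original URIs.

```python
import json
import urllib.parse
import urllib.request

# SPARQL endpoint from link [3] of this message.
ENDPOINT = "http://lod.openlinksw.com/sparql"


def build_ask_query(uri: str) -> str:
    """Return an ASK query that is true if `uri` is the subject or object of any triple."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in uri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a usable IRI: {uri!r}")
    return f"ASK {{ {{ <{uri}> ?p ?o }} UNION {{ ?s ?p <{uri}> }} }}"


def build_request_url(uri: str, endpoint: str = ENDPOINT) -> str:
    """Encode the ASK query as a SPARQL protocol GET URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": build_ask_query(uri)})


def uri_exists(uri: str, endpoint: str = ENDPOINT, timeout: float = 30.0) -> bool:
    """Send the query and read the boolean from the JSON result (assumed response format)."""
    request = urllib.request.Request(
        build_request_url(uri, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return bool(json.load(response)["boolean"])


# Example (hypothetical URI; performs a network request when called):
#   uri_exists("http://example.org/resource/my-dataset-item")
```

A False result for a URI you know is in your dump would be a good reason
to post to this list.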
Off the top of my head, here are the data sets added since my last update
notice:
1. U.S. Census
2. DBP RKB Explorer* and related datasets from Hugh Glaser
3. Gov-Track
4. BBC Programmes, DBtune
5. SemanticBible (this is a small dataset, not in the LOD cloud, but
added since linkage will be easy to generate)
6. PingTheSemanticWeb (FOAF Cloud and others)
7. All the Linking Open Drug Data from the LODD project.
One more time, if you have a new RDF based Linked Data archive, or an
updated dataset, please add pertinent information to the Linked Open
Data Sets page [4].
Additional developments re. Amazon Hosting:
Amazon have agreed to add all the Linked Open Data Sets to their public
data sets collective. Thus, the data sets we are loading will be
available as "raw data" on the public data sets page [5] in Elastic
Block Storage (EBS) form; meaning, you can make an EC2 AMI (e.g.
Linux, Windows, or Solaris) and install an RDF quad or triple store of
choice, then load the data. Of course, we are also going to offer a
Virtuoso 6.0 Cluster Edition AMI that will enable you to simply
instantiate a personal and service specific edition of Virtuoso with all
the LOD data in place, so that you can "press go" and have the LOD space
in true Linked Data form at your disposal in minutes (i.e. the time it
takes the DB to start).
Work on the migration of the LOD data to EC2 starts next week, so please
get your data sets in place if you want to take advantage of this most
generous offering from Amazon.
We are also going to make a few USB devices with chunks of LOD data sets as
another distribution mechanism.
Links:
1. http://lod.openlinksw.com
2. http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-05.html
3. http://lod.openlinksw.com/sparql
4. http://esw.w3.org/topic/DataSetRDFDumps
5. http://aws.amazon.com/publicdatasets
--
Regards,
Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software     Web: http://www.openlinksw.com