Hi Denny,
I am sorry, but I have to voice some criticism of this project. Over the
past two years, I have become increasingly wary of the excitement over large
numbers of triples in the LOD community. Large numbers of triples don't
necessarily mean that a dataset enables us to do anything novel or
significantly useful. I think there should be a shift from focusing on
quantity to focusing on quality and usefulness.
Now the project you describe seems to be well-made, but it also exemplifies
this problem to a degree that I have not seen before. You basically
published a huge dataset of numbers, for the sake of producing a large
number of triples. Your announcement mainly emphasizes how huge the dataset
is, and the corresponding paper does the same. The paper gives a few
application scenarios; I quote:
"The added value of the paradigm shift initiated by our work cannot be
underestimated. By endowing numbers with an own identity, the linked open
data cloud will become treasure trove for a variety of disciplines. By using
elaborate data mining techniques, groundbreaking insights about deep
mathematical correspondences can be obtained. As an example, using our sample
dataset, we were able to discover that there are significantly more odd
primes than even ones, and even more excitingly a number contains 2 as a
prime factor exactly if its successor does not."
I am sorry, but this sounds a bit overenthusiastic. I see no paradigm
shift, and I also don't see why your findings about prime numbers required
you to publish the dataset as linked data. I also have trouble seeing the
practical value of looking at the resource pages for each number with a
linked data browser, but I am also not a mathematician.
I am sorry for being a bit antagonistic, but we as a community should really
try not to be seduced too easily by publishing ever-larger numbers of
triples.
Cheers,
Matthias Samwald
--------------------------------------------------
From: "Denny Vrandecic" <[email protected]>
Sent: Thursday, April 01, 2010 12:01 PM
To: <[email protected]>
Subject: KIT releases 14 billion triples to the Linked Open Data cloud
We are happy to announce that the Institute AIFB at the KIT is releasing
the biggest dataset until now to the Linked Open Data cloud. The Linked
Open Numbers project offers billions of facts about natural numbers, all
readily available as Linked Data.
Our accompanying peer-reviewed paper [1] gives further details on the
background and implementation. We have integrated with external data
sources (linking DBpedia to all their 335 number entities) and also
directly link to the best-known linked open data browsers from the page.
You can visit the Linked Open Numbers project at:
<http://km.aifb.kit.edu/projects/numbers/>
Or point your linked open data browser directly at:
<http://km.aifb.kit.edu/projects/numbers/n1>
We are happy to have increased the number of triples on the Web by more
than 14 billion, roughly 87.5% of the size of the linked data web
before this release (see paper for details). We hope that the data set
will find its serendipitous use.
The data set and the publication mechanism were checked pedantically, and
we expect no errors in the triples. If you do find some, please let us
know. We intend to be compatible with all major linked open data
publication standards.
About the AIFB
The Institute AIFB (Applied Informatics and Formal Description Methods) at
KIT is one of the world-leading institutions in Semantic Web technology.
Approximately 20 researchers of the knowledge management research group
are establishing theoretical results and scalable implementations for the
field, closely collaborating with the sister institute KSRI (Karlsruhe
Service Research Institute), the start-up company ontoprise GmbH, and the
Knowledge Management group at the FZI Research Center for Information
Technologies. Particular emphasis is given to areas such as logical
foundations, Semantic Web mining, ontology creation engineering and
management, RDF data management, semantic web search, and the
implementation of interfaces and tools. The institute is involved in many
industry-university co-operations, both on a European and a national
level, including a number of intelligent Web systems case studies.
Website: <http://www.aifb.kit.edu>
About KIT
The Karlsruhe Institute of Technology (KIT) is the merger of the former
Universität Karlsruhe (TH) and the former Forschungszentrum Karlsruhe.
With about 8000 employees and an annual budget of 700 million Euros, KIT
is the largest technical research institution within Germany. KIT is both
a state university with research and teaching and, at the same time, a
large-scale research institution of the Helmholtz Association. KIT has a
strong reputation as one of Germany’s universities of excellence, aiming to
set the highest standards for education, research and innovation.
Website: <http://www.kit.edu>
[1] Denny Vrandecic, Markus Krötzsch, Sebastian Rudolph, Uta Lösch:
Leveraging Non-Lexical Knowledge for the Linked Open Data Web, published
in Rodolphe Héliot and Antoine Zimmermann (eds.), The Fifth RAFT'2010,
the yearly bilingual publication on nonchalant research, available at
<http://km.aifb.kit.edu/projects/numbers/linked_open_numbers.pdf>