Dear all,
Although the 2016-04 release is not 100% ready, we would like to share a
beta release and get early feedback.
You can find all datasets here:
http://downloads.dbpedia.org/2016-04/
What is missing:
- Links to other datasets
- Additional types (SDTypes, Hypernyms, DBTax)
- We will add small but very useful datasets to the DBpedia+ stack in the
  final release.
- Release statistics
- Download page
- No public endpoint with the data yet
What we changed in this release:
- In addition to the datasets normalized to English DBpedia (en-uris), we
  now provide normalized datasets based on the DBpedia Wikidata (DBw)
  datasets (wkd-uris). These sorted datasets will be the foundation for
  the upcoming fusion process with Wikidata. From the following releases
  on, the DBw-based URIs will be the only ones provided.
- We now filter out triples from the Raw Infobox Extractor that are
  already mapped, e.g. no more “<x> dbo:birthPlace <z>” and “<x>
  dbp:birthPlace|dbp:placeOfBirth|... <z>” in the same resource. These
  triples are now moved to the “infobox-properties-mapped” datasets and
  are not loaded on the main endpoint. See issue 22
  <https://github.com/dbpedia/extraction-framework/issues/22> for more
  details.
- Major improvements in our citation extraction. See here
  <http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg07762.html>
  for more details.
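
For readers who want a feel for the infobox filtering described above, here
is a minimal Python sketch. It is not the extraction framework's code: the
matching rule (same subject and same object as a mapped triple, whatever
the predicate), the function name split_raw_triples and the sample triples
are all assumptions made for illustration. Issue 22 describes the actual
behaviour.

```python
# Toy triples as (subject, predicate, object). All values are illustrative.
mapped = [
    ("dbr:Ada_Lovelace", "dbo:birthPlace", "dbr:London"),
]
raw = [
    ("dbr:Ada_Lovelace", "dbp:birthPlace", "dbr:London"),
    ("dbr:Ada_Lovelace", "dbp:placeOfBirth", "dbr:London"),
    ("dbr:Ada_Lovelace", "dbp:caption", "Portrait"),
]


def split_raw_triples(raw_triples, mapped_triples):
    """Split raw infobox triples into (kept, already_mapped).

    Assumption: a raw triple counts as "already mapped" when some mapped
    triple has the same subject and the same object.
    """
    mapped_pairs = {(s, o) for s, _, o in mapped_triples}
    kept, already_mapped = [], []
    for triple in raw_triples:
        s, _, o = triple
        if (s, o) in mapped_pairs:
            already_mapped.append(triple)  # would go to infobox-properties-mapped
        else:
            kept.append(triple)  # stays in the raw infobox properties
    return kept, already_mapped


kept, already_mapped = split_raw_triples(raw, mapped)
```

In this toy run, the two birthplace triples land in already_mapped and only
the caption triple is kept.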
In case you missed it, what we changed in the previous release (2015-10):
- English DBpedia switched to IRIs. This can be a breaking change for
  applications that need to update their stored DBpedia resource URIs /
  links. We provide the “uri-same-as-iri” dataset for English to ease the
  transition.
- The instance-types dataset is now split into two files: instance-types
  (containing only direct types) and instance-types-transitive (containing
  the transitive types of a resource based on the DBpedia ontology).
- The mappingbased-properties file is now split into three (3) files:
  - “geo-coordinates-mappingbased”, which contains the coordinates
    originating from the mappings wiki. The “geo-coordinates” dataset
    continues to provide the coordinates originating from the GeoExtractor.
  - “mappingbased-literals”, which contains mapping-based facts with
    literal values
  - “mappingbased-objects”, which contains mapping-based facts with object
    values
  - The “mappingbased-objects-disjoint-[domain|range]” datasets contain
    facts that are filtered out from the “mappingbased-objects” datasets
    as errors but are still provided.
- We added a new extractor for citation data that provides two files:
  - citation links: linking resources to citations
  - citation data: trying to get additional data from citations. This is
    quite an interesting dataset, but we need help to clean it up.
- All datasets are available in .ttl and .tql serialization (the nt and nq
  datasets were omitted for reasons of redundancy and server capacity).
- A complete changelog can always be found in the git log
  <https://github.com/dbpedia/extraction-framework/compare/DBpedia_2015-04...master>
  ;)
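
To illustrate the split between instance-types and
instance-types-transitive mentioned above, here is a small Python sketch
that derives transitive types from a direct type by walking up a subclass
hierarchy. The hierarchy below is a hand-written toy excerpt, and the
function name transitive_types is hypothetical. Whether the published
transitive file repeats the direct type is not stated here, so the sketch
assumes it does not.

```python
# Toy excerpt of a class hierarchy: child class -> parent class.
# Illustrative only; the real hierarchy comes from the DBpedia ontology.
SUBCLASS_OF = {
    "dbo:Scientist": "dbo:Person",
    "dbo:Person": "dbo:Agent",
    "dbo:Agent": "owl:Thing",
}


def transitive_types(direct_type, subclass_of=SUBCLASS_OF):
    """Return the ancestors of direct_type, nearest first.

    Assumption: the direct type itself is excluded from the result.
    """
    ancestors = []
    seen = {direct_type}  # guards against accidental cycles
    current = subclass_of.get(direct_type)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = subclass_of.get(current)
    return ancestors


# A resource typed directly as dbo:Scientist:
types_for_scientist = transitive_types("dbo:Scientist")
```

With this toy hierarchy, a resource whose direct type is dbo:Scientist gets
dbo:Person, dbo:Agent and owl:Thing as transitive types.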
Markus, on behalf of the DBpedia extraction team