Dear all,

Although the 2016-04 release is not 100% ready, we would like to share a
beta release and get early feedback.

You can find all datasets here:

http://downloads.dbpedia.org/2016-04/

What is missing:

   - Links to other datasets
   - Additional types (SDTypes, Hypernyms, DBTax)
   - Small but very useful datasets for the DBpedia+ stack, which we will
     add in the final release
   - Release statistics
   - Download page
   - A public endpoint with the data


What we changed in this release

   - In addition to the datasets normalized to English DBpedia (en-uris), we
     now provide normalized datasets based on the DBpedia Wikidata (DBw)
     datasets (wkd-uris). These sorted datasets will be the foundation for
     the upcoming fusion process with Wikidata. From the following releases
     on, only the DBw-based URIs will be provided.
   - We now filter out triples from the Raw Infobox Extractor that are
     already mapped. For example, “<x> dbo:birthPlace <z>” and “<x>
     dbp:birthPlace|dbp:placeOfBirth|... <z>” no longer appear together for
     the same resource. The filtered triples are moved to the
     “infobox-properties-mapped” datasets and are not loaded on the main
     endpoint. See issue 22
     <https://github.com/dbpedia/extraction-framework/issues/22> for more
     details.
   - Major improvements in our citation extraction. See here
     <http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg07762.html>
     for more details.
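
To make the filtering of already-mapped raw infobox triples more concrete,
here is a minimal Python sketch. It is not the extraction framework's actual
code. It assumes that a raw triple counts as "already mapped" when its
property is known to map to an ontology property and a mapped triple with the
same subject and object exists. The RAW_TO_MAPPED table and the
split_raw_triples function are hypothetical names used only for illustration;
the framework may apply a different criterion.

```python
# Hypothetical lookup from raw infobox properties (dbp:) to the ontology
# property (dbo:) they are mapped to. Illustration only; the real mappings
# are maintained in the mappings wiki.
RAW_TO_MAPPED = {
    "dbp:birthPlace": "dbo:birthPlace",
    "dbp:placeOfBirth": "dbo:birthPlace",
}


def split_raw_triples(raw_triples, mapped_triples):
    """Split raw triples into (kept, moved).

    A raw triple is 'moved' (it would go to an infobox-properties-mapped
    style dataset) when a mapped triple with the same subject and object
    exists for the ontology property its raw property maps to.
    """
    mapped = set(mapped_triples)
    kept, moved = [], []
    for subject, prop, obj in raw_triples:
        dbo_prop = RAW_TO_MAPPED.get(prop)
        if dbo_prop is not None and (subject, dbo_prop, obj) in mapped:
            moved.append((subject, prop, obj))
        else:
            kept.append((subject, prop, obj))
    return kept, moved


mapped = [("<x>", "dbo:birthPlace", "<z>")]
raw = [
    ("<x>", "dbp:birthPlace", "<z>"),
    ("<x>", "dbp:placeOfBirth", "<z>"),
    ("<x>", "dbp:nickname", "<n>"),
]
kept, moved = split_raw_triples(raw, mapped)
```

In this toy input the two birth-place triples end up in moved, while the
unmapped dbp:nickname triple stays in kept.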


In case you missed it, what we changed in the previous release (2015-10)

   - English DBpedia switched to IRIs. This can be a breaking change for
     applications that need to update their stored DBpedia resource URIs /
     links. We provide the “uri-same-as-iri” dataset for English to ease the
     transition.
   - The instance-types dataset is now split into two files: instance-types
     (containing only direct types) and instance-types-transitive
     (containing the transitive types of a resource based on the DBpedia
     ontology).

   The mappingbased-properties file is now split in three (3) files:
   -

      “geo-coordinates-mappingbased” that contains the coordinated
      originating from the mappings wiki. the “geo-coordinates” continues to
      provide the coordinates originating from the GeoExtractor
      -

      “mappingbased-literals” that contains mapping based fact with literal
      values
      -

      “mappingbased-objects” that contains mapping based fact with object
      values
      -

      the “mappingbased-objects-disjoint-[domain|range]” are facts that are
      filtered out from the “mappingbased-objects” datasets as errors but are
      still provided
   - We added a new extractor for citation data that provides two files:
      - citation links: links resources to citations
      - citation data: tries to extract additional data from citations.
        This is quite an interesting dataset, but we need help to clean it
        up.
   - All datasets are available in .ttl and .tql serialization (the .nt and
     .nq datasets were dropped for reasons of redundancy and server
     capacity).
   - A complete changelog can always be found in the git log
     <https://github.com/dbpedia/extraction-framework/compare/DBpedia_2015-04...master>
     ;)
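
For applications affected by the switch to IRIs mentioned above, the
following Python sketch shows one way stored URIs could be rewritten with a
lookup table built from mapping lines such as those in the “uri-same-as-iri”
dataset. It is only a sketch under assumptions: that each line is a simple
triple of three <...> terms, that the subject is the old URI and the object
is the new IRI, and that owl:sameAs is the linking predicate. Please check
the actual file before relying on any of this. The function names
load_uri_to_iri and migrate are hypothetical.

```python
import re

# Matches a triple line whose three terms are all <...> references.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def load_uri_to_iri(lines):
    """Build an old-URI -> IRI lookup table from mapping lines.

    Assumption: the subject is the old URI and the object is the new IRI.
    Lines that do not match (comments, blank lines) are skipped.
    """
    table = {}
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match:
            subject, _predicate, obj = match.groups()
            table[subject] = obj
    return table


def migrate(stored_uris, table):
    """Replace each stored URI by its IRI; unknown URIs are left unchanged."""
    return [table.get(uri, uri) for uri in stored_uris]


# Made-up sample line; \u00f6 is the letter o with umlaut.
sample = [
    "# comment lines are ignored",
    "<http://dbpedia.org/resource/Bj%C3%B6rk> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://dbpedia.org/resource/Bj\u00f6rk> .",
]
table = load_uri_to_iri(sample)
migrated = migrate(
    ["http://dbpedia.org/resource/Bj%C3%B6rk",
     "http://dbpedia.org/resource/Berlin"],
    table,
)
```

Using the published mapping rather than percent-decoding the URIs yourself
avoids guessing which characters remain encoded in the IRIs.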



Markus, on behalf of the DBpedia extraction team