[DBpedia-discussion] [ANN] DBpedia’s Databus and strategic initiative to facilitate 1 Billion derived Knowledge Graphs by and for Consumers until 2025

Sebastian Hellmann Wed, 11 Sep 2019 02:27:24 -0700


[Please forward to interested colleagues]

We are proud to announce that the DBpedia Databus website at https://databus.dbpedia.org and the SPARQL API at https://databus.dbpedia.org/(repo/sparql|yasgui) (documentation: http://dev.dbpedia.org/Download_Data) are now in public beta. The system is usable (eat-your-own-dog-food tested), following a “working software over comprehensive documentation” approach. Due to its many components (website, SPARQL endpoints, Keycloak, mods, upload client, download client, and data debugging), we estimate approximately six months in beta to fix bugs, implement all features and improve the details. If you have any feedback or questions, please use the DBpedia Forum <https://forum.dbpedia.org/>, the “report issues” button, or dbpedia@infai.org.
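For readers who want to try the SPARQL API from a script, the following is a minimal sketch, not an official client. It assumes that the endpoint at https://databus.dbpedia.org/repo/sparql (read from the "(repo/sparql|yasgui)" notation above) accepts standard SPARQL 1.1 Protocol GET requests and can return JSON results. The function names and the deliberately generic query are hypothetical and chosen only for illustration; the query makes no assumption about the Databus vocabulary.

```python
import json
import urllib.parse
import urllib.request

# Endpoint path read from the announcement's "(repo/sparql|yasgui)" notation.
# Assumption: it speaks the standard SPARQL 1.1 Protocol over HTTP GET.
SPARQL_ENDPOINT = "https://databus.dbpedia.org/repo/sparql"

# Deliberately generic query: it assumes nothing about the Databus vocabulary.
GENERIC_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(query: str, endpoint: str = SPARQL_ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL GET request (hypothetical helper); nothing is sent yet."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query: str, endpoint: str = SPARQL_ENDPOINT) -> list:
    """Send the query and return the result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as response:
        return json.load(response)["results"]["bindings"]
```

Calling run_query(GENERIC_QUERY) would, under these assumptions, return a list of binding dictionaries. Since the system is in beta, the endpoint location and behaviour may change.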

The full document is available at: https://databus.dbpedia.org/dbpedia/publication/strategy/2019.09.09/strategy_databus_initiative.pdf

We are looking forward to the feedback and discussion at the 14th DBpedia Community Meeting at SEMANTiCS 2019 in Karlsruhe <https://wiki.dbpedia.org/events/14th-dbpedia-community-meeting-karlsruhe> on September 12th, or online.



########
# Excerpt
########


     DBpedia Databus

The DBpedia Databus is a platform to capture the effort invested by data consumers who needed better data quality (fitness for use) in order to use the data, and to give their improvements back to the data source and other consumers. DBpedia Databus enables anybody to build an automated DBpedia-style extraction, mapping and testing for any data they need. Databus incorporates features from DNS, Git, RSS, online forums and Maven to harness the full workpower of data consumers.



     Vision

Professional consumers of data worldwide have already built stable cleaning and refinement chains for all available datasets, but their efforts are invisible and not reusable. Deep, cleaned data silos exist beyond the reach of publishers and other consumers, trapped locally in pipelines.

*Data is not oil that flows out of inflexible pipelines*. Databus breaks existing pipelines into individual components that together form a decentralized, but centrally coordinated data network in which data can flow back to previous components, the original sources, or end up being consumed by external components.

The Databus provides a platform for re-publishing these files with very little effort (leaving file traffic as the only cost factor) while offering the full benefits of built-in system features such as automated publication, structured querying, automatic ingestion, as well as pluggable automated analysis, data testing via continuous integration, and automated application deployment *(software with data)*. The impact is highly synergistic: just a few thousand professional consumers and research projects can expose millions of cleaned datasets, which are on par with what has long existed in deep silos and pipelines.



   1 Billion interconnected, quality-controlled Knowledge Graphs until 2025

As we are inverting the paradigm from a publisher-centric view to a data consumer network, we will open the download valve to enable discovery of and access to massive amounts of data that is cleaner than what the original source published. The main DBpedia Knowledge Graph - cleaned data from Wikipedia in all languages and Wikidata - alone has 600k file downloads per year, complemented by downloads at over 20 chapters, e.g. http://es.dbpedia.org, as well as over 8 million daily hits on the main Virtuoso endpoint. Community extensions from the alpha phase, such as DBkWik <https://databus.dbpedia.org/sven-h/dbkwik/dbkwik/2019.09.02> and LinkedHypernyms <https://databus.dbpedia.org/propan/lhd/linked-hypernyms>, are being loaded onto the bus and consolidated, and we expect this number to reach over 100 by the end of the year. Companies and organisations who have previously uploaded their backlinks here <https://github.com/dbpedia/links> will be able to migrate to the databus. Other datasets are cleaned and posted. In two of our research projects, LOD-GEOSS <https://www.enargus.de/pub/bscw.cgi/?op=enargus.eps2&s=14&q=BASF%20SE&v=10&m=2&id=1216225&p=1> and PLASS <http://plass.io/>, we will re-publish open datasets, clean them and create collections, which will result in DBpedia-style knowledge graphs for energy systems and supply-chain management.

The *full document* is available at: https://databus.dbpedia.org/dbpedia/publication/strategy/2019.09.09/strategy_databus_initiative.pdf



_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
