Dear John,
Yes, your idea is to some extent already possible with Wikidata: you can
replace parts of an article's text with information drawn from Wikidata,
which goes in the direction of your proposal.
However, although the feature exists, I was unable to find any example
of it in actual article text; I only looked briefly. Maybe somebody else has?
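As a toy illustration of rendering text from structured data rather than
storing it as text, here is a minimal Python sketch that pulls a fact from
the public Wikidata Query Service (the item Q64 = Berlin and the property
P1082 = population are just example identifiers):

import requests

# Q64 = Berlin, P1082 = population (illustrative identifiers only).
QUERY = """
SELECT ?population WHERE { wd:Q64 wdt:P1082 ?population . }
LIMIT 1
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wikidata-text-example/0.1"},  # WDQS expects a User-Agent
)
resp.raise_for_status()
value = resp.json()["results"]["bindings"][0]["population"]["value"]
# The sentence is rendered from data instead of being stored as text.
print(f"Berlin has a population of {value}.")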
I am not sure whether whole articles, stories or essays can be told in a
semantic language. A lot of information would be lost if you did it in
OWL, and it seems very hard to encode even with something like
Attempto Controlled English:
https://en.wikipedia.org/wiki/Attempto_Controlled_English
Our approach, for now, is to obtain more information via relation
extraction, until something better presents itself.
All the best,
Sebastian
On 07.03.2017 15:31, Paul Houle wrote:
Isn't that Wikidata?
--
Paul Houle
paul.ho...@ontology2.com
On Mon, Mar 6, 2017, at 04:28 PM, John Flynn wrote:
I applaud this initiative to extract triples from Wikipedia open
text. However, it would be useful to initiate a parallel
challenge/effort to represent a limited portion of the current
Wikipedia article text as a semantic representation, eliminating the
text altogether. In this approach, the Wikipedia information would be
semantically encoded as its original representation, rather than
using text to represent the information. A small subset of Wikipedia
subject matter could be used for this experiment. After the limited
Wikipedia domain of interest was fully semantically represented,
tools could be developed to translate the semantic representation
into human-readable text. It seems that, over the long run, creating the
original knowledge as a semantic representation, instead of text,
would result in a Wikipedia knowledge base that, upon query by humans,
could automatically perform the necessary translation into text in
whichever human language the user desired. This concept would also
facilitate machine-to-machine use of the Wikipedia knowledge base,
which is currently difficult, if not impossible, due to the textual
nature of the information. You could also envision tools that would
eventually make it easy for authors to create the article information
directly in semantic representation. The end result would be a
DBpedia on steroids and, eventually, the elimination of Wikipedia,
since the original article text sources would no longer be needed.
John Flynn
http://semanticsimulations.com
*From:*Sebastian Hellmann [mailto:hellm...@informatik.uni-leipzig.de]
*Sent:* Monday, March 06, 2017 5:56 AM
*To:* DBpedia
*Subject:* [DBpedia-discussion] DBpedia Open Text Extraction
Challenge - TextExt
*DBpedia Open Text Extraction Challenge - TextExt*
Website: http://wiki.dbpedia.org/textext
*_Disclaimer: The call is under constant development; please refer to
the news section. We also acknowledge the initial engineering effort
involved: we will be lenient on technical requirements for the first
submissions, focus the evaluation on the extracted triples, and allow
late submissions if they are coordinated with us._*
Background
DBpedia and Wikidata currently focus primarily on representing
factual knowledge as contained in Wikipedia infoboxes. A vast amount
of information, however, is contained in the unstructured Wikipedia
article texts. With the DBpedia Open Text Extraction Challenge, we
aim to spur knowledge extraction from Wikipedia article texts in
order to dramatically expand the breadth and depth of structured
DBpedia/Wikipedia data and to provide a platform for benchmarking
various extraction tools.
Mission
Wikipedia has become the world's ubiquitous source of knowledge,
enabling humans to look up definitions, quickly become familiar with
new topics, read up on the background of news events, and much more -
even settling coffee-house arguments via quick mobile research. The
general mission of DBpedia is to harvest Wikipedia's knowledge,
refine and structure it, and then disseminate it on the web - in a
free and open manner - for IT users and businesses.
News and next events
Twitter: Follow @dbpedia <https://twitter.com/dbpedia>, Hashtag:
#dbpedianlp
<https://twitter.com/search?f=tweets&q=%23dbpedianlp&src=typd>
·LDK <http://ldk2017.org/> conference joined the challenge (deadlines
March 19th and April 24th)
·SEMANTiCS <http://2017.semantics.cc/> joined the challenge (deadlines
June 11th and July 17th)
·Feb 20th, 2017: Full example added to this website
·March 1st, 2017: Docker image (beta)
https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge
Coming soon:
·beginning of March: full example within the docker image
·beginning of March: DBpedia full article text and tables (currently
only abstracts) http://downloads.dbpedia.org/2016-10/core-i18n/
Methodology
The DBpedia Open Text Extraction Challenge differs significantly from
other challenges in language technology and other areas in that it is
not a one-time call, but a continuously growing and expanding
challenge, with a focus on *sustainably* advancing the state of the
art and transcending boundaries in a *systematic* way. The DBpedia
Association and the people behind this challenge are committed to
providing the necessary infrastructure, to driving the challenge for
an indefinite time, and to potentially extending it beyond
Wikipedia.
We provide the extracted and cleaned full text of all Wikipedia
articles in 9 different languages, at regular intervals, for download
and as a Docker image, in the machine-readable NIF-RDF
<http://persistence.uni-leipzig.org/nlp2rdf/> format (example for
Barack Obama in English
<https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge/blob/master/BO.ttl>).
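To give a feel for the format, here is a minimal Python sketch (using
rdflib) that builds the core triples of a NIF context. The property names
follow the NIF 2.0 Core ontology, but the actual challenge data carries
additional annotations, so please refer to the linked example for the
authoritative structure:

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

NIF = Namespace("http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#")

text = "Barack Obama is an American politician."
# Illustrative context URI; the real dumps define their own URI scheme.
ctx = URIRef("http://dbpedia.org/resource/Barack_Obama?nif=context")

g = Graph()
g.bind("nif", NIF)
g.add((ctx, RDF.type, NIF.Context))
g.add((ctx, NIF.isString, Literal(text)))
g.add((ctx, NIF.beginIndex, Literal(0, datatype=XSD.nonNegativeInteger)))
g.add((ctx, NIF.endIndex, Literal(len(text), datatype=XSD.nonNegativeInteger)))

print(g.serialize(format="turtle"))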
Challenge participants are asked to wrap their NLP and extraction
engines in Docker images and submit them to us. We will run
participants' tools at regular intervals in order to extract:
1. Facts, relations, events, terminology and ontologies as RDF triples
(Triple track; a toy sketch follows below)
2. Useful NLP annotations such as POS tags, dependencies and coreference
(Annotation track)
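To make the expected shape of a Triple-track tool concrete, below is a toy
Python sketch that reads a NIF Turtle file and emits N-Triples on stdout.
The command-line interface and the choice of output predicate are
assumptions for illustration only; the actual I/O contract is defined by
the challenge's Docker setup (see the GitHub repository above):

import re
import sys
from rdflib import Graph, Namespace

NIF = Namespace("http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#")

g = Graph()
g.parse(sys.argv[1], format="turtle")  # e.g. the BO.ttl example from the repository

# A single hand-written "X is a Y" pattern stands in for a real extraction engine.
for ctx, text in g.subject_objects(NIF.isString):
    m = re.search(r"(\w[\w ]*?) is an? ([\w ]+?)[.,]", str(text))
    if m:
        subj = m.group(1).strip().replace(" ", "_")
        obj = m.group(2).strip().replace(" ", "_")
        print(f"<http://dbpedia.org/resource/{subj}> "
              f"<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
              f"<http://dbpedia.org/resource/{obj}> .")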
We allow submissions 2 months prior to selected conferences
(currently http://ldk2017.org/ and http://2017.semantics.cc/ ).
Participants who fulfil the technical requirements and provide a
sufficient description will be able to present at the conference and
will be included in the yearly proceedings. *At each conference, the
challenge committee will select a winner from among the challenge
participants, who will receive 1000 €.*
Results
Every December, we will publish a summary article and the proceedings
of participants' submissions at http://ceur-ws.org/ . The first
proceedings are planned for December 2017. We will try to briefly
summarize any intermediate progress online in this section.
Acknowledgements
We would like to thank the Computer Center of Leipzig University for
giving us access to their 6 TB RAM server Sirius to run all extraction
tools.
The project was created with the support of the H2020 EU projects
HOBBIT <https://project-hobbit.eu/> (GA-688227) and ALIGNED
<http://aligned-project.eu/> (GA-644055), as well as the BMWi project
Smart Data Web <http://smartdataweb.de/> (GA-01MD15010B).
Challenge Committee
·Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence
Center, InfAI, Leipzig
·Sören Auer, Fraunhofer IAIS, University of Bonn
·Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University
·Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence
Center, InfAI, Leipzig
·Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center,
InfAI, Leipzig
Contact Email: dbpedia-textext-challe...@infai.org
<mailto:dbpedia-textext-challe...@infai.org>
--
All the best,
Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT)
Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org,
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion