Re: [DBpedia-discussion] DBpedia Open Text Extraction Challenge - TextExt

Sebastian Hellmann Tue, 07 Mar 2017 22:47:37 -0800

Dear John,

Yes, your idea is actually something you can already do with Wikidata. For example, you could replace part of the text with Wikidata information, which goes in the direction of your proposal.

However, although the feature exists, I was unable to find any example of this in the article text, but I only looked briefly. Maybe somebody else knows of one?

I am not sure whether you can tell whole articles, stories or essays in a semantic language. A lot of information would be lost if you did it in OWL. It also seems very hard to encode, even using something like Attempto Controlled English: https://en.wikipedia.org/wiki/Attempto_Controlled_English

Our motivation is to get more information via relation extraction for now, until something better presents itself.


all the best,

Sebastian


On 07.03.2017 15:31, Paul Houle wrote:

Isn't that Wikidata?

--
  Paul Houle
  paul.ho...@ontology2.com



On Mon, Mar 6, 2017, at 04:28 PM, John Flynn wrote:
I applaud this initiative to extract triples from Wikipedia open text. However, it would be useful to initiate a parallel challenge/effort to represent a limited portion of the current Wikipedia article text as semantic representation, eliminating the text altogether. In this approach, the Wikipedia information would be semantically encoded as its original representation, as opposed to using text to represent the information. A small subset of Wikipedia subject matter could be used for this experiment. After the limited Wikipedia domain of interest was fully semantically represented, tools could be developed to translate the semantic representation into human readable text. It seems that, over the long run, creating the original knowledge as a semantic representation, instead of text, would result in a Wikipedia knowledge base that upon query by humans could automatically perform the necessary translation into text in whichever human language the user desired. This concept would also facilitate machine to machine use of the Wikipedia knowledge base, which is currently difficult, if not impossible, due to the textual nature of the information. You could also envision tools that would eventually make it easy for authors to source the article information directly in semantic representation. The end results would be a DBpedia on steroids and the eventual elimination of Wikipedia, as the original article text sources would no longer be needed.
John Flynn

http://semanticsimulations.com


*From:* Sebastian Hellmann [mailto:hellm...@informatik.uni-leipzig.de]
*Sent:* Monday, March 06, 2017 5:56 AM
*To:* DBpedia
*Subject:* [DBpedia-discussion] DBpedia Open Text Extraction Challenge - TextExt
*DBpedia Open Text Extraction Challenge - TextExt*

Website: http://wiki.dbpedia.org/textext
*_Disclaimer: The call is under constant development, please refer to the news section. We also acknowledge the initial engineering effort and will be lenient on technical requirements for the first submissions. We will focus evaluation on the extracted triples and allow late submissions if they are coordinated with us_*.
      Background
DBpedia and Wikidata currently focus primarily on representing factual knowledge as contained in Wikipedia infoboxes. A vast amount of information, however, is contained in the unstructured Wikipedia article texts. With the DBpedia Open Text Extraction Challenge, we aim to spur knowledge extraction from Wikipedia article texts in order to dramatically broaden and deepen the amount of structured DBpedia/Wikipedia data and provide a platform for benchmarking various extraction tools.
      Mission
Wikipedia has become the ubiquitous source of knowledge for the world, enabling humans to look up definitions, quickly become familiar with new topics, read up on background information for news events, and much more - even settling coffee house arguments via a quick mobile search. The mission of DBpedia in general is to harvest Wikipedia’s knowledge, refine and structure it, and then disseminate it on the web - in a free and open manner - for IT users and businesses.
      News and next events
Twitter: Follow @dbpedia <https://twitter.com/dbpedia>, Hashtag: #dbpedianlp <https://twitter.com/search?f=tweets&q=%23dbpedianlp&src=typd>
· LDK <http://ldk2017.org/> conference joined the challenge (Deadline March 19th and April 24th)
· SEMANTiCS <http://2017.semantics.cc/> joined the challenge (Deadline June 11th and July 17th)
· Feb 20th, 2017: Full example added to this website
· March 1st, 2017: Docker image (beta) https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge
Coming soon:
· beginning of March: full example within the docker image
· beginning of March: DBpedia full article text and tables (currently only abstracts) http://downloads.dbpedia.org/2016-10/core-i18n/
      Methodology
The DBpedia Open Text Extraction Challenge differs significantly from other challenges in language technology and other areas in that it is not a one-time call, but a continuously growing and expanding challenge with the focus on *sustainably* advancing the state of the art and transcending boundaries in a *systematic* way. The DBpedia Association and the people behind this challenge are committed to providing the necessary infrastructure and driving the challenge for an indefinite time, as well as potentially extending the challenge beyond Wikipedia.
We provide the extracted and cleaned full text for all Wikipedia articles from 9 different languages in regular intervals for download and as a Docker image in the machine readable NIF-RDF <http://persistence.uni-leipzig.org/nlp2rdf/> format (Example for Barack Obama in English <https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge/blob/master/BO.ttl>). Challenge participants are asked to wrap their NLP and extraction engines in Docker images and submit them to us. We will run participants’ tools in regular intervals in order to extract:
1. Facts, relations, events, terminology, ontologies as RDF triples (Triple track)
2. Useful NLP annotations such as pos-tags, dependencies, co-reference (Annotation track)
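As a rough illustration of what a Triple track tool does, the following minimal sketch turns plain article text into RDF triples serialized as N-Triples. It is not part of the challenge infrastructure: the function name extract_triples, the example.org predicate namespace, the single "born in" pattern and the naive mapping of place names onto DBpedia resource URIs are all assumptions made for this example. A real submission would read the NIF-RDF input and use a far more robust extraction engine.

```python
import re
from typing import List

# DBpedia resource namespace; building URIs by simple string
# concatenation is a simplifying assumption of this sketch.
DBR = "http://dbpedia.org/resource/"
# Hypothetical namespace for the extracted predicate (not a real vocabulary).
EX = "http://example.org/textext/"

# Matches "born in" followed by one or more capitalized words.
BORN_IN = re.compile(r"\bborn in ((?:[A-Z][a-z]+)(?: [A-Z][a-z]+)*)")


def extract_triples(subject_uri: str, text: str) -> List[str]:
    """Return one N-Triples line per 'born in <Place>' match in text."""
    triples = []
    for match in BORN_IN.finditer(text):
        place = match.group(1).replace(" ", "_")
        triples.append(f"<{subject_uri}> <{EX}birthPlace> <{DBR}{place}> .")
    return triples


if __name__ == "__main__":
    sample = "Obama was born in Honolulu, Hawaii."
    for line in extract_triples(DBR + "Barack_Obama", sample):
        print(line)
```

A pattern-based extractor like this only shows the shape of the input and output. Relation extraction on full article text would need entity linking and disambiguation before emitting resource URIs.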
We allow submissions 2 months prior to selected conferences (currently http://ldk2017.org/ and http://2017.semantics.cc/ ). Participants that fulfil the technical requirements and provide a sufficient description will be able to present at the conference and be included in the yearly proceedings. *At each conference, the challenge committee will select a winner among the challenge participants, who will receive 1000€.*
      Results
Every December, we will publish a summary article and proceedings of participants’ submissions at http://ceur-ws.org/ . The first proceedings are planned to be published in Dec 2017. We will try to briefly summarize any intermediate progress online in this section.
      Acknowledgements
We would like to thank the Computer Center of Leipzig University for giving us access to their 6TB RAM server Sirius to run all extraction tools.
The project was created with the support of the H2020 EU projects HOBBIT <https://project-hobbit.eu/> (GA-688227) and ALIGNED <http://aligned-project.eu/> (GA-644055) as well as the BMWi project Smart Data Web <http://smartdataweb.de/> (GA-01MD15010B).
      Challenge Committee
· Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig
· Sören Auer, Fraunhofer IAIS, University of Bonn
· Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University
· Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig
· Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center, InfAI, Leipzig
Contact Email: dbpedia-textext-challe...@infai.org
------------------------------------------------------------------------------
Announcing the Oxford Dictionaries API! The API offers world-renowned
dictionary content that is easy and intuitive to access. Sign up for an
account today to start using our lexical data to power your apps and
projects. Get started today and enter our developer competition.
http://sdm.link/oxford
_________________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net<mailto:DBpedia-discussion@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center

at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association

Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt

Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
