*apologies for cross-posting*
---------------------------------------------------------------------------
Fourth International Workshop on NLP & DBpedia 2016
17 or 18 October 2016
Kobe, Japan
Collocated with the 15th International Semantic Web Conference (ISWC2016)
http://iswc2016.semanticweb.org/
Submission Deadline: 1 July 2016
Notification of Acceptance: 31 July 2016
Workshop URI: https://nlpdbpedia2016.wordpress.com/
Submissions via: https://easychair.org/conferences/?conf=nlpdbpedia2016
Hashtag: #NLPDBP2016
Contact: nlpdbpedia2...@easychair.org
---------------------------------------------------------------------------
Motivation
The central role of Wikipedia (and therefore DBpedia) for the creation of a
Translingual Web has been recognized by the Strategic Research Agenda (cf. section
3.4, page 23), and most of the contributions to the Dagstuhl seminar on the
Multilingual Semantic Web also stress the role of Wikipedia for multilingualism.
The previous editions of the NLP & DBpedia workshop have also contributed to this
understanding.
As more and more language-specific chapters of DBpedia are created (currently
14 language editions), DBpedia is becoming a driving factor for a Linguistic
Linked Open Data cloud as well as localized LOD clouds with specialized domains
(e.g. the Dutch windmill domain ontology created from http://nl.dbpedia.org or
the Japanese domain ontology of screws from http://ja.dbpedia.org/).
The data contained in Wikipedia and DBpedia have properties that make them an
ideal controlled testbed for NLP. Wikipedia and DBpedia are multilingual and
multi-domain, the communities maintaining these resources are very open, and it
is easy to join and contribute. The open licence allows data consumers to
benefit from the content, and many parts are collaboratively editable.
In particular, the data in DBpedia is widely used and disseminated throughout the
Semantic Web.
With the foundation of the DBpedia Association and the frequent releases of the
DBpedia+ Data Stack, this workshop hopes to channel contributions of the NLP
research community into the data ecosystem of DBpedia and LOD, thus easing the
use of interlinked language resources as well as increasing the performance of
knowledge-based NLP approaches.
We envision that the workshop will produce the following outcomes:
• an open call to the DBpedia data consumer community that will
generate a wish list of data to be generated from Wikipedia using NLP
methods (for certain domains and application scenarios). This wish list will be
broken down into tasks and benchmarks, and as a result a gold standard will be
created.
• benchmarks and test data, which will be collected and published
under an open licence for future evaluation (inspired by
http://oaei.ontologymatching.org/ and
http://archive.ics.uci.edu/ml/datasets.html).
• a stronger link between the DBpedia and NLP communities, which
currently meet twice a year at DBpedia developers workshops.
• an opportunity for all authors to contribute their data to the
regular DBpedia releases in April and October.
NLP4DBpedia
DBpedia has been around for quite a while, infusing the Web of Data with
multi-domain data of decent quality. The data in DBpedia is, however, mostly
extracted from Wikipedia infoboxes, while the remaining parts of Wikipedia are
to a large extent not exploited for DBpedia. Here, NLP techniques may help
improve DBpedia.
Extracting additional triples from the plain text information in Wikipedia,
either unsupervised or using the existing triples as training information,
could multiply the information in DBpedia, or help tell correct from
incorrect information by finding supporting text passages. Furthermore,
analyzing the semantics of other structures in Wikipedia, such as tables,
lists, or categories, would help make DBpedia richer. Finally, since Wikipedia
exists in more than 200 languages, we are particularly interested in seeing NLP
approaches not only working for English, but also for other languages, in order
to leverage the huge amount of knowledge captured in the different language
editions.
NLP approaches can also improve the quality of DBpedia, especially by
extracting content from sources other than Wikipedia that may validate the data
in DBpedia.
DBpedia4NLP
On the other hand, NLP and information extraction techniques often involve
various resources while processing texts from different domains. As
high-quality annotated data is often too expensive and time-consuming to
obtain, NLP researchers are increasingly looking to the Semantic Web for
external structured sources to complement their datasets. Such resources can be
gazetteers to aid a named entity recognition system or examples of relations
between entities to bootstrap a relation finder. DBpedia can easily be utilised
to assist NLP modules in a variety of tasks.
We invite papers from both these areas including:
• Knowledge extraction from text and HTML documents (especially
unstructured and semi-structured documents) on the Web, using information in
the Linked Open Data (LOD) cloud, and especially in DBpedia.
• Representation of NLP tool output and NLP resources as RDF/OWL, and
linking the extracted output to the LOD cloud or the Linguistic LOD cloud.
• Novel applications using the extracted knowledge, the Web of Data or
NLP DBpedia-based methods.
The specific topics of the workshop are listed below.
Topics
• Enhancing DBpedia with NLP methods
• Finding errors in DBpedia with NLP methods
• Enriching DBpedia with NLP methods
• Improving quality of DBpedia with NLP methods
• Annotation methods for Wikipedia articles
• Cross-lingual data and text mining on Wikipedia
• Pattern and semantic analysis of natural language, reading the Web,
learning by reading
• Large-scale information extraction
• Entity resolution and automatic discovery of Named Entities
• Multilingual entity recognition task of real world entities
• Frequent pattern analysis of entities
• Relationship extraction, slot filling
• Entity linking, Named Entity disambiguation, cross-document
co-reference resolution
• Analysis of ontology models for natural language text
• Learning and refinement of ontologies
• Natural language taxonomies modeled to Semantic Web ontologies
• Use cases of entity recognition for Linked Data applications
• Impact of entity linking on information retrieval, semantic search
Furthermore, an informal list of NLP tasks can be found on this Wikipedia
page: http://en.wikipedia.org/wiki/Natural_language_processing#Major_tasks_in_NLP
These are relevant for the workshop as long as they fit into the DBpedia4NLP
and NLP4DBpedia frame (i.e. the data used revolves around Wikipedia and DBpedia).
All papers must represent original and unpublished work that is not currently
under review. Papers will be evaluated according to their significance,
originality, technical content, style, clarity, and relevance to the workshop.
At least one author of each accepted paper is expected to attend the workshop.
Together with the KéKi workshop (http://keki2016.linguistic-lod.org/), we have
applied for joint proceedings publication in the Lecture Notes in Computer
Science (LNCS) series by Springer; the application is currently under review.
We welcome the following types of contributions:
* Full research papers (up to 16 pages).
* Position papers (up to 12 pages).
All submissions must be written in English and must be formatted according to
the style for Lecture Notes in Computer Science (LNCS) authors. Please submit
your contributions electronically in PDF format to
https://www.easychair.org/conferences/?conf=nlpdbpedia2016
For details on the LNCS style, see the Springer Author Instructions at
http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0. NLP & DBpedia
2016 submissions are not anonymous.
Important Dates:
- submission date: 1 July 2016, 23:59 Hawaii time
- author notifications: 31 July 2016, 23:59 Hawaii time
- pre-workshop paper: 1 September 2016
- NLP & DBpedia 2016: 17 or 18 October 2016
- camera-ready for post-proceedings: 18 November 2016, 23:59 Hawaii time
Organising Committee:
Heiko Paulheim, University of Mannheim
Marieke van Erp, Vrije Universiteit Amsterdam
Pablo N. Mendes, IBM Research, USA
Programme committee:
Caroline Barriere, CRIM
Christian Bizer, University of Mannheim
Volha Bryl, Springer Nature
Paul Buitelaar, Insight - National University of Ireland, Galway
Philipp Cimiano, Bielefeld University
Agata Filipowska, Department of Information Systems, Poznan University of
Economics
Jorge Gracia, Ontology Engineering Group, Universidad Politécnica de Madrid
Anja Jentzsch, Hasso Plattner Institut
John P. McCrae, National University of Ireland, Galway
Andrea Moro, Sapienza, Università di Roma
Roberto Navigli, Sapienza Università di Roma
Giuseppe Rizzo, ISMB
Harald Sack, Hasso-Plattner-Institute for IT Systems Engineering, University of
Potsdam
Felix Sasaki, W3C
Ricardo Usbeck, University of Leipzig
Sebastian Walter, CITEC, Bielefeld University
Krzysztof Wecel, Poznan University of Economics
--
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2661
B6, 26, Room C1.09
D-68159 Mannheim
Mail: he...@informatik.uni-mannheim.de
Web: www.heikopaulheim.com