[DBpedia-discussion] CFP: Semantic Web Journal - Special Issue on Linked Data for Information Extraction

Anna Lisa Gentile Fri, 25 Nov 2016 05:27:07 -0800

*---------------------------

SEMANTIC WEB JOURNAL - Call for papers: SPECIAL ISSUE ON Linked Data for Information Extraction

---------------------------
*Submission deadline: 07 April 2017, Hawaii-Time
--------------------*

Information Extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. It is a crucial technology to enable the Semantic Web vision.

One of the major bottlenecks for the current state of the art in IE is the availability of learning materials (e.g., seed data, training corpora), which traditionally are created manually and are expensive to build and maintain. Linked Data (LD) defines best practices for exposing, sharing, and connecting data, information, and knowledge on the Semantic Web using uniform means such as URIs and RDF. It has so far created a gigantic knowledge source of Linked Open Data (LOD), which now comprises billions of triples (facts). This has created unprecedented opportunities for Information Extraction.

Linked Data offers a uniform approach to link resources uniquely identifiable by URIs. This creates a large knowledge base of entities and concepts, connected by semantic relations. Such resources can be valuable for seeding distant learning. Moreover, initiatives such as RDFa (supported by W3C) or Microdata (used by schema.org and supported by major search engines) constantly produce a vast amount of annotated web pages which can be used as training data in the traditional machine learning paradigm.

However, powering IE using LOD faces major challenges. First, discovering relevant learning materials on LOD for specific IE tasks is non-trivial due to (i) the highly heterogeneous vocabularies used by data publishers and (ii) the lack of contextual information for annotated content on web pages (e.g., annotations are often found predominantly in page headers) and the skewed distribution towards popular entities. Users are often required to be familiar with the datasets, vocabularies, and query languages that data publishers use to expose their data. Unfortunately, considering the sheer size and diversity of LOD, imposing such requirements on users is infeasible. Second, it is known that the coverage of domains can be very imbalanced, and for certain domains the data can be very sparse. Furthermore, the majority of LOD is created automatically by converting legacy databases with limited or no human validation, so data inconsistency and redundancy are widespread.

Another crucial aspect in IE research is the shift of attention from purely unstructured text to semi-structured content. Two main sources of interest are Web tables and Open Data (often available as CSV files). These data are particularly rich in content and relations but often lack the contextual data that classical IE methods typically rely on.

The aim of this special issue is to foster research on methodologies that exploit Linked Data for Information Extraction, to answer questions such as: to what extent can we identify domain-specific learning resources for IE; how to identify and deal with noise in the learning resources; how can these learning resources be used to train IE models, both for classical unstructured text and for semi-structured content; and how should the information extracted by such models integrate into the existing LOD.



Topics of Interest
------------------------------

We solicit original papers addressing the challenges and research questions mentioned above. Topics of interest include, but are not limited to, the ones below. Note that work must make use of Linked Data of any form and must be related to Information Extraction in some way. Please contact the editors if in doubt.

- Methods for generating seed data for IE (e.g., distant supervision) from Linked Data
- Methods for identifying labelled data for IE from annotated webpage content under initiatives such as RDFa and the Microdata format (schema.org)

- IE tasks exploiting Linked Data in any form, such as (not limited to)
    * wrapper induction
    * table annotation
    * named entity recognition
    * relation extraction
    * ontology population, ontology expansion (A-box)
    * ontology learning (T-box)
- Methods for identifying and reducing noise in the context of IE tasks
- Disambiguation using Linked Data
- IE for knowledge graph construction


Submission Instructions
-----------------------------

Submissions shall be made through the Semantic Web journal website at http://www.semantic-web-journal.net. Prospective authors must take notice of the submission guidelines posted at http://www.semantic-web-journal.net/authors. Note that you need to request an account on the website for submitting a paper. Please indicate in the cover letter that it is for the "Linked Data for Information Extraction" special issue.

All manuscripts will be reviewed based on the SWJ open and transparent review policy and will be made available online during the review process.



Guest editors
--------------------
Anna Lisa Gentile, University of Mannheim, Germany
Ziqi Zhang, Nottingham Trent University, UK

The call is also available at the official journal website: http://www.semantic-web-journal.net/blog/call-papers-special-issue-linked-data-information-extraction


*

--
Anna Lisa Gentile
Postdoctoral Researcher
Data and Web Science Group
University of Mannheim
https://w3id.org/people/annalisa
email:annal...@informatik.uni-mannheim.de
office: +49 621 181 2646
skype: anlige

------------------------------------------------------------------------------

_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
