First call for participation: Named Entity rEcognition and Linking (NEEL) Challenge @ The 6th Making Sense of Microposts Workshop (#Microposts2016)

Erp, M.G.J. van Fri, 11 Dec 2015 11:58:19 -0800

*apologies for cross-posting*
=========================================================================================
Named Entity rEcognition and Linking (NEEL) Challenge
at the 6th Making Sense of Microposts Workshop (#Microposts2016) @ WWW 2016
http://microposts2016.seas.upenn.edu/challenge.html
11/12 April 2016, Montréal, Canada


=========================================================================================

Microposts are a highly popular medium to share facts, opinions or emotions. 
They are an invaluable wealth of data, ready to be mined for training 
predictive models. Following the success of the previous three years, we are 
pleased to announce the NEEL challenge, which will be part of the 
#Microposts2016 Workshop at the World Wide Web 2016 conference.

The task of the challenge is to automatically recognise entities and their 
types from English microposts, and link them to the corresponding English 
DBpedia 2014 resources (if the resources exist) or NIL identifiers. 
Participants will have to automatically extract expressions that are formed by 
discrete (and typically short) sequences of words (e.g., Obama, London, 
Rakuten) and recognise their types (e.g., Person, Location, Organisation) from 
a collection of microposts. In the linking stage, the aim is to disambiguate 
the spotted entity to the corresponding DBpedia resource, or to a NIL reference 
if the spotted named entity does not match any resource in DBpedia.

We welcome participants from the previous NEEL Challenges and from the TREC, 
TAC KBP and ERD shared tasks to take part in this year’s challenge.

DATASET
-------------
The dataset consists of tweets extracted from a collection of over 18 million 
tweets. The dataset includes event-annotated tweets provided by the Redites 
project (http://demeter.inf.ed.ac.uk/redites/) covering multiple noteworthy 
events from 2011 and 2013 (including the death of Amy Winehouse, the London 
Riots, the Oslo bombing and the Westgate Shopping Mall shootout), as well as 
tweets extracted from the Twitter firehose in 2014 and 2015 via a selection of 
hashtags. Since the task of this challenge is to automatically recognise and 
link entities, we have built our dataset considering both event and non-event 
tweets. While event tweets are likely to contain entities, non-event tweets 
enable us to evaluate the performance of the system in avoiding false positives 
in the entity extraction phase. The training set is built on top of the entire 
corpus of the NEEL 2014 and 2015 Challenges.

The training set will be released as a TSV file following the TAC KBP format, 
where each line contains the following columns:

1st: tweet identifier [alphanumeric]
2nd,3rd: start/end offsets expressed as the number of UTF-8 characters counted 
from 0 (the beginning of the tweet); spaces are counted too [integer]
4th: link to DBpedia resource or NIL (there may be different NIL identifiers in 
the corpus; each NIL may be reused if multiple mentions in the text 
represent the same entity) [alphanumeric]
5th: salience (confidence score). This field can be assigned randomly, since it 
*will not* be used to rank the submissions [double]
6th: type [alphanumeric]

Columns are separated by TABs. We will advertise the release of the data sets on 
the workshop mailing list. To be informed, please subscribe to 
https://groups.google.com/forum/neelchallenge.
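
As an illustration of the column layout described above, here is a minimal 
Python sketch that reads annotation lines into typed records. It is not an 
official tool. The sample rows, the tweet identifier and the NIL identifier 
"NIL1" are made up for the example, and the sketch does not assume whether the 
end offset is inclusive or exclusive; please check the challenge guidelines.

```python
import csv
import io
from typing import List, NamedTuple


class Annotation(NamedTuple):
    tweet_id: str      # 1st column: tweet identifier
    start: int         # 2nd column: start offset, counted from 0
    end: int           # 3rd column: end offset
    link: str          # 4th column: DBpedia resource URI or a NIL identifier
    salience: float    # 5th column: confidence score (not used for ranking)
    entity_type: str   # 6th column: entity type


def parse_annotations(text: str) -> List[Annotation]:
    """Parse TAB-separated annotation lines into Annotation records."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    annotations = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != 6:
            raise ValueError(f"expected 6 columns, got {len(row)}: {row!r}")
        tweet_id, start, end, link, salience, entity_type = row
        annotations.append(
            Annotation(tweet_id, int(start), int(end), link, float(salience), entity_type)
        )
    return annotations


# Hypothetical sample rows, invented for illustration only.
sample = (
    "100001\t0\t5\thttp://dbpedia.org/resource/Barack_Obama\t1.0\tPerson\n"
    "100001\t13\t20\tNIL1\t1.0\tOrganisation\n"
)

for annotation in parse_annotations(sample):
    print(annotation)
```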


EVALUATION
------------------
Participants are allowed to submit up to 3 runs of their system as TSV files. 
An example of the submission format will be released with the development set. 
We encourage participants to make their systems available to the community to 
facilitate reuse, and we will acknowledge the systems whose source code was 
shared or that were otherwise made accessible for reuse.

We will use the TAC KBP scorer 
(https://github.com/wikilinks/neleval/wiki/Evaluation) to evaluate the results 
and in particular we will focus on:

[tagging]     strong_typed_mention_match (check entity name boundary and type)
[linking]     strong_link_match
[clustering]  mention_ceaf (NIL detection)
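
To give an intuition for the tagging measure, the following Python sketch 
computes precision, recall and F1 over mentions that match exactly on tweet 
identifier, boundaries and type. It is a simplified illustration under our own 
assumptions and not the official scorer; the function name and the example 
mentions are hypothetical. Official results will be computed with the TAC KBP 
scorer linked above.

```python
from typing import Iterable, Tuple

# A mention is identified by (tweet id, start offset, end offset, type).
Mention = Tuple[str, int, int, str]


def precision_recall_f1(
    gold: Iterable[Mention], system: Iterable[Mention]
) -> Tuple[float, float, float]:
    """Score system mentions against gold mentions using exact matching."""
    gold_set, system_set = set(gold), set(system)
    true_positives = len(gold_set & system_set)
    precision = true_positives / len(system_set) if system_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical mentions, invented for illustration only.
gold = [("t1", 0, 5, "Person"), ("t1", 13, 20, "Organisation")]
system = [
    ("t1", 0, 5, "Person"),       # correct boundary and type
    ("t1", 13, 20, "Location"),   # correct boundary, wrong type
    ("t1", 6, 12, "Location"),    # spurious mention
]

print(precision_recall_f1(gold, system))
```

In this example only one of the three system mentions matches a gold mention on 
both boundary and type, so a mention with the right boundary but the wrong type 
counts as an error for both precision and recall.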


PAPER SUBMISSION
-----------------------------
Please submit a paper of 3 pages describing your approach, how you tuned/tested 
it using the training data, and your results on the dev set. All submissions 
must be in 
English. Submissions should be prepared according to the ACM SIG Proceedings 
Template (see http://www.acm.org/sigs/publications/proceedings-templates), and 
should include author names and affiliations, and 3-5 author-selected keywords. 
Along with the paper, authors will submit up to 3 runs of their systems 
computed over the test set. The submission should be made as a single, 
unencrypted zip file that includes a plain text file listing its contents. 
Submission is via EasyChair, at: 
https://easychair.org/conferences/?conf=microposts2016. Each submission will 
receive at least 2 peer reviews.
We aim to publish the #Microposts2016 proceedings via CEUR as a single volume 
containing all three tracks.
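
One possible way to package the runs as a single, unencrypted zip file with a 
plain text listing of its contents is sketched below in Python. The function 
name, the listing file name "CONTENTS.txt" and the run file names are our own 
assumptions for illustration; the challenge guidelines remain the reference 
for what the archive must contain.

```python
import zipfile
from pathlib import Path
from typing import Iterable, Union

PathLike = Union[str, Path]


def build_submission(
    zip_path: PathLike,
    files: Iterable[PathLike],
    manifest_name: str = "CONTENTS.txt",  # hypothetical name for the listing file
) -> Path:
    """Write the given files plus a plain text listing into one zip archive."""
    paths = [Path(f) for f in files]
    # One file name per line; the listing covers the files passed in.
    manifest = "\n".join(p.name for p in paths) + "\n"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for p in paths:
            archive.write(p, arcname=p.name)  # store without directory prefix
        archive.writestr(manifest_name, manifest)
    return Path(zip_path)


# Example call (file names are hypothetical):
# build_submission("submission.zip", ["run1.tsv", "run2.tsv", "run3.tsv"])
```

The standard zipfile module writes archives without encryption, which fits the 
requirement stated above.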


WILLING TO JOIN THE CHALLENGE?
---------------------------------------------------------
1- register your team at http://goo.gl/forms/2R7zagtUJZ and subscribe to 
https://goo.gl/vsyq0O
2- download the agreement https://goo.gl/idFdyP, sign it, and send the pdf to 
[email protected] and [email protected]
3- download the challenge guidelines https://goo.gl/XGmpuY
4- shortly after, you will receive the instructions on how to obtain the 
dataset
5- check out the challenge timeline and follow up


IMPORTANT DATES
----------------------------
*Release of training set*: from 7 December 2015
*Release of dev set*: 30 December 2015
*Release of test set*: 31 January 2016
*Submission of results*: 7 February 2016
*Submission of reports*: 7 February 2016
*Challenge Notification*: 18 February 2016

*Challenge camera-ready deadline*: 28 February 2016
*Workshop*: 11/12 April 2016 (Registration open to all)
(All deadlines 23:59 Hawaii Time)

CONTACT
--------------
Mailing list: https://groups.google.com/forum/neelchallenge
Twitter hashtags: #neel #microposts2016
Twitter account: @Microposts2016
W3C Microposts Community Group: http://www.w3.org/community/microposts


CHALLENGE ORGANISERS:
---------------------------------------
Giuseppe Rizzo, Istituto Superiore Mario Boella, Italy
Marieke van Erp, Vrije Universiteit Amsterdam, Netherlands


CHALLENGE COMMITTEE:
-------------------------------------
Ebrahim Bagheri, Ryerson University, Canada
Pierpaolo Basile, University of Bari, Italy
David Corney, Signal Media, UK
Grégoire Burel, KMi, Open University, UK
Milan Dojchinovski, Leipzig University, Germany/Czech Technical University, 
Czech Republic
Guillaume Erétéo, Vigiglobe, France
Anna Lisa Gentile, The University of Sheffield, UK
José M. Morales del Castillo, El Colegio de México, Mexico
Bernardo Pereira Nunes, PUC-Rio, Brazil
Giles Reger, The University of Manchester, UK
Irina Temnikova, Qatar Computing Research Institute, Qatar
Victoria Uren, Aston University, UK



--
Computational Lexicology & Terminology Lab (CLTL)
The Network Institute, Vrije Universiteit Amsterdam

De Boelelaan 1105
1081 HV  Amsterdam, The Netherlands
http://www.mariekevanerp.com
http://www.newsreader-project.eu
