date:20201217

[Mt-list] 2nd CFP: The 6th Arabic Natural Language Processing Workshop

2020-12-17 Thread Samia Touileb

**apologies for cross-posting**



 Second Call for Papers 

The 6th Arabic Natural Language Processing Workshop (WANLP-6 2021) will be 
collocated with EACL 2021.

We invite submissions on topics of natural language processing that include, 
but are not limited to, the following:

  - Basic core technologies: morphological analysis, disambiguation, 
tokenization, POS tagging, named entity detection, chunking, parsing, semantic 
role labeling, Arabic dialect modeling, etc.
  - Applications: machine translation, speech recognition, speech synthesis, 
optical character recognition, pedagogy, assistive technologies, social media 
analytics, sentiment analysis, summarization, dialogue systems, etc.
  - Resources: lexicons, dictionaries, annotated and unannotated corpora, etc.

Submissions may include work in progress as well as finished work that has not 
been previously published. Submissions must have a clear focus on specific 
issues pertaining to the Arabic language whether it is standard Arabic, 
dialectal, or mixed. Papers on other languages sharing problems faced by Arabic 
NLP researchers such as Semitic languages or languages using Arabic script are 
welcome. Additionally, papers on efforts using Arabic resources but targeting 
other languages are also welcome. Descriptions of commercial systems are 
welcome, but authors should be willing to discuss the details of their work.

*Shared Tasks*

Two shared tasks will be associated with the workshop this year:

 - Shared Task 1: NADI 2021 -- Arabic dialect identification. This shared task 
targets fine-grained dialect identification with new datasets and efforts to 
distinguish both modern standard Arabic (MSA) and dialects (DA) according to 
their geographical origin.

 - Shared Task 2: Sarcasm and Sentiment Detection In Arabic. The shared 
task will focus on analysing tweets and identifying their sentiment and whether 
a tweet is sarcastic or not.


*Important Dates*

   - February 1, 2021: Workshop Paper Due Date (Extended)
   - February 22, 2021: Notification of Acceptance
   - March 1, 2021: Camera-ready papers due (strict!)
   - April 19-20, 2021: Workshop Dates


*Submission Details*

This year we invite two types of research papers (long and short), demo papers, 
and shared task description papers. Long research papers may consist of up to 8 
pages of content, plus unlimited references. Short research papers, demo 
papers, and shared task description papers may consist of up to 4 pages of 
content, plus unlimited references. Submissions are made via Softconf.

*Submission Link*: https://www.softconf.com/eacl2021/WANLP2021/


*WANLP 2021 Organizing Committee*

General Chair:
  Nizar Habash, New York University Abu Dhabi, UAE.

Program Chairs:
  - Houda Bouamor, Carnegie Mellon University in Qatar.
  - Hazem Hajj, American University of Beirut, Lebanon.
  - Walid Magdy, University of Edinburgh, Scotland.
  - Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar.

Publication Chairs:
  - Fethi Bougares, University of Le Mans, France.
  - Nadi Tomeh, LIPN, Université Paris 13, Sorbonne Paris Cité.

Publicity Chairs:
  - Ibrahim Abu Farha, University of Edinburgh, Scotland.
  - Samia Touileb, University of Oslo, Norway.

Ex-General Chairs / Advisors:
  - Wassim El-Hajj, American University of Beirut, Lebanon.
  - Imed Zitouni, Google, USA.



*Advisory Committee:*
Muhammad Abdul-Mageed, Ahmed Ali, Hend Alkhalifa, Houda Bouamor, Fethi 
Bougares, Khalid Choukri, Kareem Darwish, Mona Diab, Mahmoud El-Haj, Samhaa 
El-Beltagy, Wassim El-Hajj, Nizar Habash, Lamia Hadrich Belguith, Hazem Hajj, 
Walid Magdy, Khaled Shaalan, Kamel Smaili, Nadi Tomeh, Wajdi Zaghouani, Imed 
Zitouni.

*Shared Task 1:*
  - Muhammad Abdul-Mageed, Chiyu Zhang (The University of British Columbia, 
Canada), Nizar Habash (New York University Abu Dhabi), and Houda Bouamor 
(Carnegie Mellon University, Qatar).

*Shared Task 2:*
  - Ibrahim Abu Farha (The University of Edinburgh, UK), Wajdi Zaghouani 
(Hamad Bin Khalifa University Doha, Qatar), and Walid Magdy (The University of 
Edinburgh, UK).


For questions or comments regarding WANLP-6 please contact Ibrahim Abu Farha 
(i.abufarha AT ed.ac.uk) and Samia Touileb (samiat AT ifi.uio.no).



Samia Touileb

Postdoc
Language Technology Group
Section for Machine Learning
Department of Informatics, University of Oslo

___
Mt-list site list
Mt-list@eamt.org
http://lists.eamt.org/mailman/listinfo/mt-list

[Mt-list] Call for Participation - Vardial Evaluation Campaign 2021

2020-12-17 Thread Zampieri, Marcos

Call for Participation - VarDial Evaluation Campaign 2021

Within the scope of the eighth VarDial workshop, co-located with EACL 2021, we 
are organizing an evaluation campaign on similar languages, varieties and 
dialects with four shared tasks.

URL: https://sites.google.com/view/vardial2021/evaluation-campaign

To participate and to receive the training data, please fill in the registration 
form available on the workshop website. The training sets will be released on 
Monday, December 21, 2020.

The tasks we are organizing this year are the following (please check the 
website for more information):

- DLI - Dravidian Language Identification: Dravidian languages are a language 
family spoken mainly in the south of India. The four major literary Dravidian 
languages are Tamil (ISO 639-3: tam), Telugu (ISO 639-3: tel), Malayalam (ISO 
639-3: mal), and Kannada (ISO 639-3: kan). Tamil, Malayalam, and Kannada are 
closely related, belonging to the South Dravidian subgroup. The DLI shared task 
provides participants with a collection of 16,672 YouTube comments as a training 
set. The comments contain code-mixed sentences with English and one of the 
South Dravidian languages (Tamil, Malayalam or Kannada). All comments were 
written in Roman script (a non-native script). The task is to identify the 
language of each comment.

- RDI - Romanian Dialect Identification: In this second iteration of the 
Romanian Dialect Identification (RDI) shared task we provide participants with 
an augmented version of the MOROCO data set for training, which contains 
Moldavian (MD) and Romanian (RO) samples of text collected from the news 
domain. A new test set has been collected which will allow participants to 
improve the results they obtained in VarDial 2020. The task is a binary 
classification by dialect, in which a classification model is required to 
discriminate between the Moldavian (MD) and the Romanian (RO) dialects. The 
task is closed; participants are therefore not allowed to use external data to 
train their models. The test set will contain newly collected text samples, not 
previously included in MOROCO. The test samples will come from a different 
domain, hence the methods have to take the cross-domain nature of the task into 
account. RDI participants may use other external resources in their systems, 
e.g. unlabelled corpora, lexicons, pre-trained embeddings, etc.

- SMG - Social Media Variety Geolocation: In this second iteration of the SMG 
task, we again focus on a geolocation (rather than identification) task: given 
a text, the participants have to predict its geographic location in terms of 
latitude/longitude coordinates. Using data from the social media platforms 
Twitter and Jodel, we provide extended datasets for the same three subtasks as 
in 2020: 1. Standard German Jodels; 2. Swiss German Jodels; 3. BCMS Tweets. 
All three subtasks will use the same data format and evaluation methodology, 
and participants are encouraged to submit their systems for all subtasks.

- ULI - Uralic Language Identification: This task focuses on discriminating 
between the languages in the Uralic group as defined by the ISO 639-3 standard. 
This is an open public leaderboard competition following VarDial 2020 where 
participants can submit at any point until the final submission date. The task 
includes 29 individual relevant languages, some of which are extremely closely 
related and similar, such as Kven Finnish (fkv) and Tornedalen Finnish (fit). 
These languages are spoken from Scandinavia, Estonia, and Finland all the way to 
Russian Siberia.

Best,
Marcos
