[Corpora-List] LAST call for papers: ICNLSP 2022, 5th International Conference on Natural Language and Speech Processing, 16, 17 December 2022

2022-08-18 Thread Mourad Abbas
*ICNLSP 2022: LAST call for papers*


Dear all,

We are delighted to announce that ICNLSP 2022, the 5*th* edition of the
International Conference on Natural Language and Speech Processing, hosted
by DataScientia (University of Trento) for the third time, will be
held online on 16-17 December 2022.


*Important dates*

*Submission deadline*:  *30 August 2022*
*Notification of acceptance*: *31 October 2022*
*Camera-ready paper due*: *20 November 2022*
*Conference dates*: *16, 17 December 2022*


*Publication*

1- All accepted papers will be published in ACL Anthology, and indexed in
DBLP.

2- Selected papers will be published in Signals and Communication
Technology (Springer) (https://www.springer.com/series/4748), indexed by
Scopus and zbMATH.



*Keynote speakers*



1. *Eric Laporte*, *Gustave Eiffel University*,  *France.*



2. *Jan Niehues*, *University of Maastricht*, *Netherlands.*


3. *Ahmed Ali*, *Qatar Computing Research Institute*, *Qatar*.




*Workshop: NSURL 2022*

The workshop on NLP Solutions for Under Resourced Languages (NSURL) will be
held with ICNLSP 2022. The workshop aims to be a forum for solving NLP tasks
concerning Arabic and its dialects, as well as other under-resourced
languages such as African languages, Persian, etc.



We look forward to welcoming you to ICNLSP 2022, which will be an opportunity
to get acquainted with the latest research in the field of natural language
and speech processing. We hope that your active participation will make it a
success.

*Contact*

icnlsp2...@easychair.org
___
Corpora mailing list -- corpora@list.elra.info
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to corpora-le...@list.elra.info


[Corpora-List] PhD and Postdoc positions at CIS/LMU Munich in NLP and Deep Learning

2022-08-18 Thread Barbara Plank
The Center for Information and Language Processing (CIS) at LMU Munich
has several fully-funded positions in Natural Language Processing and
Deep Learning available in the groups of Barbara Plank and Hinrich
Schütze.

Application deadline: September 8th, 2022
Details and application: https://www.cis.lmu.de/web/jobs2022.html


[Corpora-List] Tenure-track Lecturer positions at CIS/LMU Munich in Computational Linguistics and NLP

2022-08-18 Thread Barbara Plank
The Center for Information and Language Processing (CIS) at LMU Munich
(co-directed by Barbara Plank and Hinrich Schütze) has two open
tenure-track lecturer positions (Akademische/r Rat/Raetin auf
Lebenszeit) in computational linguistics / natural language
processing.

Application deadline: September 30th, 2022
Details and application: https://www.cis.lmu.de/web/arpositions2022.html


[Corpora-List] Survey Study on Sign Language Computation using Machine Learning  

2022-08-18 Thread Hal Daume
Hello! We are a team of researchers from MSR New England and New York. We are 
seeking participants (aged 18 or older) who have experience in machine learning 
or are interested in applying machine learning to developing computational 
models for signed languages for a survey study.



The purpose of this project is to explore how machine learning practitioners 
can better build machine learning models for sign language computation (e.g., 
recognition/translation). We want to understand your general motivations in 
working with machine learning problems and expected challenges when newly 
working with sign language data and tasks. Please know that sign language 
knowledge or sign language computation experience is NOT required to 
participate in this project.

The survey can be found at https://forms.office.com/r/7LPnkdTFLN along with a 
consent form for further details. For every submission of the survey, $10 will 
be donated to LEAD-K (Language Equity and Acquisition for Deaf Kids), up to the 
first 50 submissions. Only one submission per person is accepted.

Once you give your consent, you will be directed to the survey questions. It 
will take about 30 minutes to answer the questions, which cover your experience 
in machine learning and sign language computation (if any), understanding of 
sign language culture, and demographics such as your age or education level.



Your responses will be anonymous unless you choose to provide your name and 
email address for future contact, in which case you will be invited to 
participate in a paid study to collaborate with American Sign Language experts. 
Your name and email address will never be shared outside of the research team.



Please complete the survey by Tuesday, 8/23 and feel free to forward this to 
other colleagues who may be interested!



Thank you so much for your consideration!

Rie Kamikubo, Danielle Bragg, Alex Lu, Hal Daumé III



[Corpora-List] August 2022 Newsletter - LDC

2022-08-18 Thread Penn LDC via Corpora
In this newsletter:
Fall 2022 LDC Data Scholarship Program
30th Anniversary Highlight: The LDC Gigawords

New publication:
HAVIC MED Novel 2 Test - Videos, Metadata and 
Annotation

Fall 2022 LDC Data Scholarship Program
Student applications for the Fall 2022 LDC Data Scholarship program are being 
accepted now through September 15, 2022. This program provides eligible 
students with no-cost access to LDC data. Students must complete an application 
consisting of a data use proposal and letter of support from their advisor. For 
application requirements and program rules, visit the LDC Data Scholarships 
page.

30th Anniversary Highlight: The LDC Gigawords
Giga: a combining form meaning "billion," used in the formation of compound 
words (Source: https://www.dictionary.com/browse/giga-)

LDC's Gigaword corpora are a natural outgrowth of its vast decades-long 
multi-language newswire collection. Newswire data was originally collected, 
annotated, and distributed for use in many sponsored projects and was also 
released through the LDC catalog in tailored data sets. Then came the idea of 
making LDC's entire newswire collection available by language with a simple, 
minimal markup to support a broad range of NLP/HLT tasks. The first Arabic, 
Chinese, and English Gigaword editions were released in 2003; subsequent 
cumulative releases through fifth editions in 2011 represent LDC's newswire 
collection spanning 1994-2010 in those languages. French and Spanish Gigawords 
were first published in 2006, culminating in the release of third editions in 
2011, likewise covering newswire collected by LDC through 2010.

The community has used, and continues to use, these data sets in numerous ways. 
Automatic text summarization is a favorite, and current work in this area 
applies deep learning principles (see, e.g., Gao et al. 2020, English). 
Gigawords are also useful for text source classification (Huang et al. 2003, 
Chinese), information extraction (Lan et al. 2020, Arabic), knowledge 
extraction and distributional semantics (Napoles et al. 2012, English), and 
natural language understanding (Ganitkevitch 2013, English), among other 
fields. Recent variations like the annotated and concretely annotated English 
Gigawords add syntactic, semantic, and coreference annotations to this 
billion-word text collection.

All Gigaword corpora are available for licensing by Consortium members and 
non-members. Visit Obtaining Data for more information.

New publication:

HAVIC MED Novel 2 Test - Videos, Metadata and Annotation comprises 6,200 
hours of user-generated videos with annotation and metadata developed by LDC 
for the 2015 NIST Multimedia Event Detection tasks. The data consists of videos 
of various events (event videos) and videos completely unrelated to events 
(background videos). Each event video was manually annotated with judgments 
describing its event properties and other salient features. Background videos 
were labeled with topic and genre categories.

HAVIC MED Novel 2 Test -- Videos, Metadata and Annotation is distributed via 
web download.

2022 Subscription Members will automatically receive copies of this corpus. 
2022 Standard Members may request a copy as part of their 16 free membership 
corpora. This corpus is a members-only release and is not available for 
non-member licensing. Contact l...@ldc.upenn.edu for 
information about membership.

Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: l...@ldc.upenn.edu
M: 3600 Market St. Suite 810
  Philadelphia, PA 19104








[Corpora-List] Re: Corpora Digest, Vol 219, Issue 1

2022-08-18 Thread Rocco Tripodi
On Fri 12 Aug 2022 at 14:00  wrote:

> Send Corpora mailing list submissions to
> corpora@list.elra.info
>
> To subscribe or unsubscribe via email, send a message with subject or
> body 'help' to
> corpora-requ...@list.elra.info
>
> You can reach the person managing the list at
> corpora-ow...@list.elra.info
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Corpora digest..."
>
> Today's Topics:
>
>1. [CfP] TREC Health Misinformation Track 2022 (Maria Maistro)
>2. [CfP] ACM TOIS Efficiency in Neural IR (Maria Maistro)
>3. Call for Badges - ACM SIGIR Artifact Badges Continuous Submission
>   (Nicola Ferro)
>4. Call for proposals: Natural Language Processing (John Benjamins)
>   (Caro)
>
>
> --
>
> Message: 1
> Date: Fri, 12 Aug 2022 08:03:18 +
> From: Maria Maistro 
> Subject: [Corpora-List] [CfP] TREC Health Misinformation Track 2022
> To: "corpora@list.elra.info" 
> Message-ID: <86b5f708-9063-456a-b790-888b9639e...@ku.dk>
> Content-Type: multipart/alternative;
> boundary="_000_86B5F7089063456AB790888B9639E00Fkudk_"
>
> Call for Participation - TREC Health Misinformation Track 2022
> https://trec-health-misinfo.github.io
>
> Overview
> --
> Web search engines are frequently used to help people make decisions about
> health-related issues. Unfortunately, the web is filled with misinformation
> regarding the efficacy of treatments for health issues. Search users may
> not be able to discern correct from incorrect information, nor credible
> from non-credible sources. As a result of finding misinformation deemed by
> the user to be useful to their decision making task, they can make
> incorrect decisions that waste money and put their health at risk.
>
> The TREC Health Misinformation track fosters research on retrieval methods
> that promote reliable and correct information over misinformation for
> health-related decision making tasks.
>
> Tasks 
> --
> * Ad-hoc Retrieval Task: design a ranking model that promotes credible and
> correct information over incorrect information;
> * Answer Prediction Task: predict the answer to the topic’s stance.
>
> Guidelines
> --
> * Corpus: noclean version of the C4 dataset (
> https://huggingface.co/datasets/allenai/c4);
> * Topics: about consumer health search (people seeking health advice
> online);
> * Runs: runs may be either automatic or manual with the standard TREC run
> format.
>
> Detailed guidelines: https://trec-health-misinfo.github.io
>
> Important Dates 
> --
> * Runs due from participants: August 28, 2022
> * Evaluation results returned: End of September 2022
> * Notebook paper due: October 2022
> * TREC 2022 Conference: November 14-18, 2022
> * Final paper due: February 2023
>
> Organization 
> --
> * Charles Clarke, University of Waterloo
> * Maria Maistro, University of Copenhagen
> * Mark Smucker, University of Waterloo
>
>
> ———
>
> Maria Maistro, PhD
> Tenure-track Assistant Professor
> Department of Computer Science
> University of Copenhagen
> Universitetsparken 5, 2100 Copenhagen, Denmark
>
> --
>
> Message: 2
> Date: Fri, 12 Aug 2022 08:07:34 +
> From: Maria Maistro 
> Subject: [Corpora-List] [CfP] ACM TOIS Efficiency in Neural IR
> To: "corpora@list.elra.info" 
> Message-ID: <3d36af0d-fa6d-4a4f-be65-df108b703...@ku.dk>
> Content-Type: multipart/alternative;
> boundary="_000_3D36AF0DFA6D4A4FBE65DF108B703AD2kudk_"
>
> Call for Papers - ACM Transactions on Information Systems
> Special Section on Efficiency in Neural Information Retrieval
>
> Full Call for Papers: https://dl.acm.org/journal/tois/calls-for-papers
>
> Overview
> --
> The aim of this Special Section is to engage with researchers in
> Information Retrieval, Natural Language Processing, and related areas and
> gather insight into the core challenges in measuring, reporting, and
> optimizing all facets of efficiency in Neural Information Retrieval (NIR)
> systems, including time-, space-, resource-, sample- and energy-
> efficiency, among other factors.
> This special section solicits perspectives from active researchers to
> advance our understanding of and to overcome efficiency challenges in NIR.
> In particular, researchers are encouraged to examine the ever-growing
> model complexity through appropriate empirical analysis, to propose models
> that require less data, computational resources, and energy for training
> and fine-tuning with similarly efficient inference, to ask if there are
> meaningful 

[Corpora-List] Final CFP: The 7th Arabic Natural Language Processing Workshop, WANLP-7 2022, / Co-located with EMNLP 2022

2022-08-18 Thread Wajdi Zaghouani
*** Apologies for Cross-Posting ***

The 7th Arabic Natural Language Processing Workshop (WANLP2022) will be a
full-day event taking place on December 8, 2022 (in a hybrid mode). This
year’s WANLP is co-located with EMNLP 2022 in Abu Dhabi, United Arab
Emirates.

Workshop URL: http://wanlp2022.arabic-nlp.net/

Submission URL: https://softconf.com/emnlp2022/WANLP2022

Important Dates

   - September 5: Workshop Paper Due Date
   - October 10: Notification of Acceptance
   - October 21: Camera-ready papers due (strict!)
   - December 7-8: Workshop Dates


We invite submissions on topics that include, but are not limited to, the
following:


   - Enabling core technologies: morphological analysis, disambiguation,
     tokenization, POS tagging, named entity detection, chunking, parsing,
     semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
   - Applications: machine translation, speech recognition, speech synthesis,
     optical character recognition, pedagogy, assistive technologies, social
     media, etc.
   - Resources: dictionaries, annotated data, corpus, etc.


Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the
Arabic language whether it is standard Arabic, dialectal, classical, or
mixed. Papers on other languages sharing problems faced by Arabic NLP
researchers, such as Semitic languages or languages using Arabic script,
are welcome provided that they propose techniques or approaches that would
be of interest to Arabic NLP, and they explain why this is the case.
Additionally, papers on efforts using Arabic resources but targeting other
languages are also welcome. Descriptions of commercial systems are welcome,
but authors should be willing to discuss the details of their work.

We have several submission tracks including long, short, and demo tracks.

If you have any questions, please contact us at: wanlp2...@gmail.com

The WANLP 2022 Organizing Committee

http://wanlp2022.arabic-nlp.net/



*Wajdi Zaghouani, Ph.D.*

*Assistant Professor*
College of Humanities and Social Sciences

P.O. Box 34110 | Education City | Doha, Qatar
tel: +974 4454 5601 | mob: +974 33454992

wzaghou...@hbku.edu.qa| Office A141, LAS Building


[Corpora-List] Call for applications: 1 year MRes in Translation and Interpreting Studies at University of Surrey

2022-08-18 Thread Constantin Orasan via Corpora
The Centre for Translation Studies (CTS) at University of Surrey invites 
applications for a place in our MRes in Translation and Interpreting Studies 
course. Students attending this course get in-depth, systematic research 
training in translation and interpreting, and customised preparation for a PhD 
and an academic career. This unique and innovative course is the first of its 
kind in the UK and draws on the research areas CTS is well known for: 
translation and interpreting technologies, translation process research, 
translation as intercultural mediation, corpus-based translation, audiovisual 
translation and multimodality studies. CTS has more recently embarked on 
exciting, fast-developing areas, including machine translation, Natural 
Language Processing for translation/interpreting and hybrid workflows in 
translation/interpreting. The research we carry out at CTS is in touch with 
recent technological and social developments, as we maintain a strong focus on 
the responsible integration of technologies in workflows where multilingual and 
multimodal mediation is key. 

By studying with us, you'll join our internationally recognised Centre for 
Translation Studies, thus benefiting from a combination of leading research 
expertise and professional relevance and honing skills you will need in order 
to thrive in academia or in the industry. As an MRes student, you will take two 
compulsory taught modules and select two optional modules (60 credits). You 
will then complete your degree with an MRes in Translation and Interpreting 
Studies Dissertation (120 credits). The dissertation, which is longer than a 
typical MA dissertation, will enable you to research a topic in greater depth 
than is the case in a conventional MA project format. This year, we invite in 
particular students interested in pursuing dissertation topics related to 
machine translation, corpora in translation and interpreting, and the use of 
NLP for translation and interpreting.  

For further inspiration, take a look at what our current students say about the 
course and their MA projects:  
https://www.surrey.ac.uk/student-life/what-our-students-say/zeynep-polat-posoflu
  
And for more details about the programme or how to apply visit: 
https://www.surrey.ac.uk/postgraduate/translation-and-interpreting-studies-mres 
 

If you feel that an MRes is not for you, you can check our other postgraduate 
courses on topics related to translation and interpreting at: 
https://www.surrey.ac.uk/centre-translation-studies/study/postgraduate-courses  

---
Prof Constantin Orăsan  
Professor of Language and Translation Technologies

Centre for Translation Studies | School of Literature and Languages
Personal page: https://www.surrey.ac.uk/people/constantin-orasan   
Office: 06LC03, Phone: +44 (0) 1483 68 4115
Library and Learning Centre, University of Surrey, Guildford, Surrey, GU2 7XH, 
UK

