Dear Colleagues,

We are pleased to inform you that we will be hosting the "Shared Task:
Low-Resource Indic Language Translation" again this year as part of WMT
2024. Following the outstanding success and enthusiastic participation
witnessed in the previous year's edition, we are excited to continue this
important initiative. Despite recent advancements in machine translation
(MT), such as multilingual translation and transfer learning techniques,
the scarcity of parallel data remains a significant challenge, particularly
for low-resource languages.

The WMT 2024 Indic Machine Translation Shared Task aims to address this
challenge by focusing on low-resource Indic languages from diverse language
families. This year the task covers eight languages: Assamese, Mizo, Khasi,
Manipuri, Nyishi, Bodo, Mising, and Kokborok.

The task for this year features two categories based on the availability of
training data:

*Category 1: Moderate Training Data Available*

   - en-as: English ⇔ Assamese
   - en-lus: English ⇔ Mizo
   - en-kha: English ⇔ Khasi
   - en-mni: English ⇔ Manipuri
   - en-nshi: English ⇔ Nyishi



*Category 2: Very Limited Training Data*

   - en-bodo: English ⇔ Bodo
   - en-mrp: English ⇔ Mising
   - en-trp: English ⇔ Kokborok



Participants are encouraged to develop MT systems that can produce
high-quality translations despite limited data availability. Key areas for
exploration include leveraging monolingual data, investigating multilingual
approaches, and exploring transfer learning techniques.

*Important Dates:*

   - Release of training/dev data: 25 May 2024
   - Test data release: 13 July 2024
   - Run submission deadline: 28 July 2024
   - System description/workshop paper submission deadline: TBA, 2024
   (follow EMNLP/WMT page)
   - Notification of acceptance: TBA, 2024 (follow EMNLP/WMT page)
   - Camera-ready: TBA, 2024 (follow EMNLP/WMT page)
   - Workshop dates: follow EMNLP/WMT main page



The organizing committee comprises experts from various institutions
dedicated to advancing MT research in low-resource language settings.

*Organizers:*

   - Santanu Pal, Wipro AI Lab, London, UK
   - Partha Pakray, National Institute of Technology, Silchar, India
   - Sandeep Kumar Dash, National Institute of Technology, Mizoram, India
   - Lenin Laitonjam, National Institute of Technology, Mizoram, India
   - Pankaj Kundan Dadure, University of Petroleum and Energy Studies,
   Dehradun, India
   - Arnab Maji, North-Eastern Hill University, India
   - Lyngdoh Sarah, North-Eastern Hill University, India
   - Anupam Jamatia, National Institute of Technology Agartala, India
   - Koj Sambyo, National Institute of Technology Arunachal Pradesh, India



For inquiries and further information, please contact us at
[email protected]. More details and updates on the task are available
on the task page:
https://www2.statmt.org/wmt24/indic-mt-task.html



To register for the event, please fill out the registration form:
https://docs.google.com/forms/d/e/1FAIpQLSd8LwriqdLLhVNAvUWEcGRJmKuBFQZ9BR_TKpb6VYZEnyGU0g/viewform?pli=1



We look forward to your participation and contributions to advancing
low-resource Indic language translation.



Best regards,

Team Low-Resource Indic Language Translation

WMT 2024