[Corpora-List] Call for participation: MultiLexNorm 2: Multilingual Lexical Normalization

Rob van der Goot via Corpora Tue, 26 Nov 2024 05:39:08 -0800

Dear all,


After a successful first edition in 2021, we are glad to invite you to the
second MultiLexNorm shared task! The shared task will be hosted at WNUT 2025.

As defined in the previous iteration, lexical normalization is:
The task of transforming an utterance into its standard form, word by word,
including both one-to-many (1-n) and many-to-one (n-1) replacements.
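
To make the word-by-word definition concrete, here is a minimal sketch in Python. It is only an illustration, not the official data format or a baseline: the toy lexicon, the merge rules, the function name normalize, and the convention of leaving an empty string for an absorbed token are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon, invented for this illustration only.
TOY_LEXICON: Dict[str, str] = {
    "u": "you",
    "gonna": "going to",  # 1-n: one raw token becomes two words
}

# Hypothetical merge rules: two adjacent raw tokens become one word (n-1).
TOY_MERGES: Dict[Tuple[str, ...], str] = {
    ("in", "side"): "inside",
}


def normalize(tokens: List[str]) -> List[str]:
    """Return exactly one output string per input token (alignment is kept)."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if len(pair) == 2 and pair in TOY_MERGES:
            out.append(TOY_MERGES[pair])  # merged word goes on the first token
            out.append("")                # assumed convention: absorbed token is empty
            i += 2
            continue
        # Unknown tokens are copied unchanged.
        out.append(TOY_LEXICON.get(tokens[i].lower(), tokens[i]))
        i += 1
    return out


raw = ["u", "gonna", "wait", "in", "side"]
norm = normalize(raw)
for r, n in zip(raw, norm):
    print(f"{r}\t{n}")
```

Because the output keeps one entry per input token, a word-level comparison against a gold annotation stays straightforward even when words are split or merged.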

Building on the previous task, which focused on Indo-European languages written
in the Latin script, we have extended the benchmark to include languages written
in other scripts. We now include data for Thai, Vietnamese, and Indonesian. The
data and more information about the task can be found at:

https://noisy-text.github.io/2025/multi-lexnorm.html#

Dates:
Data available      Nov 15, 2024
Data freeze         Jan 07, 2025
Test data           Jan 25, 2025
Final Evaluation    Feb 07, 2025
Paper deadline      Feb 25, 2025
Paper reviewed      Mar 01, 2025
Camera ready        Mar 10, 2025
Workshop            May 03, 2025 (TBD)

Best,
The organizers:
Rob van der Goot
Weerayut Buaphet
Peerat Limkonchotiwat
Thanh-Nhi Nguyen
Thanh-Phong Le