[Apologies for cross-postings]

********************************************************************************

Final Call for Papers

21st Workshop on Multiword Expressions (MWE 2025)

Organized, sponsored and endorsed by SIGLEX, the Special Interest Group on
the Lexicon of the ACL

Full-day workshop collocated with NAACL 2025, Albuquerque, New Mexico,
U.S.A., May 3 or 4, 2025

Hybrid (on-site & on-line)

Submission deadline: February 13, 2025

MWE 2025 website: https://multiword.org/mwe2025/

********************************************************************************

Multiword expressions (MWEs), i.e., word combinations that exhibit lexical,
syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (Baldwin
and Kim, 2010), such as “by and large”, “hot dog”, “make a decision” and
“break one's leg”, are still a pain in the neck for Natural Language
Processing (NLP). The notion encompasses closely related phenomena: idioms,
compounds, light-verb constructions, phrasal verbs, rhetorical figures,
collocations, institutionalized phrases, etc. Given their irregular nature,
MWEs often pose complex problems in linguistic modeling (e.g. annotation),
NLP tasks (e.g. parsing), and end-user applications (e.g. natural language
understanding and Machine Translation), and thus remain an open
issue for computational linguistics (Constant et al., 2017).

Since 2003, modelling and processing MWEs for NLP has been the topic of
the MWE workshop, organised by the MWE section <https://multiword.org/> of
ACL-SIGLEX <http://www.siglex.org/> in conjunction with major NLP
conferences. Impressive progress has been made in the field, but MWEs
still require much research given their importance and usefulness in NLP
applications. This is also relevant to domain-specific NLP pipelines that
need to tackle terminologies, which are most often realised as MWEs. As in
previous years, for this 21st edition of the workshop we have identified
the following topics on which contributions are particularly encouraged:

   - MWE processing to enhance end-user applications: MWEs have gained
     particular attention in end-user applications, including Machine
     Translation (MT) (Zaninello and Birch, 2020), simplification (Kochmar
     et al., 2020), language learning and assessment (Paquot et al., 2020),
     social media mining (Pelosi et al., 2017), and abusive language
     detection (Zampieri et al., 2020). We believe that it is crucial to
     extend and deepen these first attempts to integrate and evaluate MWE
     technology in these and further end-user applications.
   - MWE processing and identification in the general language, as well as
     in specialized languages and domains: Multiword terminology extraction
     from domain-specific corpora (Lossio-Ventura et al., 2014) is of
     particular importance to various applications, such as MT (Semmar and
     Laib, 2017) and the identification and monitoring of neologisms and
     technical jargon (Chatzitheodorou and Kappatos, 2021).
   - MWE processing in low-resource languages: The PARSEME shared tasks
     (2017
     <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_05_MWE_2017___lb__EACL__rb__&subpage=CONF_40_Shared_Task>,
     2018
     <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-MWE-CxG_2018___lb__COLING__rb__&subpage=CONF_40_Shared_Task>,
     2020
     <https://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-LEX_2020___lb__COLING__rb__&subpage=CONF_40_Shared_Task>),
     among others, have fostered significant progress in MWE identification,
     providing datasets that include low-resource languages, evaluation
     measures, and tools that now allow fully integrating MWE identification
     into end-user applications. There are continuous efforts in this
     direction (Diaz Hernandez, 2024), and a few of them have also explored
     methods for the automatic interpretation of MWEs (Bhatia et al., 2018)
     and their processing in low-resource languages (Eder et al., 2021).
     Resource creation and sharing should be pursued in parallel with the
     development of multilingual benchmarks for MWE identification (Savary
     et al., 2023).
   - MWE identification and interpretation in LLMs: Most current MWE
     processing is limited to identification and detection using
     pre-trained language models, but we still lack understanding of how
     MWEs are represented and dealt with therein (Garcia et al., 2021) and
     of how to better model the compositionality of MWEs from semantics
     (Phelps et al., 2024). Now that NLP has shifted towards end-to-end
     neural models like BERT, capable of solving complex tasks with little
     or no intermediary linguistic symbols, questions arise about the
     extent to which MWEs should be implicitly or explicitly modelled
     (Shwartz and Dagan, 2019).
   - New and enhanced representation of MWEs in language resources and
     computational models of compositionality as gold standards for
     formative intrinsic evaluation.


Through this workshop, we will bring together and encourage researchers in
various NLP subfields to submit their MWE-related research. We also intend
to consolidate the converging results of previous joint workshops LAW-MWE-CxG
2018 <http://multiword.sourceforge.net/lawmwecxg2018/>, MWE-WN 2019
<http://multiword.sourceforge.net/mwewn2019/> and MWE-LEX 2020
<http://multiword.sourceforge.net/mwelex2020/>, the joint MWE-WOAH panel in
2021 <https://multiword.org/mwe2021/#program>, the MWE-SIGUL 2022 joint
session <https://multiword.org/mwe2022/>, and MWE-UD 2024
<https://multiword.org/mweud2024/>, extending our scope to MWEs in
e-lexicons and WordNets, MWE annotation, and grammatical
constructions. Correspondingly, we call for papers on research related (but
not limited) to MWEs and constructions in:


   - Computationally-applicable theoretical work in psycholinguistics and
     corpus linguistics;
   - Annotation (expert, crowdsourcing, automatic) and representation in
     resources such as corpora, treebanks, e-lexicons, WordNets,
     constructions (also for low-resource languages);
   - Processing in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG,
     LFG, TAG, UD, etc.);
   - Discovery and identification methods, including for specialized
     languages and domains such as clinical or biomedical NLP;
   - Interpretation of MWEs and understanding of text containing them;
   - Language acquisition, language learning, and non-standard language
     (e.g. tweets, speech);
   - Evaluation of annotation and processing techniques;
   - Retrospective comparative analyses from the PARSEME shared tasks;
   - Processing for end-user applications (e.g. MT, NLU, summarisation,
     language learning, etc.);
   - Implicit and explicit representation in pre-trained language models
     and end-user applications;
   - Evaluation and probing of pre-trained language models;
   - Resources and tools (e.g. lexicons, identifiers) and their integration
     into end-user applications;
   - Multiword terminology extraction;
   - Adaptation and transfer of annotations and related resources to new
     languages and domains including low-resource ones.


Submission formats:

The workshop invites two types of submissions:


   - archival submissions that present substantially original research in
     either long paper format (8 pages + references) or short paper format
     (4 pages + references).
   - non-archival submissions of abstracts describing relevant research
     presented/published elsewhere, which will not be included in the MWE
     proceedings.


Paper submission and templates

Papers should be submitted via the workshop's submission page
<https://openreview.net/group?id=aclweb.org/NAACL/2025/Workshop/MWE>. Please
choose the appropriate submission format (archival/non-archival). Archival
papers with existing reviews will also be accepted through the ACL Rolling
Review. Submissions must follow the ACL stylesheet
<https://github.com/acl-org/acl-style-files>.

Important Dates

Paper Submission Deadline: February 13, 2025

Notification of acceptance: March 8, 2025

Camera-ready papers due: March 17, 2025

Workshop: May 3 or 4, 2025

All deadlines are at 23:59 UTC-12 (Anywhere on Earth).

Organizing Committee

Verginica Barbu Mititelu, Voula Giouli, Grazina Korvel, A. Seza Doğruöz,
Alexandre Rademaker, Atul Kr. Ojha, Mathieu Constant

Anti-harassment policy

The workshop follows the ACL anti-harassment policy
<https://www.aclweb.org/adminwiki/index.php?title=Anti-Harassment_Policy>.

Contact

For any inquiries regarding the workshop, please send an email to the
Organizing Committee at [email protected].