The research unit ATILF (Computer Processing and Analysis of the French
Language) offers a postdoctoral position in natural language processing (NLP).
Topic: Discovery of multiword expressions, their meaning and their linguistic
properties in texts using large language models
Location: ATILF, Nancy, France
Starting date: from April 2024
Duration: 12 months (with a possible extension of one more year)
Supervisors: Mathieu Constant (Univ. Lorraine, France) and Agata Savary (Univ.
Paris-Saclay, France)
Salary: depends on post-PhD experience and salary grids, from 3070 euros (less
than 2 years of experience) to 4465 euros (more than 7 years of experience)
before tax
Application deadline: 22nd February 2024
Subject. The term "multiword expression" refers to a combination of multiple
lexical items that displays irregular composition, possibly at different
linguistic levels (morphology, syntax, semantics, …). Multiword expressions
cover a wide variety of phenomena, such as idioms (run around in circles),
support verb constructions (take a walk), nominal compounds (dry run), and
complex function units (in spite of). They have been the subject of extensive
research in the NLP community over the last 50 years.
The goal of this postdoctoral position is to investigate new methods for
discovering multiword expressions, their meaning and their linguistic
properties in texts, in order to enrich an induced semantic lexicon with new
multiword entries, definitions, argument structure, and other properties. The
emergence of large language models (LLMs) opens promising new perspectives for
multiword expressions, not only regarding their semantic compositionality but
also their linguistic characterization. The methods will primarily be tested
on French, but other languages may also be considered.
Context. The position is part of the SELEXINI project
(https://selexini.lis-lab.fr, 2022-2026) funded
by the French National Research Agency (ANR). The goal of the SELEXINI project
is to develop next-generation lexicon induction methods for natural language
processing. The induced lexicons will not only cluster word usages according to
their senses, but also contain multiword expressions, argument structure,
generated definitions, etc., combining the power of large pre-trained language
models and existing lexical resources to address the lack of interpretability
and diversity in current language technology. The hired researcher will be
fully integrated into the project team.
Requirements. Applicants should hold a PhD in computer science, applied
mathematics, natural language processing, or computational linguistics.
The hired postdoctoral researcher should have the following skills:
- expertise in deep learning for NLP, notably large language models
- excellent programming skills
- good linguistic skills
- team spirit
A good knowledge of French would be a plus.
Application. Applicants should submit a cover letter, a CV including their
publications, a list of references, and a transcript of their Master's grades
on the following official website:
https://emploi.cnrs.fr/Offres/CDD/UMR7118-MATCON-001/Default.aspx?lang=EN
Applications should be submitted no later than 22nd February 2024.