Workshop on "Gender Bias in Natural Language Processing", August 16, Thailand 
ACL

First Call for Papers and announcement of the shared task.

http://genderbiasnlp.talp.cat

Gender bias, among other demographic biases (e.g. race, nationality, religion), 
in machine-learned models is of increasing interest to the scientific community 
and industry. Models of natural language are highly affected by such biases, 
which are present in widely used products and can lead to poor user 
experiences. There is a growing body of research into improved representations 
of gender in NLP models. Key example approaches are to build and use balanced 
training and evaluation datasets (e.g. Webster et al., 2018), and to change the 
learning algorithms themselves (e.g. Bolukbasi et al., 2016). While these 
approaches show promising results, there is more to do to solve identified and 
future bias issues. In order to make progress as a field, we need to create 
widespread awareness of bias and a consensus on how to work against it, for 
instance by developing standard tasks and metrics. Our workshop provides a 
forum to achieve this goal.
Topics of interest
We invite submissions of technical work exploring the detection, measurement, 
and mitigation of gender bias in NLP models and applications. Other important 
topics are the creation of datasets, identifying and assessing relevant biases 
or focusing on fairness in NLP systems. Finally, the workshop is also open to 
non-technical work addressing sociological perspectives, and we strongly 
encourage critical reflections on the sources and implications of bias 
throughout all types of work.
In addition, this year we are organising a Shared Task on Machine Translation 
Gender Bias Evaluation (see details below).

Paper Submission Information
Submissions will be accepted as short papers (4-6 pages) and as long papers 
(8-10 pages), plus additional pages for references, following the ACL 2024 
guidelines. Supplementary material can be added, but should not be central to 
the argument of the paper. Blind submission is required.
Each paper should include a statement which explicitly defines (a) what system 
behaviors are considered as bias in the work and (b) why those behaviors are 
harmful, in what ways, and to whom (cf. Blodgett et al. (2020)). More 
information on this requirement, which was successfully introduced at GeBNLP 
2020, can be found on the workshop website. We also encourage authors to engage 
with definitions of bias and other relevant concepts such as prejudice, harm, 
and discrimination from outside NLP, especially from social sciences and normative 
ethics, in this statement and in their work in general.

Non-archival option
The authors have the option of submitting research as non-archival, meaning 
that the paper will not be published in the conference proceedings. We expect 
these submissions to present work of the same quality, and to follow the same 
format, as archival submissions.
Important Dates.
Jan 15, 2024: First call for papers
Feb 20, 2024: Second call for papers
May 10, 2024: Workshop Paper Due Date
June 5, 2024: Notification of Acceptance
June 25, 2024: Camera-ready papers due
August 16, 2024: Workshop date
Keynote Speakers.
Isabelle Augenstein, University of Copenhagen
Hal Daumé III, University of Maryland and Microsoft Research NYC

Organizers.
Christine Basta, Alexandria University
Marta R. Costa-jussà, FAIR, Meta
Agnieszka Falénska, University of Stuttgart
Seraphina Goldfarb-Tarrant, Cohere
Debora Nozza, Bocconi University





Shared Task on Machine Translation Gender Bias Evaluation

Motivation 
Demographic biases are relatively infrequent phenomena, but they present a very 
important problem. The development of datasets in this area has raised interest 
in evaluating Natural Language Processing (NLP) models beyond standard 
quality terms. In Machine Translation (MT), gender bias is observed when 
translations show errors in linguistic gender determination despite the fact 
that there are sufficient gender clues in the source content for a system to 
infer the correct gendered forms. To illustrate this phenomenon, sentence (1) 
below does not contain enough linguistic clues for a translation system to 
decide which gendered form should be used when translating into a language 
where the word for doctor is gendered. Sentence (2), however, includes a 
gendered pronoun which most likely has the word doctor as its antecedent. 
Sentence (3) shows two variants of the same sentence that differ only in gender 
inflection. 

1. I didn’t feel well, so I made an appointment with my doctor. 
2. My doctor is very attentive to her patients’ needs. 
3. Mi amiga es una ama de casa / Mi amigo es un amo de casa. (in English, My 
(female/male) friend is a homemaker)

Gender bias is observed when the system produces the wrong gendered form when 
translating sentence (2) into a language that uses distinct gendered forms for 
the word doctor. A single error in the translation of an utterance like 
sentence (1) would not be sufficient to conclude that gender bias exists in the 
model; doing so would require consistently observing one linguistic gender over 
the other. Finally, a lack of robustness is shown if translation quality 
differs between the two variants in sentence (3). It has 
previously been hypothesized that one possible source of gender bias is gender 
representation imbalance in large training and evaluation data sets, e.g. 
[Costa-jussà et al., 2022; Qian et al., 2022].
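
To illustrate the last point, here is a minimal Python sketch of how one might 
check whether a system consistently prefers one linguistic gender on ambiguous 
inputs like sentence (1). The counts, the 50/50 null hypothesis, and the use of 
a binomial test are assumptions made for this example, not part of the task 
definition:

    # Sketch: is one grammatical gender chosen systematically on ambiguous inputs?
    # Requires: pip install scipy
    from scipy.stats import binomtest

    def skew_report(masculine_choices: int, total_ambiguous: int) -> str:
        """Summarise how far the system's choices deviate from a 50/50 split."""
        rate = masculine_choices / total_ambiguous
        # Two-sided test of the null hypothesis that both genders are equally likely.
        p_value = binomtest(masculine_choices, total_ambiguous, 0.5).pvalue
        return f"masculine rate = {rate:.2%}, p-value vs. 50/50 = {p_value:.3g}"

    if __name__ == "__main__":
        print(skew_report(masculine_choices=90, total_ambiguous=100))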

Goals

The goals of the shared translation task are:
- To investigate the quality of MT systems on the particular case of gender 
  preservation for tens of languages
- To examine and understand special gender challenges in translating across 
  different language families
- To investigate the performance of gender translation for low-resource, 
  morphologically rich languages
- To open to the community the first challenge of this kind
- To generate up-to-date performance numbers in order to provide a basis of 
  comparison in future research
- To investigate the usefulness of multilingual and language resources
- To encourage beginners and established research groups to participate and 
  exchange ideas

Shared Task Description

We propose to evaluate three cases of gender bias: gender-specific, gender 
robustness, and unambiguous gender.
 
Description Task 1: Gender-specific

In the English-to-X translation direction, we evaluate the capacity of machine 
translation systems to generate gender-specific translations from 
gender-neutral English inputs (e.g. I didn’t feel well, so I made an 
appointment with my doctor.). This can be illustrated by the fact that MT 
models systematically translate neutral source sentences into masculine or 
feminine forms depending on the stereotypical usage of the word (e.g. 
“homemakers” into “amas de casa”, which is the feminine form in Spanish, and 
“doctors” into “médicos”, which is the masculine form in Spanish). 
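
For illustration only, a minimal Python sketch of this kind of analysis is 
shown below: it counts which gendered form a system chose for gender-neutral 
English inputs. The mini-lexicon, the matching rule, and the sample outputs are 
hypothetical and are not the official evaluation code:

    # Sketch: count masculine vs. feminine target forms chosen for neutral inputs.
    import re
    from collections import Counter

    # Hypothetical mini-lexicon mapping an English lemma to Spanish gendered forms.
    GENDERED_FORMS = {
        "doctor": {"masculine": ["médico", "médicos"],
                   "feminine": ["médica", "médicas"]},
        "homemaker": {"masculine": ["amo de casa", "amos de casa"],
                      "feminine": ["ama de casa", "amas de casa"]},
    }

    def observed_gender(translation: str, lemma: str) -> str | None:
        """Return 'masculine' or 'feminine' if a known gendered form appears."""
        text = translation.lower()
        for gender, forms in GENDERED_FORMS[lemma].items():
            if any(re.search(rf"\b{re.escape(form)}\b", text) for form in forms):
                return gender
        return None

    def gender_choice_rates(outputs: list[tuple[str, str]]) -> Counter:
        """outputs: pairs of (English lemma, system translation)."""
        counts = Counter()
        for lemma, translation in outputs:
            gender = observed_gender(translation, lemma)
            if gender is not None:
                counts[gender] += 1
        return counts

    if __name__ == "__main__":
        sample = [("doctor", "Hice una cita con mi médico."),
                  ("homemaker", "Mi amiga es una ama de casa.")]
        print(gender_choice_rates(sample))  # Counter({'masculine': 1, 'feminine': 1})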

Description Task 2: Gender Robustness

In the X-to-English translation direction, we compare the robustness of the 
model when the source input only differs in gender (masculine or feminine), 
e.g. in Spanish: Mi amiga es una ama de casa / Mi amigo es un amo de casa.
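
As an illustration, a minimal Python sketch of a Task 2-style robustness check 
follows: it compares per-pair chrF (via sacrebleu) between translations of the 
masculine and feminine source variants. The toy data and variable names are 
assumptions; the official metrics and scripts will be announced by the 
organisers:

    # Sketch: per-pair chrF gap between translations of gender-paired sources.
    # Requires: pip install sacrebleu
    import sacrebleu

    def robustness_gap(hyps_masc, refs_masc, hyps_fem, refs_fem):
        """Average chrF difference (masculine minus feminine) over aligned pairs."""
        gaps = []
        for hm, rm, hf, rf in zip(hyps_masc, refs_masc, hyps_fem, refs_fem):
            score_m = sacrebleu.sentence_chrf(hm, [rm]).score
            score_f = sacrebleu.sentence_chrf(hf, [rf]).score
            gaps.append(score_m - score_f)
        return sum(gaps) / len(gaps)

    if __name__ == "__main__":
        # Toy example built from sentence (3), translated into English.
        hyps_m = ["My friend is a homemaker."]
        refs_m = ["My friend is a homemaker."]
        hyps_f = ["My friend is a homemaker."]
        refs_f = ["My friend is a homemaker."]
        print(f"Mean chrF gap (M-F): {robustness_gap(hyps_m, refs_m, hyps_f, refs_f):.2f}")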

Description Task 3: Unambiguous Gender

In the X-to-X translation direction, we evaluate unambiguous gender translation 
across languages, without being English-centric, e.g. Spanish-to-Catalan: Mi 
amiga es una ama de casa is translated into La meva amiga és una mestressa de 
casa.

Submission details

X Languages. In addition to English, our challenge covers 26 languages: Modern 
Standard Arabic, Belarusian, Bulgarian, Catalan, Czech, Danish, German, French, 
Italian, Lithuanian, Standard Latvian, Marathi, Dutch, Portuguese, Romanian, 
Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Thai, Ukrainian, Urdu

Evaluation. The challenge will be evaluated using automatic metrics. Evaluation 
criteria will be in terms of overall translation quality and difference in 
performance for male and female sets. More details will be provided.
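
To make the intended comparison concrete, a minimal Python sketch using 
sacrebleu is given below; the metric choice and variable names are illustrative 
assumptions and not the official evaluation scripts. It computes corpus-level 
quality on the masculine and feminine sets separately and reports the gap:

    # Sketch: corpus-level BLEU/chrF on the masculine and feminine sets, plus gaps.
    # Requires: pip install sacrebleu
    import sacrebleu

    def set_scores(hypotheses, references):
        """Corpus BLEU and chrF for one gendered set (single reference per segment)."""
        bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score
        chrf = sacrebleu.corpus_chrf(hypotheses, [references]).score
        return bleu, chrf

    def gender_gaps(hyps_masc, refs_masc, hyps_fem, refs_fem):
        """Difference in corpus scores between the masculine and feminine sets."""
        bleu_m, chrf_m = set_scores(hyps_masc, refs_masc)
        bleu_f, chrf_f = set_scores(hyps_fem, refs_fem)
        return {"BLEU gap (M-F)": bleu_m - bleu_f, "chrF gap (M-F)": chrf_m - chrf_f}
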
Submission platform. We will use the Dynabench platform 
<https://dynabench.org/tasks/multilingual-holistic-bias> for all tasks.

Important Dates.        
From Jan 2024: Fill in the interest form 
<https://docs.google.com/forms/d/e/1FAIpQLSdQQ4UynaoT70djAaGTUpLlIJyls3te2yfY1llRSI6v8t2lUg/viewform>
Mar 20, 2024: Model Submission
April 1-15, 2024: Evaluation
April 24, 2024: System paper submission deadline
May 15, 2024: Notification of acceptance
June 10, 2024: Camera-Ready version
August 16, 2024: Workshop at ACL        

Citation
Marta Costa-jussà, Pierre Andrews, Eric Smith, Prangthip Hansanti, Christophe 
Ropers, Elahe Kalbassi, Cynthia Gao, Daniel Licht, and Carleigh Wood. 2023. 
Multilingual Holistic Bias: Extending Descriptors and Patterns to Unveil 
Demographic Biases in Languages at Scale. In Proceedings of the 2023 Conference 
on Empirical Methods in Natural Language Processing, pages 14141–14156, 
Singapore. Association for Computational Linguistics.