[Corpora-List] Re: [CFP] The First Workshop and Shared Task on Multilingual Counterspeech Generation at COLING-2025

Jakob, Charlott Frederike via Corpora Thu, 22 Aug 2024 00:42:23 -0700

Dear Sender,

I am currently out of the office and will not be checking emails regularly. I 
will return on September 9, and will respond to your message as soon as 
possible after that date.


Best regards,
Charlott Jakob

On 7 Aug 2024, at 02:44, mevallec--- via Corpora <[email protected]> wrote:

Background and Scope
---------------------
While interest in automatic approaches to counterspeech generation has been
steadily growing, including studies on data curation (Chung et al., 2019a;
Fanton et al., 2021), detection (Chung et al., 2021a; Mathew et al., 2018),
and generation (Tekiroglu et al., 2020; Chung et al., 2021b; Zhu and Bhat,
2021; Tekiroglu et al., 2022), the large majority of the published
experimental work on automatic counterspeech generation has been carried out
for English. This is due both to the scarcity of manually curated non-English
training data and to the overwhelming predominance of English in the
generative Large Language Model (LLM) ecosystem. This workshop on
Multilingual Counterspeech Generation is proposed to promote and encourage
research on multilingual approaches to this challenging topic.

Thus, this workshop aims to test monolingual and multilingual LLMs in 
particular and Language Technology in general to automatically generate 
counterspeech not only in English but also in languages with fewer resources. 
In this sense, an important goal of the workshop will be to understand the 
impact of using LLMs, considering for example how to deal with pressing issues 
such as biases, hallucinated content, data scarcity or data contamination.

We seek to maximize the scientific and social impact of this workshop by
promoting the creation of a community of researchers from diverse fields,
such as computer and social sciences, as well as policy makers and other
stakeholders interested in automatic counterspeech generation. By doing so we
aim to gain a deeper understanding of how individuals, activists, and
organizations currently use counterspeech to tackle abuse, and of how Natural
Language Processing (NLP) and Generation (NLG) may be best applied to
counteract that abuse.

Call for Papers
---------------------
We welcome submissions on topics including, but not limited to, the following:
- Models and methods for generating counterspeech in different languages.
- Automatic counterspeech generation for low-resource languages with scarce
training data.
- Dialogue agents that use counterspeech to combat offensive messages that are 
directed to individuals or groups, targeted based on various aspects such as 
ideology, gender, sexual orientation and religion.
- Methods for human and automatic evaluation of counterspeech.
- Multidisciplinary studies providing different perspectives on the topic such 
as computer science, social science, psychology, etc.
- Development of taxonomies and quality datasets for counterspeech in multiple 
languages.
- Potentials and limitations (e.g., fairness, biases, hallucinated content) of 
applying different NLP methods, such as LLMs, to generate counterspeech.
- Social impact and empirical studies of counterspeech in social networks, 
including research on the effectiveness and consequences for users of using 
counterspeech to combat hate online.

Submission
---------------------
We welcome two types of papers: regular workshop papers and non-archival 
submissions. Regular workshop papers will be included in the workshop 
proceedings. All submissions must be in PDF format and made through START  
[https://softconf.com/coling2025/MCG25/]
- Regular workshop papers: Authors can submit papers of up to 8 pages, with
unlimited pages for references. Authors may separately submit up to 100 MB of
supplementary materials, as well as their code for reproducibility. All
submissions undergo a double-blind, single-track review. Accepted papers will
be presented as posters, with the possibility of oral presentations.
- Non-archival submissions: Cross-submissions are welcome. Accepted papers will 
be presented at the workshop, but will not be included in the workshop 
proceedings. Papers must be in PDF format and will be reviewed in a 
double-blind fashion by workshop reviewers. We also welcome extended abstracts 
(up to 2 pages) of papers that are work in progress, under review or to be 
submitted to other venues. Papers in this category need to follow the COLING 
format.

Important Dates
---------------------
- Submission: November 20th, 2024
- Notification of Acceptance: December 2nd, 2024
- Camera-Ready Papers Due: December 10th, 2024

-----------------------------------------------------
Shared Task on Multilingual Counterspeech Generation
-----------------------------------------------------

In addition to paper contributions, we are organizing a shared task on
multilingual counterspeech generation with the aim of bringing current
efforts together in a central space, especially those for languages other
than English.
It is envisaged that the shared task will allow the community to study how we
can improve counterspeech generation for lower-resource languages and also
reinforce the strong body of research already existing for English.
The counterspeech generated by participants should be respectful,
non-offensive, and contain information that is specific and truthful with
respect to the following targets: Jews, LGBT+, immigrants, people of color,
women.

Data
---------------------
We release new data consisting of 597 Hate Speech-Counter Narrative (HS-CN) 
pairs. In this dataset, the HS are taken from MTCONAN 
[https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN], while 
the CN are newly generated. Together with each HS-CN pair, we also provide 5 
background knowledge sentences, some of which are relevant for obtaining the 
Counter Narratives. The dataset is available in 4 languages (Basque,
English, Italian and Spanish) and divided into the following splits:
- Development: 100 pairs. [AVAILABLE NOW!] 
[https://huggingface.co/datasets/LanD-FBK/ML_MTCONAN_KN]
- Train: 397 pairs [AVAILABLE in OCTOBER!]
- Test: 100 pairs [TBA]

In order to score the shared task participants, the CNs will be kept hidden 
during the shared task while the HS and the background knowledge will be 
released for participants to prepare their submissions.

The languages (Basque, English, Italian and Spanish) offer a varied spectrum
of complexity, including an agglutinative language isolate (Basque), two
Romance languages (Italian, Spanish) and a Germanic one (English). The choice
of languages reflects the linguistic expertise available among the
organizers, which is required to run the shared task successfully.

Additional Multilingual Data
---------------------
The manually curated CONAN data for English, Italian, Basque and Spanish are
also available to participants:
- CONAN (English and Italian) 
[https://github.com/marcoguerini/CONAN/tree/master/CONAN]
- CONAN (Basque and Spanish) [https://huggingface.co/datasets/HiTZ/CONAN-EUS]
- CONAN-MT-SP (Spanish) [https://github.com/sinai-uja/CONAN-MT-SP] or 
[https://huggingface.co/datasets/SINAI/CONAN-MT-SP]
- Multitarget-CONAN (English) 
[https://github.com/marcoguerini/CONAN/tree/master/Multitarget-CONAN]

Participation
---------------------
The aim of the shared task is, given an HS (and optionally any additional
knowledge the participants may like to use), to generate a CN that
counteracts the HS.
An example CN in each language appears below:
- English:
  + HS: We should ban all homosexuals.
  + CN: When will the love prosper and the hatred start to dissipate? I will 
not only respect my fellow LGBT+ people, I will promote their rights.
- Italian:
  + HS: Dovremmo espellere tutti gli omosessuali.
  + CN: Quand'è che l'amore prospererà e l'odio inizierà a dissiparsi? Non solo 
rispetterò i miei compagni LGBT+, ma promuoverò i loro diritti.
- Spanish:
  + HS: Deberíamos prohibir a todos los homosexuales.
  + CN: ¿Cuándo prosperará el amor y empezará a disiparse el odio? No sólo 
respetaré a mis compañeros LGBT+, sino que promoveré sus derechos.
- Basque:
  + HS: Homosexual guztiak debekatu beharko genituzke.
  + CN: Noiz hasiko da maitasuna irabazten eta gorrotoa desagertzen? LGBT+ 
pertsonak errespetatzeaz gain, haien eskubideak sustatuko ditut.

Participants will download the test HS for the 4 languages and generate at
most three different CNs per HS for each language. The test window will last
5 days.
Participants are allowed to use any resource (language model, data, etc.) to
generate the CN.
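
As a rough illustration of the limit of three different CNs per HS, the
following Python sketch checks a set of generated CNs before submission. The
nested dictionary layout, the language codes and the name check_submission
are assumptions made for this example only; they are not the official
submission format, which is defined by the organizers.

```python
from typing import Dict, List

MAX_CNS_PER_HS = 3  # "at most three different CNs per HS for each language"


def check_submission(submission: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    `submission` maps language code -> HS identifier -> list of generated CNs.
    This layout is hypothetical and used only for illustration.
    """
    problems = []
    for lang, by_hs in submission.items():
        for hs_id, cns in by_hs.items():
            # More CNs than the stated limit for a single HS.
            if len(cns) > MAX_CNS_PER_HS:
                problems.append(
                    f"{lang}/{hs_id}: {len(cns)} CNs, limit is {MAX_CNS_PER_HS}"
                )
            # The call asks for *different* CNs, so flag exact duplicates.
            if len(set(cns)) != len(cns):
                problems.append(f"{lang}/{hs_id}: duplicate CNs")
            # Empty or whitespace-only strings are not usable CNs.
            if any(not cn.strip() for cn in cns):
                problems.append(f"{lang}/{hs_id}: empty CN")
    return problems
```

Calling check_submission on the generated CNs and fixing whatever it reports
before the test window closes is one way to avoid trivially invalid entries.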

Evaluation
---------------------
The CNs submitted by the participants will be evaluated:
- Using traditional automatic metrics as in Tekiroğlu et al. (2022), which
include BLEU, ROUGE, Novelty and Repetition Rate.
- Using LLM as a Judge following the approach described in this paper:  
https://arxiv.org/abs/2406.15227
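
To give an intuition for two of the automatic metrics named above, here is a
simplified Python sketch of Novelty and Repetition Rate. It is not the
official scoring code: whitespace tokenization, lowercasing, Jaccard
similarity over token sets for Novelty, and a Repetition Rate computed over
the whole output rather than a sliding window are all assumptions of this
sketch, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(a: List[str], b: List[str]) -> float:
    """Jaccard similarity between the token sets of two texts."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def novelty(generated: List[str], training: List[str]) -> float:
    """Mean over generated texts of 1 - max Jaccard similarity to training texts."""
    train_tokens = [t.lower().split() for t in training]
    scores = []
    for g in generated:
        g_tokens = g.lower().split()
        best = max((jaccard(g_tokens, t) for t in train_tokens), default=0.0)
        scores.append(1.0 - best)
    return sum(scores) / len(scores) if scores else 0.0


def repetition_rate(generated: List[str], max_n: int = 4) -> float:
    """Geometric mean, for n = 1..max_n, of the share of n-gram types seen more than once."""
    tokens = [tok for g in generated for tok in g.lower().split()]
    rates = []
    for n in range(1, max_n + 1):
        counts = Counter(ngrams(tokens, n))
        if not counts:
            return 0.0  # text too short to contain any n-gram of this order
        repeated = sum(1 for c in counts.values() if c > 1)
        rates.append(repeated / len(counts))
    if any(r == 0 for r in rates):
        return 0.0  # geometric mean is zero if any order has no repeats
    return math.exp(sum(math.log(r) for r in rates) / len(rates))
```

Under these assumptions, higher Novelty means the generated CNs overlap less
with the training CNs, and a lower Repetition Rate means the system reuses
fewer n-grams across its outputs.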

Important Dates
---------------------
- Test dataset release: October 21st, 2024
- Results submission: October 25th, 2024
- Results notification: November 10th, 2024
- Working papers submission: November 20th, 2024
- Notification of Acceptance: December 3rd, 2024
- Camera-Ready Papers Due: December 10th, 2024
- Workshop: January 19th, 2025


For more information you can join the Google group
[https://groups.google.com/g/multilingual-cs-generation-coling2025] or visit
our website [https://sites.google.com/view/multilang-counterspeech-gen/home]

Best regards,
The Multilingual Counterspeech Generation Workshop Organizers.
_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]
