[Corpora-List] Re: Corpora Digest, Vol 789, Issue 1

frcchang--- via Corpora Tue, 23 Apr 2024 04:35:57 -0700

help
---- Replied Message ----
From: [email protected]
Date: 04/22/2024 20:00
To: [email protected]
Cc:
Subject: Corpora Digest, Vol 789, Issue 1
Send Corpora mailing list submissions to
[email protected]
To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Corpora digest..."
Today's Topics:
1. WMT 2024: Low-Resource Indic Language Translation. (Santanu Pal)
2. Final CFP: SIGIR eCom'24: May 3rd (Tracy Holloway King)
3. [2nd CFP] Special issue on Abusive Language Detection of the journal 
Traitement Automatique des Langues (TAL)
(Farah Benamara)
4. [Call for Participation]: GermEval2024 Shared Task GerMS-Detect - Sexism 
Detection in German Online News Fora @Konvens 2024
([email protected])
----------------------------------------------------------------------
Message: 1
Date: Sun, 21 Apr 2024 13:02:42 +0100
From: Santanu Pal <[email protected]>
Subject: [Corpora-List] WMT 2024: Low-Resource Indic Language
Translation.
To: [email protected]
Message-ID:
<CALdLWwZZ4EJ6Vk5r9xS1b90vGBgtWpfq_PwGJSF=f+uq6-z...@mail.gmail.com>
Content-Type: multipart/alternative;
boundary="00000000000043488c06169a1b64"
Dear Colleagues,
We are pleased to inform you that we will be hosting the "Shared Task:
Low-Resource Indic Language Translation" again this year as part of WMT
2024. Following the outstanding success and enthusiastic participation
witnessed in the previous year's edition, we are excited to continue this
important initiative. Despite recent advancements in machine translation
(MT), such as multilingual translation and transfer learning techniques,
the scarcity of parallel data remains a significant challenge, particularly
for low-resource languages.
The WMT 2024 Indic Machine Translation Shared Task aims to address this
challenge by focusing on low-resource Indic languages from diverse language
families. Specifically, we are targeting languages such as Assamese, Mizo,
Khasi, Manipuri, Nyishi, Bodo, Mising, and Kokborok.
For inquiries and further information, please contact us at
[email protected]. More details and updates on the task are available at
https://www2.statmt.org/wmt24/indic-mt-task.html.
We highly encourage participants to register in advance so that we can
periodically send updates on data release dates and other relevant
information.
To register for the event, please fill out the registration form:
https://docs.google.com/forms/d/e/1FAIpQLSd8LwriqdLLhVNAvUWEcGRJmKuBFQZ9BR_TKpb6VYZEnyGU0g/viewform?pli=1
We look forward to your participation and contributions to advancing
low-resource Indic language translation.
With best regards,
Santanu
------------------------------
Message: 2
Date: Sun, 21 Apr 2024 14:38:11 -0700
From: Tracy Holloway King <[email protected]>
Subject: [Corpora-List] Final CFP: SIGIR eCom'24: May 3rd
To: [email protected]
Message-ID:
<cakca-dzek8y5hnwg2nospruat27tym9baomjr-yj7wvs3xg...@mail.gmail.com>
Content-Type: multipart/alternative;
boundary="000000000000d0b3b80616a22378"
Final Call For Papers - SIGIR eCom'24 - https://sigir-ecom.github.io/
The SIGIR Workshop on eCommerce will serve as a platform for publication
and discussion of Information Retrieval, NLP and Vision research relative
to their applications in the domain of eCommerce. This workshop will bring
together practitioners and researchers from academia and industry to
discuss the challenges and approaches to product search and recommendation
in eCommerce. The deadline for paper submission is May 3rd, 2024 (11:59
P.M. AoE).
The special theme of this year's workshop is eCommerce Search in the Age of
Generative AI and LLMs.
The workshop will also include a data challenge. This year we will
collaborate with TREC on a product search data challenge (
https://trec-product-search.github.io/index.html). The overarching goal is
to study how end-to-end retrieval systems can be built and evaluated given
a large set of products. The data challenge provides a corpus of products
and a set of user intents (queries): the goal is to find the product that
suits the user’s needs.
SIGIR eCom is a full-day workshop taking place on Thursday, July 18, 2024
in conjunction with SIGIR 2024. SIGIR eCom'24 will be an in-person workshop.
________________
Important Dates:
Paper submission deadline - May 3rd, 2024 (11:59 P.M. AoE)
Notification of acceptance - May 23, 2024
Camera Ready Version of Papers Due - June 24, 2024
SIGIR eCom Full day Workshop - July 18, 2024
We invite quality research contributions, position and opinion papers
addressing relevant challenges in the domain of eCommerce. We invite
submission of both papers and posters. All submitted papers and posters
will be single-blind and will be peer reviewed by an international program
committee of researchers of high repute. Accepted submissions will be
presented at the workshop.
Topics:
Topics of interest include, but are not limited to:
* eCommerce search in the age of Generative AI and LLMs (2024 special theme)
- Ranking and Whole Page Relevance
- Optimization for IR and business metrics
- Diversity in product search and recommendations
- Relevance models for multi-faceted entities
- Relevance vs. revenue
- Deterministic sorts (e.g. price low to high)
- Temporal dynamics and seasonality
* Query and Document Understanding
- Query intent, query suggestions, and auto-completion
- Strategies for resolving low or zero recall queries
- Converting across modalities (e.g., text, structured data, images)
- Categorization and facets
- Reviews and sentiment analysis
* Recommendation and Personalization
- Personalization & contextualization, including the use of personal
facets such as age, gender, location
- Privacy, bias and ethics in eCommerce IR
- Blending recommendations and search results
* Representations and Data
- Semantic representation of products, queries, and customers
- Construction and use of knowledge graphs for eCommerce
* IR Fundamentals for eCommerce
- Unified and universal search and recommendations
- Cross-lingual search and machine translation
- Indexing and search in rapidly changing environments (e.g., auction
sites)
- Experimentation techniques including AB testing and multi-armed bandits
* Visual Search in eCommerce
- Large-scale Visual Search Challenges and Solutions
- Multimodal Search and combining visual and textual information
- Combining Vision and language models
- Explainable AI for Visual Search
* Other challenges
- Trust, transparency, and fairness in eCommerce
- UX for eCommerce
- The role of search in trust and security for marketplaces
- Question answering and chatbots for eCommerce
Data/Resource Track:
In order to promote academic research in the eCommerce domain, we plan to
accept a small number of high-quality dataset contributions. These
submissions should be accompanied by a clear and detailed description of
the dataset and of some potential questions and applications that arise from it.
Preliminary empirical investigations conveying any insight about the data
will increase the quality of the submission.
Submission Instructions:
All papers will be peer reviewed (single-blind) by the program committee
and judged by their relevance to the workshop, especially to the main
themes identified above, and their potential to generate discussion.
Submissions must describe work that is not previously published, not
accepted for publication elsewhere, and not currently under review
elsewhere. All submissions must be in English. The workshop follows a
single-blind reviewing process, i.e. author names must be on the papers. We
do not accept anonymized submissions. At least one of the authors of each
accepted paper must register for the workshop and present the paper.
All submissions must be PDFs formatted according to the latest CEUR
single column format; the short (8-page) and long (15-page) limits are
extended to account for this. For instructions and LaTeX/Overleaf/docx
templates, see: https://ceur-ws.org/HOWTOSUBMIT.html#CEURART (read up to and
including the “License footnote in paper PDFs” section). Please Use
Emphasizing Capitalized Style for Paper Titles. Submit your paper PDF
through the SIGIR eCom’24 Easychair:
https://easychair.org/conferences/?conf=sigirecom24
Long paper limit: 15 pages. References are not counted in the page limit.
Short paper limit: 8 pages. References are not counted in the page limit.
The deadline for paper submission is May 3rd, 2024 (11:59 P.M. AoE)
https://sigir-ecom.github.io/
------------------------------
Message: 3
Date: Mon, 22 Apr 2024 10:40:25 +0800
From: Farah Benamara <[email protected]>
Subject: [Corpora-List] [2nd CFP] Special issue on Abusive Language
Detection of the journal Traitement Automatique des Langues (TAL)
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed
================[Apologies for any cross-posting]================
**Special issue of the journal Traitement Automatique des Langues (TAL): 
Abusive Language Detection: Linguistic Resources, Methods and 
Applications**
**Guest Editors**
Farah Benamara (IRIT-Toulouse University, IPAL Singapore), Delphine 
Battistelli (MoDyCo, Paris Nanterre University) and Viviana Patti (Turin 
University)
**Motivations**
Abusive language - or, in another very common terminology, hate speech - 
and the propagation of harmful stereotypes have unfortunately become 
commonplace occurrences on various social media platforms, partly due to 
users’ freedom and anonymity and the lack of regulation provided by 
these platforms. The sheer volume and often implicit nature of such 
unwanted content make manual moderation of these user spaces a 
formidable task. Various scientific communities interested in automating 
it, at least partially, have taken up the problem over the past ten 
years. In particular, Computational Social Science, Natural Language 
Processing and Computational Linguistics have proposed numerous works to 
create resources, datasets, and models aimed at automating the task of 
abusive language detection (henceforth ALD). In fact, we see that ALD 
has become a research theme in its own right in the field of Natural 
Language Processing with an abundant literature.
Abusive language (an umbrella term for the various forms of harmful 
language, such as toxic or offensive language, hate speech, and 
stereotypes) is topically focused and each specific manifestation of 
abusive language targets different vulnerable groups based on 
characteristics such as gender (misogyny, sexism), ethnicity, race, 
religion (xenophobia, racism, Islamophobia), sexual orientation 
(homophobia), and so on. Most automatic ALD approaches cast the problem 
as a binary classification task, but important considerations should be 
taken into account, in particular: (1) the topical focus or the 
target-oriented nature of hate speech; (2) the degree of engagement of 
users in abusive content (e.g., denunciation, approbation, reporting, 
neutral attitude); (3) the question of stereotypes and dominant 
ideologies; (4) the question of linguistic strategies particularly 
linked to, or born with, social networks (e.g., emoticons, hashtags). 
Furthermore, most of the work (resources, classifiers) is developed for 
English.
**Topics**
Motivated by the interest of the community in the problem of ALD, we 
invite papers from Natural Language Processing, Machine Learning and 
Computational Social Sciences. We explicitly encourage interdisciplinary 
submissions (resources, computational methods, and user applications at 
the interface of linguistics/psychology/socio-linguistics/sociology) but 
also position papers on the current state of the art in the field 
discussing the limitations of the current approaches and directions for 
future work. The topics covered by the special issue include, but are 
not limited to:
-- Linguistic resources and evaluation: annotation schemes, corpus 
linguistics studies, new datasets, with a particular interest in French 
language and/or multilingual resources. In the case of strictly lexical 
resources: methods for constituting them and coverage, semantic 
categories retained.
-- Formal/Conceptual approaches for ALD as inspired by models in 
sociology, socio-linguistics and psychology.
-- Models and Methods: supervised and unsupervised approaches, including 
LLMs.
-- Role of contextual phenomena, including discourses, extra-linguistic 
contexts (e.g., cultural aspects).
-- Models for cross-lingual and multimodal detection.
-- New approaches beyond binary classification: target-oriented ALD, 
degrees of user engagement, etc.
-- Dynamics of online AL in social media, propaganda propagation.
-- Bias detection and removal in resource creation, datasets and methods.
-- Application of ALD tools in education, social media content 
moderation, etc.
-- Social, legal, and ethical implications of detecting, monitoring and 
moderating AL.
**Important dates**
May 31st, 2024: Submission deadline
July 15th, 2024: Notification of acceptance after first rereading
End of September 2024: Revised version
Mid October 2024: Final decision
End of November 2024: Camera ready
January 2025: Publication of the special issue
**Submission**
Submissions can be in either French or English and should follow the 
journal templates: https://tal-65-3.sciencesconf.org/
**About the journal**
Traitement Automatique des Langues (TAL) is the international 
French journal of Natural Language Processing 
(https://www.atala.org/revuetal) published by ATALA (French Association 
for Natural Language Processing, http://www.atala.org) since 1959 with 
the support of CNRS (National Centre for Scientific Research). It is 
indexed by ACL Anthology as well as DBLP. It is also supported by the 
Institute of Human and Social Sciences of the CNRS.
**Contact**
For any questions, please contact [email protected]
**External committee**
-- Cristina Bosco, University of Turin
-- Elena Cabrio, University of Côte d'Azur
-- Tommaso Caselli, Faculty of Arts, Rijksuniversiteit Groningen
-- Valentina Dragos, ONERA
-- Karën Fort, Sorbonne University
-- Claire Hugonnier, University of Grenoble Alpes
-- Irina Illina, University of Lorraine
-- Roy Ka-Wei Lee, Singapore University of Technology and Design
-- Véronique Moriceau, IRIT, University of Toulouse
-- Frédérique Segond, INRIA Paris
-- Mariona Taulé, University of Barcelona
-- Samuel Vernet, Aix-Marseille University
-- Mathieu Valette, Paris Sorbonne Nouvelle University
-- Marcos Zampieri, George Mason University
-- 
========================
Farah Benamara Zitoune
Professor in Computer Science, Université Paul Sabatier
IRIT-CNRS
118 Route de Narbonne, 31062, Toulouse.
Tel : +33 5 61 55 77 06
http://www.irit.fr/~Farah.Benamara
==================================
------------------------------
Message: 4
Date: Mon, 22 Apr 2024 08:58:42 -0000
From: [email protected]
Subject: [Corpora-List] [Call for Participation]: GermEval2024 Shared
Task GerMS-Detect - Sexism Detection in German Online News Fora
@Konvens 2024
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"
1st CALL FOR PARTICIPATION
We are pleased to announce the GermEval Shared Task GerMS-Detect on Sexism 
Detection in German Online News Fora, co-located with Konvens 2024.
Competition Website: https://ofai.github.io/GermEval2024-GerMS/
Important Dates:
Trial phase: April 20 - April 29, 2024
Development phase: May 1 - June 5, 2024
Competition phase: June 7 - June 25, 2024
Paper submission due: July 1, 2024
Camera ready due: July 20, 2024
Shared Task @KONVENS: 10 September, 2024
Task description:
This shared task is about the detection of sexism/misogyny in comments posted 
(mostly) in German to the comment section of an Austrian online 
newspaper. The data was originally collected for the development of a 
classifier that supports human moderators in detecting potentially sexist 
comments or identifying comment fora with a high rate of sexist comments. For 
details see the Competition Website 
(https://ofai.github.io/GermEval2024-GerMS/).
Organizers:
The task is organized by the Austrian Research Institute for Artificial 
Intelligence (OFAI) (www.ofai.at).
Organizing team:
Brigitte Krenn (brigitte.krenn (AT) ofai.at)
Johann Petrak (johann.petrak (AT) ofai.at)
Stephanie Gross (stephanie.gross (AT) ofai.at)
------------------------------
Subject: Digest Footer
_______________________________________________
Corpora mailing list -- [email protected]
To unsubscribe send an email to [email protected]
------------------------------
End of Corpora Digest, Vol 789, Issue 1
***************************************

_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]
