Call for Participation

Sentiment Across Multi-Dialectal Arabic: A Benchmark for Sentiment Analysis in 
the Hospitality Domain

We invite researchers, practitioners, and NLP enthusiasts to participate in the 
Sentiment Across Multi-Dialectal Arabic shared task, a challenge aimed at 
advancing sentiment analysis for Arabic dialects in the hospitality sector.


About the Task:

Arabic is one of the world’s most widely spoken languages, characterised by rich 
dialectal variation across regions. These dialects differ significantly in 
syntax, vocabulary, and sentiment expression, making sentiment analysis a 
challenging NLP task. This shared task focuses on multi-dialectal sentiment 
detection in hotel reviews: participants will classify sentiment as positive, 
neutral, or negative across multiple Arabic dialects, including Saudi, 
Moroccan, and Egyptian Arabic.

This shared task provides a high-quality multi-dialect parallel dataset, 
enabling participants to explore:
     1. Dialect-Specific Sentiment Detection – Understanding how sentiment 
varies across dialects.
     2. Cross-Linguistic Sentiment Analysis – Investigating sentiment 
preservation across dialects.
     3. Benchmarking on Multi-Dialect Data – Evaluating models on a 
standardised Arabic dialect dataset.

Dataset Overview:
     - Hotel reviews across multiple Arabic dialects.
     - Balanced sentiment distribution (positive, neutral, negative).
     - Multi-Dialect Parallel Dataset – Each review is available in multiple 
dialects, allowing for cross-linguistic comparison.

Evaluation Metrics:
     - Primary Metric: F1-Score.
     - Additional Analysis: Comparison of sentiment accuracy across dialects.
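To make the evaluation concrete, the sketch below illustrates how submissions might be scored. This is a minimal illustration, not the official scoring script; in particular, the macro averaging mode and the dialect tags are assumptions, since the call does not specify how the F1-Score is averaged or how dialects are labelled.

```python
from collections import defaultdict

LABELS = ("positive", "neutral", "negative")

def f1_per_class(gold, pred, label):
    """F1 for a single sentiment label, from precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the three sentiment labels."""
    return sum(f1_per_class(gold, pred, lbl) for lbl in LABELS) / len(LABELS)

def accuracy_by_dialect(gold, pred, dialects):
    """Accuracy computed separately per dialect, for the additional analysis."""
    hits, totals = defaultdict(int), defaultdict(int)
    for g, p, d in zip(gold, pred, dialects):
        totals[d] += 1
        hits[d] += int(g == p)
    return {d: hits[d] / totals[d] for d in totals}
```

For example, with gold labels `["positive", "negative", "neutral"]`, predictions `["positive", "negative", "negative"]`, and dialect tags `["Saudi", "Moroccan", "Egyptian"]`, `macro_f1` averages the per-class scores and `accuracy_by_dialect` reports one accuracy per dialect.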

Baseline System:
      - Pre-trained BERT-based model (AraBERT) fine-tuned on Modern Standard 
Arabic (MSA) and Arabic dialect data.
      - Participants are encouraged to improve upon the baseline with their own 
techniques, including the use of LLMs.

Why Participate?
      - Contribute to Arabic NLP Research – Help advance sentiment analysis for 
Arabic dialects.
      - Gain Access to a High-Quality Dataset – A unique multi-dialect 
benchmark for future research.
      - Collaborate with the NLP Community – Engage with leading researchers 
and practitioners.
      - Showcase Your Work – High-performing models may be featured in a 
post-task publication.

Timeline
      - Training data ready – April 15, 2025
      - Test evaluation starts – April 27, 2025
      - Test evaluation ends – May 10, 2025
      - Paper submission due – May 16, 2025
      - Notification to authors – May 31, 2025
      - Shared task presentation co-located with RANLP 2025 – September 11–12, 
2025
 
How to Participate?
      - Register for the task via https://ahasis-42267.web.app/
      - Download the dataset and baseline system.
      - Develop and test your sentiment analysis model.
      - Submit your results for evaluation.

Organising Team
      1. Maram Alharbi, Lancaster University, UK
      2. Salmane Chafik, Mohammed VI Polytechnic University, Morocco
      3. Professor Ruslan Mitkov, Lancaster University, UK
      4. Dr. Saad Ezzini, King Fahd University of Petroleum and Minerals, Saudi 
Arabia
      5. Dr. Tharindu Ranasinghe, Lancaster University, UK
      6. Dr. Hansi Hettiarachchi, Lancaster University, UK 

For inquiries, please contact us at [email protected]