**The 6th Workshop on Open-Source Arabic Corpora and Processing Tools (Hybrid) 
with shared tasks on Arabic LLMs Hallucination and Dialect to MSA Machine 
Translation**

The workshop will be held in a *hybrid* format to maximize participation, 
accommodating attendees both online and in person.
Submission deadline: extended to *March 1*, 2024

*Workshop site* : https://osact-lrec.github.io/

*Shared tasks:*
Task 1: Arabic LLMs Hallucination (contact: Hamdy Mubarak). Link: 
https://sites.google.com/view/arabic-llms-hallucination
Task 2: Dialect to MSA Machine Translation (contact: Kareem Darwish). Link: 
https://codalab.lisn.upsaclay.fr/competitions/17118

*Co-located with LREC-COLING 2024*
https://lrec-coling-2024.org/
Turin, Italy, 20-25 May 2024

*Important Dates*
Submission deadline: extended to *March 1*, 2024
Notification of acceptance: March 25, 2024
Camera-ready papers due: March 30, 2024
Workshop date: May 25, 2024

*Workshop Description*
In the computational linguistics (CL), natural language processing (NLP), and 
information retrieval (IR) communities, Arabic is considered relatively 
resource-poor compared to English. This was thought to be the reason for the 
limited number of language-resource-based studies in Arabic. However, the past 
few years have witnessed the emergence of new, considerably large, and free 
classical and Modern Standard Arabic (MSA) corpora, as well as dialectal 
corpora and, to a lesser extent, Arabic processing tools.

This workshop follows in the footsteps of previous editions of OSACT to provide 
a forum for researchers to share and discuss their ongoing work. It is timely 
given the continued rise in research projects focusing on Arabic language 
resources. This sixth edition aims to encourage researchers and practitioners 
of Arabic language technologies, including CL, NLP, and IR, to share and 
discuss their latest research efforts, corpora, and tools. The workshop will 
also give special attention to Large Language Models (LLMs) and Generative AI, 
which are currently very active research topics. In addition to the general 
topics of CL, NLP, and IR, the workshop will place special emphasis on two 
shared tasks: Arabic LLMs Hallucination and Dialect to MSA Machine Translation.


*Submission Topics*
Language Resources:
- Pre-trained Arabic language models and their applications.
- Surveying and evaluating the design of available Arabic corpora and their 
associated processing tools.
- Making new annotated corpora available for NLP and IR applications such as 
named entity recognition, machine translation, sentiment analysis, text 
classification, and language learning.
- Evaluating the use of crowdsourcing platforms for Arabic data annotation.
- Open source Arabic processing toolkits.

Tools and Technologies:
- Language education, e.g., L1 and L2.
- Language modeling and pre-trained models.
- Tokenization, normalization, word segmentation, morphological analysis, 
part-of-speech tagging, etc.
- Sentiment analysis, dialect identification, and text classification.
- Dialect translation.
- Fake news detection.
- Web and social media search and analytics.
- Issues in the design, construction, and use of Arabic LRs: text, speech, 
sign, gesture, image, in single or multimodal/multimedia data.

Issues in the design, construction and use of Arabic LRs:
- Guidelines, standards, best practices and models for LRs interoperability.
- Methodologies and tools for LRs construction and annotation.
- Methodologies and tools for extraction and acquisition of knowledge.
- Ontologies, terminology and knowledge representation.
- LRs and Semantic Web (including Linked Data, Knowledge Graphs, etc.).

*Submissions*
- Submission Instructions: https://lrec-coling-2024.org/authors-kit/
- Submission Link: https://softconf.com/lrec-coling2024/osact2024/

*Workshop organizers*
- Hend Al-Khalifa (King Saud University, KSA)
- Hamdy Mubarak (Qatar Computing Research Institute, Qatar)
- Kareem Darwish (aiXplain Inc., US)
- Tamer Elsayed (Qatar University, Qatar)
- Mona Ali (Northeastern University, Canada)