We are pleased to announce the Qabas computational lexicon
We are very happy to release
Qabas - Open-Source Lexicographic Database
Qabas = 60k lemmas + manually linked with 12 corpora (2.3M tokens) + 110
lexicons (~300k lemmas)
Birzeit University's SinaLab for Computational Linguistics and
Artificial Intelligence [1] has officially launched Qabas [2], an
open-source lexicographic database for Arabic, designed specifically for
Natural Language Processing (NLP) applications.
Qabas stands out by linking its lexical entries (lemmas) with lemmas
from 110 different lexicons and numerous morphologically annotated
corpora (around 2 million tokens), creating an extensive lexicographic
graph. This project has been under development for over fourteen years.
Lexicons have evolved from being primarily hard-copy resources for human
use to having substantial significance in NLP applications. Although
Arabic is a highly resourced language in terms of traditional lexicons,
not enough attention is given to developing AI-oriented lexicographic
databases. Additionally, none of the existing Arabic lexicons is
available as open source, owing to copyright restrictions imposed by
their owners. Qabas, by contrast, is an open-source Arabic lexicon
designed for NLP applications, and its novelty lies in its synthesis of
many lexical resources. Each lexical entry (i.e., lemma) in Qabas is linked with
equivalent lemmas in 110 other lexicons, and with 12
morphologically-annotated corpora (about 2M tokens). The philosophy of
Qabas is to construct a large lexicographic data graph by linking
existing Arabic lexicons and annotated corpora. Qabas stands as the
largest Arabic lexicon, encompassing about 58K lemmas (45K nominal
lemmas, 12.5K verbal lemmas, and 500 function word lemmas).
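
To make the idea of a lexicographic data graph concrete, here is a
minimal Python sketch of one lemma linked to equivalent lemmas in other
lexicons and to tokens in annotated corpora. It is an illustration only:
the class, field names, identifiers, and resource names (lexicon_A,
corpus_X, and so on) are hypothetical and do not reflect the actual
Qabas schema or download format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Lemma:
    """One lexical entry. All field names are hypothetical, not the Qabas schema."""
    lemma_id: str
    form: str
    pos: str  # e.g. "noun", "verb", or "function"
    # lexicon name -> ids of equivalent lemmas in that lexicon
    lexicon_links: Dict[str, List[str]] = field(default_factory=dict)
    # corpus name -> positions of tokens annotated with this lemma
    corpus_links: Dict[str, List[int]] = field(default_factory=dict)


def link_lexicon(lemma: Lemma, lexicon: str, external_id: str) -> None:
    """Record that `external_id` in `lexicon` is equivalent to this lemma."""
    lemma.lexicon_links.setdefault(lexicon, []).append(external_id)


def link_token(lemma: Lemma, corpus: str, position: int) -> None:
    """Record that the token at `position` in `corpus` carries this lemma."""
    lemma.corpus_links.setdefault(corpus, []).append(position)


def linked_resources(lemma: Lemma) -> Set[str]:
    """Names of all lexicons and corpora this lemma is connected to."""
    return set(lemma.lexicon_links) | set(lemma.corpus_links)


def build_example() -> Lemma:
    """Build one placeholder entry; every value here is made up."""
    entry = Lemma(lemma_id="q-0001", form="kitab", pos="noun")
    link_lexicon(entry, "lexicon_A", "A-17")
    link_lexicon(entry, "lexicon_B", "B-402")
    link_token(entry, "corpus_X", 1289)
    return entry


if __name__ == "__main__":
    print(sorted(linked_resources(build_example())))
```

Storing links per resource name keeps each lemma as a node whose edges
can be followed into any linked lexicon or corpus, which is the sense in
which the linked resources form a graph.
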
Prof. Mustafa Jarrar, the project's manager and main author,
emphasized the importance of making Qabas freely available as an
open-source resource, allowing everyone to access and use it for both
commercial and non-commercial purposes. Prof. Jarrar hopes that
researchers, companies, and software developers will leverage the
lexicon's data to develop innovative content and applications that
benefit humanity.
Prof. Talal Shahwan, President of Birzeit University, stated that
despite the challenging conditions in Palestine, the university remains
committed to excellence and to its mission towards knowledge. He
emphasized that this achievement was made possible by the dedication of
the university's faculty and researchers.
Qabas is publicly available online at: https://sina.birzeit.edu/qabas
[2]
To download Qabas and find out more, see:
https://sina.birzeit.edu/qabas/about [3]
Article: https://www.jarrar.info/publications/JH24.pdf [4]
We'd love your feedback:
Facebook: https://www.facebook.com/watch?v=880418097306662 [5]
LinkedIn: https://www.facebook.com/watch?v=880418097306662 [5]
Best
--Mustafa
__________________________
Mustafa Jarrar, PhD
Professor of Artificial Intelligence
Chair, PhD Program in Computer Science
Birzeit University, Palestine
Page: http://www.jarrar.info [6]
SinaLab: https://sina.birzeit.edu [1]
Links:
------
[1] https://sina.birzeit.edu/
[2] https://sina.birzeit.edu/qabas
[3] https://sina.birzeit.edu/qabas/about
[4] https://www.jarrar.info/publications/JH24.pdf
[5] https://www.facebook.com/watch?v=880418097306662
[6] http://www.jarrar.info/