I am also looking for such a resource :-)
On Thu, Apr 1, 2010 at 10:07 PM, johan zahri johan.i.za...@gmail.com wrote:
Peace be upon you,
Is there any web resource we can read on determining the root patterns
of Arabic words?
2010/3/30 Kais Dukes k...@kaisdukes.com
Salamu Alaykum Fazlul Haque,
To the best of my knowledge, our project is the first accurate,
computer-based morphological annotation of the Quran, so I would be
surprised to see an accurate unique word count from another secondary
source. Of course, I could be wrong. Counting the unique Arabic words
in the Quran is not straightforward, because in Arabic the concept of
a word can have multiple technical linguistic interpretations. Based
on the annotation we have performed so far at the Quranic Arabic
Corpus (http://corpus.quran.com), I can provide the following
statistics:
Total number of space-separated words = 77,430
Number of *unique* surface forms (i.e. space-separated word-forms,
including clitics) = 18,994
Number of unique words by *stem* = 12,183
Number of unique words by *root* = 1,685 (not necessarily a great
metric for unique word counting, e.g. pronouns have no Semitic root)
Number of unique words by *lemma* = 3,382 (excluding verbs, and other
words where lemma is not annotated).
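
To illustrate why the counts above differ so much, here is a minimal
sketch of counting the same tokens under each definition of "word".
Everything in it is an assumption made for illustration: the Token
record, the field names, and the six transliterated toy tokens are
hypothetical and are not taken from the corpus or its data format.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """One space-separated word with hypothetical annotation fields."""
    surface: str            # the word-form as written, clitics included
    stem: str               # the word-form with clitics stripped
    root: Optional[str]     # None where the word has no Semitic root
    lemma: Optional[str]    # None where no lemma is annotated


# Toy data invented for this example (not from the Quranic Arabic Corpus).
TOKENS = [
    Token("wa-kitab", "kitab", "ktb", "kitab"),   # "and a book"
    Token("kitab", "kitab", "ktb", "kitab"),      # "a book"
    Token("al-kitab", "kitab", "ktb", "kitab"),   # "the book"
    Token("kataba", "kataba", "ktb", None),       # verb, lemma not annotated
    Token("huwa", "huwa", None, "huwa"),          # pronoun, no root
    Token("kitab", "kitab", "ktb", "kitab"),      # repeat of an earlier form
]


def count_unique(tokens, field):
    """Count distinct non-missing values of one annotation field."""
    values = (getattr(token, field) for token in tokens)
    return len({value for value in values if value is not None})


def summarize(tokens):
    """Return the total token count and the unique count per definition."""
    summary = {"total": len(tokens)}
    for field in ("surface", "stem", "root", "lemma"):
        summary[field] = count_unique(tokens, field)
    return summary


if __name__ == "__main__":
    for name, count in summarize(TOKENS).items():
        print(f"{name}: {count}")
```

On this toy data the six tokens reduce to 5 surface forms, 3 stems,
2 lemmas and 1 root. Tokens with a missing root or lemma simply drop
out of that count, which is the same caveat noted for the root and
lemma figures above.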
..