Salam Raja and Ayesha,

Thank you both for your related e-mails below. Please let me explain the difference between the two word counts for the Quran that you have noticed on the website, i.e. 77,430 versus 74,129.

The first figure is the total number of whitespace-separated word-forms in the Quran, according to our current counting scheme. Other counting schemes (e.g. the Quran Printing Complex) differ by only one or two words. We have adopted the current word separation scheme of the Tanzil project. I'm not saying either one is right or wrong (that is a separate, long and detailed discussion, about exactly when we should split the compound word ba3ad+ma or not :-)
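As a rough illustration of what "whitespace-separated word-forms" means in practice, here is a minimal Python sketch. It is not the project's actual counting code: the function name `count_word_forms` and the one-verse transliterated sample are assumptions made purely for illustration, and the only figures taken from this e-mail are the two totals at the end.

```python
def count_word_forms(verses):
    """Count whitespace-separated word-forms across an iterable of verse strings."""
    # str.split() with no arguments splits on any run of whitespace,
    # so repeated spaces or tabs do not create empty tokens.
    return sum(len(verse.split()) for verse in verses)


# Hypothetical sample: one illustrative transliterated verse,
# not the text file actually used by the project.
sample = ["bismi Allahi alrrahmani alrraheemi"]
print(count_word_forms(sample))  # 4

# The two totals quoted in this e-mail, and the gap between them.
total_word_forms = 77_430
concordance_total = 74_129
print(total_word_forms - concordance_total)  # 3301
```

Under this kind of scheme, a prefixed particle such as wa or fa is not counted separately when it is written attached to the following word; it is simply part of that word-form.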
With regard to the second word count you mention, 74,129, I presume that you got this number by adding up all the words on the project's verb concordance page and on the lemma concordance page:

http://corpus.quran.com/verbs.jsp
http://corpus.quran.com/lemmas.jsp

So the next question is: if there are 77,430 words in the Quran, and if we count all the words that are verbs and the words that have lemmas, why do we only get 74,129 words? The answer is simple: the remaining words are not verbs and have not been tagged with lemmas. In particular, the remaining words are 30 disconnected initial letters (e.g. Alif-Lam-Meem) and the rest are independent pronouns (e.g. huwa, hiya, etc). We felt that lemmas were not applicable for these words (this again is open to discussion).

For your convenience, I've included a brief explanation of the difference between lemma, root and stem from an upcoming journal submission. I hope this helps!

As your question has been asked a couple of times before, hopefully with the next version of the website we can add the explanation I just gave to the concordance pages, so as to explain in more detail the difference between these two word counts for the Quranic text.

w/salam,

-- Kais

http://www.kaisdukes.com - web
http://kaisdukes.wordpress.com - blog

------------------------------------

Lemmas, Stems and Roots:

"Two other popular resources provided alongside corpus annotations are the Quranic dictionary and morphological search. Both these resources are based around root, lemma and stem, which in Arabic linguistics are distinct concepts. Roots are an abstract grouping of words, and lemmas are a further subdivision. The root of an Arabic word is not a word itself, but a sequence of three or four letters, known as radicals, from which most words can be derived through the Arabic template-pattern system.
A lemma is a real representative word that groups together other related words that differ by inflection, and is used as entry headers in standard Arabic dictionaries. The simplest non-inflected form of a word is chosen as the lemma: third person masculine for verbs and singular for nouns. Stems arise in morphological segmentation and are not necessarily actual words. After removing clitics from a compound word-form, the stem will remain."

Extracted from: K. Dukes, E. Atwell and N. Habash (2010). Supervised Collaboration for Syntactic Annotation of Quranic Arabic. Submitted to the Language Resources and Evaluation Journal (LREJ). Special Issue on Collaboratively Constructed Language Resources.

-------------------------------------------

> From: r...@um.edu.my
> Sent: Friday, July 02, 2010 4:15:59 AM
> To: Kais Dukes
> Subject: Verb concordance and Lemma List
> Auto forwarded by a Rule
>
> Dear Kais Dukes,
>
> I have copied and pasted your list of lemmas and your verb concordance as
> displayed on the website. This information was put in the Excel sheet
> attached to this email.
>
> As I was interested to know the word count of Qur'anic words, I used your
> lists to count the words. I have found an error in the numbers provided in
> the lists, or maybe it was just me counting those words wrongly:
>
> In your paper, you've pointed out that the Qur'an should contain 77,430
> words, but I only find 74,129 by adding up the frequencies of all the
> words in both lists.
>
> Please verify why I only get 74,129 and not 77,430. Is it because you did
> not add particle words such as wa and fa? I looked up Tanzil.info; they do
> include those as words.
>
> Thank you.
>
> --
> Raja Jamilah bt Raja Yusof
> Pensyarah
> Fakulti Sains Komputer dan Teknologi Maklumat
> Universiti Malaya

==============================================

From: Ayesha Nicole <ayesha_nic...@yahoo.com>
Date: Thu, Jul 1, 2010 at 11:14 PM
Subject: Re: List of Nouns in Qur'an?
To: Kais Dukes <k...@kaisdukes.com>

wa alaikum as salaam.

Yes. :)

JAK ASA
Ayesha

---------- Previous message ----------
From: Kais Dukes <k...@kaisdukes.com>
To: Ayesha Nicole <ayesha_nic...@yahoo.com>
Sent: Thu, July 1, 2010 4:45:03 PM
Subject: Re: List of Nouns in Qur'an?

Salam Ayesha,

Have a look here, does this help?

http://corpus.quran.com/lemmas.jsp

Kind Regards,

-- Kais

---------- Previous message ----------
On Thu, Jul 1, 2010 at 8:02 PM, Ayesha Nicole <ayesha_nic...@yahoo.com> wrote:

Dear Kais,

How are you? I am re-sending the email below to the correct address. Thank you.

Sincerely,
Ayesha

---------- Previous message ----------
From: Ayesha Nicole <ayesha_nic...@yahoo.com>
To: dukes.k...@googlemail.com
Sent: Thu, July 1, 2010 3:01:13 PM
Subject: Re: List of Nouns in Qur'an?

Dear Kais,

How are you? Is there a way to create a list of all nouns that occur in the Qur'an? And by type of noun? And by location in the Qur'an? Please see below for the details behind my request. Thank you.

Sincerely,
Ayesha Nicole

---------- Previous message ----------
From: Abdulazeez Abdulraheem <toaz...@gmail.com>
To: Ayesha Nicole <ayesha_nic...@yahoo.com>
Sent: Thu, July 1, 2010 2:54:58 AM
Subject: Re: List of Nouns in Qur'an?

Sr Ayesha

Assalam alaikum wa rahmatullah

I wish the following link had such an option. You can write to them. Previously, he (Br Qais Duke) was willing to share the database. A simple modification in the software should give us all the nouns in one click!
http://corpus.quran.com/treebank.jsp

Wassalam
Abdulazeez Abdulraheem

---------- Previous message ----------
On Thu, Jul 1, 2010 at 2:25 AM, Ayesha Nicole <ayesha_nic...@yahoo.com> wrote:

as salaam 'alaikum wa rahmatu Allahi wa barakatuh, Dr. Abdul-Raheem,

I pray this reaches you in the best of health and imaan.

I was reading this intriguing article on how children acquire language, with nouns being first, and then verbs, and then adjectives:

http://www.sciencedaily.com/releases/2004/09/040915113243.htm

I am reading through this document on basic Arabic grammar that has several examples of types of nouns (pages 13 - 33) and wonder if there is a list by the same grouping of all nouns and their meanings, and location in the Qur'an?

http://www.al-islam.org/hawza/lugha/ArabicNouns.pdf

And how do nouns differ between Arabic and English?

JAK ASA
Ayesha