Hello comp-quran members,

I am writing to ask for your advice on a printed version of the Quranic Arabic Corpus. We have recently received many requests for something that users can download for offline use. Because I believe the accuracy of the grammar is now quite reasonable, I am considering this. In any case, we can always update what we produce as the grammar improves.
What I had in mind was a set of PDF files that could be downloaded. For example, perhaps 30 PDFs (1 per juz of the Quran, see: http://en.wikipedia.org/wiki/Juz'). I would be keen to find out from members of the mailing list whether they think this is a good idea and, if so, what the best format would be.

I was thinking of starting with the word-by-word grammar (not the syntactic treebank), perhaps with the information here: http://corpus.quran.com/wordbyword.jsp

The problem is that if we displayed those images at the same resolution, with the same information, that comes out to about 7 Quranic words per printed A4 page. Given that there are 77,430 Arabic words in the Quran (counting whitespace-separated words), that would give roughly 11,061 pages in total, or about 369 pages per juz (i.e. about 369 pages for each of the 30 PDF files). That doesn't sound very reasonable to me. If we shrink the images and text by 50%, that would give about 5,530 pages in total.

Do you think that perhaps we should display only part-of-speech tags to save space? Or do Quranic researchers and students prefer the whole grammar written out in textual form? I'm open to any suggestions on how best to display this information in printed form for a set of PDFs.

Feel free to reply directly to the mailing list, just hit "reply all": comp-quran@comp.leeds.ac.uk

Looking forward to hearing from you.

Kind Regards,
--
Kais Dukes
Language Research Group
School of Computing
University of Leeds
http://corpus.quran.com - The Quranic Arabic Corpus
comp-quran@comp.leeds.ac.uk - Computational Quranic Arabic discussion list