Abdul-Baquee,
this is an excellent idea in principle - word clouds, Wordle
(http://www.wordle.net/) and variants like Tagul (http://tagul.com/)
have become widely known tools for visualising the key terms
in a text, so it's about time someone did this for Quran computing.
I can suggest a number of ways you might consider taking this further:
- you currently offer a list of words in a chapter (or selected collection of
chapters), excluding stopwords, with the font-size of each word
representing the frequency of that word.
Wordle offers more visually appealing visualisations: a variety of
colours and fonts, with words laid out in a variety of formats.
Wordle highlights the "main" words much more clearly, especially for a
long text or large collection: low-frequency words disappear, leaving
only a few high-frequency words.
Maybe you could offer Wordle outputs instead of (or as well as)
the word-clouds you generate by PHP?
- you only offer Arabic word-clouds. The word-by-word translations in
the Quranic Arabic Corpus offer in principle the possibility of a
word-cloud and/or Wordle of the English translation, or (even better)
both English and Arabic Word-clouds ... either side-by-side, or
together. You will have some issues in tokenisation etc to sort out
though.
- you might ask viewers for feedback on what they might use this
for; and then later go on to add these applications to your website,
as suggestions for future users. I imagine such a visualisation might
help a learner/teacher to see the main points to focus on first; or to
see common recurring themes in a collection of texts (assuming high-freq
words must be recurring); or to introduce new learners to a new way of
studying and appreciating the texts; etc ...
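
To make the first suggestion concrete, here is a minimal sketch in Python
(not your PHP code, which I have not seen) of the basic word-cloud
calculation: drop stopwords, count word frequencies, keep only the top few
words as Wordle seems to, and scale font size with frequency. The function
name, the tiny English stopword list, the whitespace tokenisation and the
linear scaling between min_size and max_size are all my assumptions for
illustration; Arabic text would need proper tokenisation.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "of", "and", "in", "to", "a"}


def word_cloud_sizes(text, stopwords=STOPWORDS, top_n=None,
                     min_size=10, max_size=48):
    """Map each retained word to a font size that grows with its frequency."""
    # Naive whitespace tokenisation; an assumption, not adequate for Arabic.
    tokens = [t for t in text.lower().split() if t not in stopwords]
    counts = Counter(tokens)

    # Keep only the top_n most frequent words (all words if top_n is None),
    # so that low-frequency words drop out of the cloud.
    ranked = counts.most_common(top_n)
    if not ranked:
        return {}

    freqs = [freq for _, freq in ranked]
    lowest, highest = min(freqs), max(freqs)
    span = highest - lowest

    sizes = {}
    for word, freq in ranked:
        if span == 0:
            # All retained words are equally frequent: use one size.
            sizes[word] = max_size
        else:
            # Linear interpolation between min_size and max_size.
            scaled = (freq - lowest) * (max_size - min_size) / span
            sizes[word] = round(min_size + scaled)
    return sizes
```

Calling word_cloud_sizes(text, top_n=20) would keep only the twenty most
frequent content words, which is roughly the "main words only" effect
described above; a logarithmic scale might suit long texts better than the
linear one assumed here.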
This could be a whole new line in Quranic Computing research...
Eric
Eric Atwell, School of Computing, Leeds University
On Sat, 23 Jan 2010, Abdulbaqi Sharaf wrote:
Hello,
I have implemented "word cloud" for Qur'anic surahs:
http://www.textminingthequran.com/php/wordcloud.html
You can choose more than one Sura, and can restrict the cloud to content
words. The list of stop words is preliminary and should grow in future.
Have a try and let me know your feedback.
best,
Abdul-Baquee M. Sharaf
PhD Student
Language Technologies Group
School of Computing
University of Leeds
UK
--
Eric Atwell,
Senior Lecturer, Language research group, School of Computing,
Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
TEL: 0113-3435430 FAX: 0113-3435468 WWW/email: google Eric Atwell