The START online demo http://start.csail.mit.edu/ is impressive,
and START is a possible architecture to investigate, but it does have
disadvantages: you don't have access to the underlying system, so you
cannot "retrain" your own version or add new questions and answers;
and once you get started, you need a "production line" system to
semi-automate the addition of new questions and answers, ideally by
semi-automatically extracting these from trusted sources such as those
discussed by others on the list.

An alternative worth investigating is the ALICE chatbot architecture.
ALICE is an online chatbot which can respond to user input (questions,
but also other comments etc.) with natural-language replies.  Anyone can
set up their own online version of ALICE at the Pandorabots hosting
website, http://www.pandorabots.com/.  You can either "add to" the
original English ALICE chatbot, adding your own model questions and
answers, or retrain the "brain" from scratch, replacing the default ALICE
question+answer templates with an entirely new set of templates.
Once you register as a "botmaster" with your own online chatbot, you can
monitor usage; when a user types in a question, the chatbot generates a
reply, but if you as botmaster decide the reply is inappropriate, you
can edit the template so that it gives a better reply next time.  Richard
Wallace, the original developer of ALICE, used this sort of "feedback
loop" to incrementally improve the responses generated by ALICE, so that
it gradually produced fewer and fewer inappropriate replies, and its
"brain" or set of question+answer templates grew to cover nearly all the
questions asked by casual users over the internet.

So, you could set up an "answers from the Quran" chatbot on the
Pandorabots website, with a small initial "seed" set of questions and
answers.  Then, as botmaster, you monitor the questions posed by users to
collect a set of typical questions; the initial default answer for many
will be "I have no answer for that", but you can replace this with a
better answer in the model for future use.
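
The seed-and-feedback loop described above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions, not
the Pandorabots interface: the class SeedBot, its methods reply and
teach, and the normalise helper are all hypothetical names, and matching
here is exact after normalisation rather than full AIML matching.

```python
import string

DEFAULT_REPLY = "I have no answer for that"


def normalise(text):
    """Uppercase and strip ASCII punctuation, roughly as AIML input is normalised."""
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.upper().split())


class SeedBot:
    """Hypothetical stand-in for a hosted chatbot with a botmaster log."""

    def __init__(self, seed):
        # seed: dict mapping model questions to model answers
        self.templates = {normalise(q): a for q, a in seed.items()}
        self.unanswered = []  # questions waiting for botmaster review

    def reply(self, question):
        key = normalise(question)
        if key in self.templates:
            return self.templates[key]
        if key not in self.unanswered:
            self.unanswered.append(key)  # log it for the botmaster
        return DEFAULT_REPLY

    def teach(self, question, answer):
        """Botmaster replaces the default reply with a better one."""
        key = normalise(question)
        self.templates[key] = answer
        if key in self.unanswered:
            self.unanswered.remove(key)


bot = SeedBot({"How many chapters are in the Quran?":
               "The Quran has 114 chapters (surahs)."})
print(bot.reply("What is the first chapter called?"))  # default reply
bot.teach("What is the first chapter called?",
          "The first chapter is Al-Fatiha.")
print(bot.reply("what is the first chapter called"))   # improved reply
```

Each pass through the log of unanswered questions grows the template
set, which is the same incremental process described for ALICE itself.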

Another advantage of the ALICE approach is that you can
semi-automatically extract question-and-answer templates from trusted
sources, e.g. trusted Quran Q+A websites, and convert these into the ALICE
brain's internal format, AIML (Artificial Intelligence Markup Language).
Bayan Abu Shawar's 2005 PhD thesis, "A corpus based approach to
generalising a chatbot system", described how to do this: she developed
Java tools for mapping a corpus of dialogue or questions+answers into
AIML, to import into a Pandorabots chatbot.
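
To give a flavour of the conversion step, here is a minimal Python sketch
that turns a list of question+answer pairs into AIML categories.  It is
not Abu Shawar's Java tooling, only an illustration: the function names
to_pattern and qa_pairs_to_aiml are hypothetical, the input is assumed to
be English text already extracted from the source, and the only AIML
elements used are aiml, category, pattern and template.

```python
import string
import xml.etree.ElementTree as ET


def to_pattern(question):
    """AIML patterns are conventionally uppercase words without punctuation."""
    cleaned = question.translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.upper().split())


def qa_pairs_to_aiml(pairs):
    """Build one AIML category per (question, answer) pair."""
    root = ET.Element("aiml", version="1.0.1")
    for question, answer in pairs:
        category = ET.SubElement(root, "category")
        ET.SubElement(category, "pattern").text = to_pattern(question)
        ET.SubElement(category, "template").text = answer.strip()
    return ET.tostring(root, encoding="unicode")


pairs = [("How many chapters are in the Quran?",
          "The Quran has 114 chapters (surahs).")]
print(qa_pairs_to_aiml(pairs))
```

A real "production line" would add a scraping or parsing stage in front
of this, plus a manual check of the generated categories before upload.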

Of course, the ALICE architecture does have its limitations and
disadvantages too! One is that AIML is based on the words of the text,
and cannot take into account parts-of-speech, syntax, dependency
structure or other "deep" linguistic analysis; at least not the current
version of AIML. But I suggest that it is at least worth investigating
as a vehicle for a Quran question-answering system to augment the
Quranic Arabic Corpus website.
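
The word-based limitation is easy to see in a toy matcher.  The Python
sketch below is a simplification written for illustration, not the real
AIML matching algorithm: it treats the wildcards * and _ alike as "one or
more whole words" and ignores AIML's priority rules.  The function names
pattern_to_regex and matches are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate an AIML-style pattern into a regex over whole words."""
    parts = []
    for word in pattern.split():
        if word in ("*", "_"):
            parts.append(r"\S+(?: \S+)*")  # one or more whole words
        else:
            parts.append(re.escape(word))
    return re.compile("^" + " ".join(parts) + "$")


def matches(pattern, user_input):
    """Normalise the input (uppercase, no punctuation) and test the pattern."""
    normalised = " ".join(re.sub(r"[^\w\s]", "", user_input).upper().split())
    return pattern_to_regex(pattern).match(normalised) is not None


# Same surface word order as the pattern: matches.
print(matches("WHO WROTE *", "Who wrote this book?"))
# Same meaning, different syntax: no match, since only word order is seen.
print(matches("WHO WROTE *", "This book was written by whom?"))
```

Covering the passive paraphrase needs a second pattern written by hand,
because nothing in the matching looks at syntax or dependency structure.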

Eric Atwell,
 Senior Lecturer, Language research group, School of Computing,
 Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
 TEL: 0113-3435430  FAX: 0113-3435468  WWW/email: google Eric Atwell

