Hi all,

My name is Dileepa Jayakody, and I am an MSc research student at the University of Moratuwa, Sri Lanka. My research project (ReputationBox) is about predicting the goodness of incoming emails, based on a calculated reputation score, by analysing previous email conversations, email correspondents, their interests, etc. I think of it as a recommendation engine for email that rates and classifies incoming emails based on a reputation score.

The basic flow of my application is as follows:

1. The user authorizes my application, ReputationBox, to connect to their mailbox and read email.
2. ReputationBox performs an initial reputation analysis to build a reputation index over the past emails, imported as a batch. (This initial reputation index will be used as the training data for analysing new incoming emails.)
3. New emails are polled by or pushed to the ReputationBox server, and reputation analysis is performed in real time to predict their reputation.
4. The email reputation data is stored in the application.
5. The ReputationBox client web app presents the reputation data of the new emails. (Based on the reputation data, the client could be implemented as a priority inbox, spam filter, email categorizer, etc.)

I would like to seek advice on how to develop the reputation-analysis component of my application using Apache Mahout. I'm looking at the people, topics and actions mentioned in an email to derive the reputation. The high-level architecture diagram of the ReputationBox system is at [1].

I also plan to deploy my application on Google App Engine. Is Mahout deployable on GAE? I'm also planning to use Apache Isis to develop ReputationBox as a domain-driven application.

This is a proposed project for GSoC. For more information on my application, please see the JIRA issue [2].

Looking forward to your suggestions.

Thanks,
Dileepa

[1] https://issues.apache.org/jira/secure/attachment/12634802/EmailReputationSystem_v2.png
[2] https://issues.apache.org/jira/browse/ISIS-736
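
P.S. To make steps 2 and 3 of the flow above more concrete, here is a minimal sketch of what I mean by a reputation index. It is plain Python with no libraries and does not use any Mahout API. The field names (sender, topic, action), the action labels and their weights are all hypothetical placeholders, and the averaging rule is only a stand-in for whatever model turns out to be appropriate.

```python
from collections import defaultdict

# Hypothetical weights for how the user handled a past email.
ACTION_WEIGHTS = {
    "replied": 1.0,
    "read": 0.5,
    "ignored": 0.0,
    "deleted": -0.5,
    "spam": -1.0,
}


def build_index(past_emails):
    """Average the action weight per sender and per topic over the imported batch."""
    totals = {"sender": defaultdict(float), "topic": defaultdict(float)}
    counts = {"sender": defaultdict(int), "topic": defaultdict(int)}
    for email in past_emails:
        weight = ACTION_WEIGHTS[email["action"]]
        for key in ("sender", "topic"):
            totals[key][email[key]] += weight
            counts[key][email[key]] += 1
    return {
        key: {name: totals[key][name] / counts[key][name] for name in totals[key]}
        for key in totals
    }


def score(index, email, default=0.0):
    """Score a new email as the mean of its sender and topic reputation.

    Senders or topics not seen in the batch fall back to `default`.
    """
    sender_rep = index["sender"].get(email["sender"], default)
    topic_rep = index["topic"].get(email["topic"], default)
    return (sender_rep + topic_rep) / 2


# Step 2: build the index from a (made-up) batch of past emails.
past = [
    {"sender": "alice@example.com", "topic": "project", "action": "replied"},
    {"sender": "alice@example.com", "topic": "lunch", "action": "read"},
    {"sender": "promo@example.com", "topic": "offers", "action": "spam"},
]
index = build_index(past)

# Step 3: score a new incoming email against the index.
new_score = score(index, {"sender": "alice@example.com", "topic": "project"})
```

With these made-up values, a new email from a sender and on a topic that the user usually replies to gets a score near the top of the [-1, 1] range, while an unknown sender and topic get the neutral default. My question is essentially which Mahout components could replace this hand-written averaging.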
