james server contains a functional but basic bayesian spam filter. in the past, suggestions have been made (the JIRA eludes me ATM) that we factor out this function so that it can be developed independently. i think the time is right to do this:
1 a great deal of progress has been made over in hadoopland with advanced distributed machine learning (hama, mahout). integrating this into james would provide industrial-strength distributed filtering capabilities.

2 machine learning algorithms can also be employed to improve the user experience. i'm very keen on automated approaches to tagging. these tags would then be used to filter etc.

i'm willing to push this forward if i have the support of the community.

note i suggest "apache-james-machine-learning" since the product will need to be multi-module. so (following stefano's rule) i suggest it is created at top level.

note that mailets along these lines are IMO beyond the assembly capability of Phoenix. (IMAP and jsieve are increasingly difficult to cleanly assemble so this problem is more general.)

i'm +1

- robert

--8<----------------------------------------------------------------------------------------
[ ] +1 Create apache-james-machine-learning
[ ] +0
[ ] -0
[ ] -1 Do not create apache-james-machine-learning
---------------------------------------------------------------------------------------------
