Mahout seems to be at a stage where we have covered most of the interesting machine learning problems and where it is being used in production by quite a few developers - hey, we even have a book that is now available in print.
Maybe it's time to start taking first steps towards a 1.0 release. One* important step in my opinion is to define what kind of backwards compatibility guarantees we want to give our users - and what guarantees our users really need - after releasing 1.0. Just a rough list below - feel free to extend, shrink and change:

1) Data input formats - people probably do not want to re-generate vectors from their original data every time they use a new Mahout version.

2) Model formats - people probably do not want to have to retrain a model only to make it work with the latest and greatest features of a new Mahout release.

3) Model output - when upgrading, users probably want to receive model output that can be integrated into their system the same way as with the older release.

4) APIs - I don't see us keeping all interfaces or even abstract classes stable. However, users should know which APIs we consider "public facing" and will likely keep stable. Maybe an annotation makes that clear?

5) Command line scripts - is there a significant user base relying on the bin/mahout script to warrant working towards keeping that stable between releases?

Most likely I've forgotten about other vital pieces - just wanted to kick off that discussion.

Isabel

* though not the only one - others include, but are not limited to, the time frame for which we offer support for any given release.
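
P.S. To make the annotation idea from point 4 a bit more concrete, here is a minimal sketch of what such a marker could look like. Everything in it is made up for illustration: `PublicApi`, `ExampleClassifier`, `InternalHelper` and `ApiStabilityDemo` are hypothetical names, not existing Mahout classes, and the `since` attribute is just one possible way to record when the guarantee started.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Hypothetical marker for APIs we intend to keep stable after 1.0.
 * This is an illustration only, not an existing Mahout annotation.
 */
@Documented
@Retention(RetentionPolicy.RUNTIME) // kept at runtime so tools can inspect it
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR})
@interface PublicApi {
    /** First release from which the stability promise applies (assumed attribute). */
    String since() default "1.0";
}

/** Hypothetical public facing interface: marked, so users may rely on it. */
@PublicApi
interface ExampleClassifier {
    double classify(double[] features);
}

/** Hypothetical internal class: unmarked, so it may change between releases. */
class InternalHelper {
}

class ApiStabilityDemo {

    /** Returns true if the given type carries the stability marker. */
    static boolean isPublicApi(Class<?> type) {
        return type.isAnnotationPresent(PublicApi.class);
    }

    public static void main(String[] args) {
        System.out.println("ExampleClassifier stable: " + isPublicApi(ExampleClassifier.class));
        System.out.println("InternalHelper stable: " + isPublicApi(InternalHelper.class));
    }
}
```

Because the marker is `@Documented`, it would also show up in the generated Javadoc, so users could see the promise without reading the source. Whether runtime retention is really needed, or whether a source-level marker would be enough, is open for discussion.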