Mahout seems to be at a stage where we have covered most of the interesting 
machine learning problems and where it is being used in production by quite a 
few developers - hey, we even have a book that is now available in print.

Maybe it's time to start taking first steps towards a 1.0 release. One* 
important step in my opinion is to define what kind of backwards compatibility 
guarantees we want to give our users - and what guarantees our users really 
need - after releasing 1.0.

Just a rough list below - feel free to extend, shrink and change:

1) Data input formats - people probably do not want to re-generate vectors from 
their original data every time they use a new Mahout version.

2) Model formats - people probably do not want to have to retrain a model only 
to make it work with the latest and greatest features of a new Mahout release.

3) Model output - when upgrading, users probably want to receive model output 
that can be integrated into their system the same way as with the older release.

4) APIs - I don't see us keeping all interfaces or even abstract classes 
stable. However, users should know which APIs we consider "public facing" and 
will likely keep stable. Maybe an annotation makes that clear?

5) Command line scripts - is there a significant user base relying on the 
bin/mahout script to warrant working towards keeping that stable between 
releases?
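
To make the annotation idea in 4) a bit more concrete, here is a rough sketch 
of what such a marker could look like. Nothing in it exists in Mahout today: 
the names PublicApi, ExampleRecommender, InternalHelper and StabilityDemo are 
all made up for illustration, and the real thing might well look different.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class StabilityDemo {

  /**
   * Hypothetical marker annotation (not part of Mahout): types carrying it
   * are the ones we would promise to keep stable after 1.0.
   */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface PublicApi {
    /** First release in which the type was declared stable. */
    String since();
  }

  /** Made-up example of an interface we would consider public facing. */
  @PublicApi(since = "1.0")
  public interface ExampleRecommender {
    double estimatePreference(long userId, long itemId);
  }

  /** Made-up example without the marker: free to change between releases. */
  public interface InternalHelper {
    void run();
  }

  /** Returns true if the given type carries the PublicApi marker. */
  public static boolean isPublicApi(Class<?> type) {
    return type.isAnnotationPresent(PublicApi.class);
  }

  public static void main(String[] args) {
    System.out.println(isPublicApi(ExampleRecommender.class));
    System.out.println(isPublicApi(InternalHelper.class));
  }
}
```

With @Documented the marker would show up in the generated Javadoc, and 
runtime retention would let a build-time check list everything we have 
promised to keep stable. Whether we need runtime retention at all is open for 
discussion.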

Most likely I've forgotten about other vital pieces - just wanted to kick off 
that discussion.


Isabel


* though not the only one - others include but are not limited to the time 
frame for which we offer support for any given release.
