I have two proposals:

1) I would be happy to support a similar effort on the classifier API. Indeed, I think the two APIs should be roughly the same, at least in terms of input/output, so that pipelining etc. is easier (cf. the scikit-learn clustering/classifier/regression API).
2) As discussed before, I would like a cleaner interface for input/output and for specifying/inferring the data format. Marty has made great progress on ARFF formats for classifiers.

On 28 March 2013 15:58, Dan Filimon <[email protected]> wrote:
> I made an issue for Ted's first suggestion about reforming the clustering
> APIs. [1]
>
> Ted, could you explain a bit more what you mean by "simplify the connection
> to Lucene for clustering and classification"? It's too vague for an idea
> proposal.
>
> Also, everyone, don't hesitate to post new ideas and volunteer. :)
> Tomorrow (March 29) is the deadline!
>
> [1] https://issues.apache.org/jira/browse/MAHOUT-1177
>
> On Wed, Mar 27, 2013 at 5:08 PM, Isabel Drost-Fromm <[email protected]> wrote:
>
>> On Tuesday, March 26, 2013 07:26:22 PM Dan Filimon wrote:
>> > Makes a lot of sense. Maybe you can give some advice to this year's
>> > mentors though?
>>
>> Helping out and providing advice is no problem.
>>
>> Isabel
>>

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
[email protected] | +447799647538
