More consistency is definitely a good thing and in the direction of what Gokhan's been talking about in the other thread. :)
Would you kindly open JIRA issues for these proposals and label them "gsoc2013" and "mentors"? Thanks!

On Thu, Mar 28, 2013 at 7:32 PM, Andy Twigg <[email protected]> wrote:

> I have two proposals:
> 1) I would be happy to support a similar effort on the classifier API.
> Indeed, I think the two APIs should be roughly the same, at least in
> terms of input/output so that pipelining etc is easier. (cf
> scikit-learn clustering/classifier/regression API)
>
> 2) As discussed before, I would like a cleaner interface for
> input/output and for specifying/inferring the data format. Marty has
> made great progress on ARFF formats for classifiers.
>
>
> On 28 March 2013 15:58, Dan Filimon <[email protected]> wrote:
> > I made an issue for Ted's first suggestion about reforming the clustering
> > APIs. [1]
> >
> > Ted, could you explain a bit more what you mean by "simplify the connection
> > to Lucene for clustering and classification"? It's too vague for an idea
> > proposal.
> >
> > Also, everyone, don't hesitate to post new ideas and volunteer. :)
> > Tomorrow (March 29) is the deadline!
> >
> > [1] https://issues.apache.org/jira/browse/MAHOUT-1177
> >
> > On Wed, Mar 27, 2013 at 5:08 PM, Isabel Drost-Fromm <[email protected]> wrote:
> >
> >> On Tuesday, March 26, 2013 07:26:22 PM Dan Filimon wrote:
> >> > Makes a lot of sense. Maybe you can give some advice to this year's
> >> > mentors though?
> >>
> >> Helping out and providing advice is no problem.
> >>
> >> Isabel
> >>
> >
>
> --
> Dr Andy Twigg
> Junior Research Fellow, St Johns College, Oxford
> Room 351, Department of Computer Science
> http://www.cs.ox.ac.uk/people/andy.twigg/
> [email protected] | +447799647538
