On Mon, Aug 9, 2010 at 5:22 PM, Ted Dunning <[email protected]> wrote:
> It doesn't quite feel like 0.9 to me yet.
>
> To clarify, to me 0.9 means a compatibility release like in Lucene with
> little functionality difference between 0.9 and 1.0.
>
> Jake and Robin and I were talking the other evening and a common lament
> was that our classification (and clustering) stuff was all over the map
> in terms of data structures. Driving that to rest and getting those
> comments even vaguely as plug and play as our much more advanced
> recommendation components would be very, very helpful.

I definitely agree. I think we need to standardize classification, LDA,
clustering, and recommendations to use some similar formats. It would be
really nice if someone could take the *same* input, and do LDA on it, SVD,
and fuzzy k-means, and then use the induced metric on your inputs to decide
which works best, and then use that to feed into an item-based recommender,
for example.

It's definitely not yet "almost 1.0", in my eyes, at least. Once we have
1.0, we are going to have to be way careful on changing APIs, and I
personally think we have some APIs which could change a bit in the next few
minor releases, especially in the distributed linear algebra space (i.e.,
things related to DistributedRowMatrix and its cousins).

We may also, for example, want to explore an API change/addition which
allows "vectors" which are boolean or float valued (for space savings), and
keyed on longs (for the truly large-scale case).

Another thing that we will want to do before 1.0 is fully integrate what we
want out of COLT.

That's just off the top of my head, for things we should at least have some
story about before 1.0.

  -jake

> On Mon, Aug 9, 2010 at 5:48 AM, Grant Ingersoll <[email protected]> wrote:
>
> > So, how about we shoot for pencils down on September 1? That should
> > give us enough time to incorporate GSOC, M-228, etc. Then, we can do a
> > 5-7 day freeze and then release.
> >
> > Any thoughts on 0.4 vs. 0.9? I'm kind of leaning towards 0.9, but I
> > don't want to paint us into a corner either. From what I've seen, many
> > of our APIs are firming up. That being said, maybe two more releases
> > pre 1.0 would be good.
> >
> > -Grant
> >
> > On Aug 9, 2010, at 5:54 AM, Sebastian Schelter wrote:
> >
> > > Regarding the issues I work on, I wanna see MAHOUT-460 (add
> > > "maxPreferencesPerItemConsidered" to the ItemSimilarityJob) and
> > > MAHOUT-457 (make ItemSimilarityJob and RecommenderJob work on
> > > ElasticMapReduce) being included in the 0.4 release. It should be no
> > > problem to get them done until September.
> > >
> > > --sebastian
> > >
> > > On 08.08.2010 21:20, Grant Ingersoll wrote:
> > >> Now that GSOC is almost done, seems like the time to start thinking
> > >> about 0.4 (or maybe 0.9, i.e. the last release before 1.0?)
> > >> Obviously, we still need to incorporate in much of the GSOC work, so
> > >> reality says Sept. or October for a release, but maybe sooner if we
> > >> are sufficiently motivated.
> > >>
> > >> Thoughts?
> > >>
> > >> -Grant
