OK, so let's sketch out here what these interfaces should look like.
Any proposal is more than welcome.

Regards,
Tommaso

2012/7/7 Thomas Jungblut <[email protected]>

> Looks fine to me.
> The key is the interfaces for learning and predicting, so we should define
> some vectors and matrices.
> It would be enough to define the algorithms via the interfaces, and a
> generic BSP should just run them based on the given input.
>
> 2012/7/7 Tommaso Teofili <[email protected]>
>
> > Hi all,
> >
> > in my spare time I started writing some basic BSP-based machine learning
> > algorithms for our ml module. Now I'm wondering, from a design point of
> > view, where it would make sense to put the training data / model. I'd
> > assume the obvious answer would be HDFS, so this makes me think we should
> > come up with (at least) two BSP jobs for each algorithm: one for learning
> > and one for "predicting", each to be run separately.
> > This would allow us to read the training data from HDFS and create a
> > model (also on HDFS); the created model could then be read (again from
> > HDFS) in order to predict an output for a new input.
> > Does that make sense?
> > I'm just wondering what a general-purpose design for Hama-based ML stuff
> > would look like, so this is just to start the discussion; any opinion is
> > welcome.
> >
> > Cheers,
> > Tommaso
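
As one possible starting point for the interfaces discussed above, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration, not an existing Hama API: the names `Learner`, `Predictor`, `ModelStore`, `SingleFeatureLeastSquares`, `LinearPredictor` and `TextFileStore` are hypothetical, a `double[]` stands in for a proper vector type, and a local temp file stands in for HDFS. It only shows the split between a learning step that writes a model and a separate predicting step that reads it back; how a generic BSP job would drive these interfaces is left open.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MlSketch {

  /** Learning side (hypothetical): builds a model of type M from training data. */
  interface Learner<M> {
    M learn(List<double[]> features, List<Double> targets);
  }

  /** Predicting side (hypothetical): applies a previously learned model to a new input. */
  interface Predictor<M> {
    double predict(M model, double[] features);
  }

  /** Model persistence (hypothetical); a local file stands in for HDFS here. */
  interface ModelStore<M> {
    void save(M model, Path path) throws IOException;
    M load(Path path) throws IOException;
  }

  /** Toy learner: least squares on the first feature only; model is {intercept, slope}. */
  static class SingleFeatureLeastSquares implements Learner<double[]> {
    @Override
    public double[] learn(List<double[]> features, List<Double> targets) {
      int n = features.size();
      double sumX = 0, sumY = 0;
      for (int i = 0; i < n; i++) {
        sumX += features.get(i)[0];
        sumY += targets.get(i);
      }
      double meanX = sumX / n, meanY = sumY / n;
      double sxy = 0, sxx = 0;
      for (int i = 0; i < n; i++) {
        double dx = features.get(i)[0] - meanX;
        sxy += dx * (targets.get(i) - meanY);
        sxx += dx * dx;
      }
      double slope = sxy / sxx;
      return new double[] {meanY - slope * meanX, slope};
    }
  }

  /** Toy predictor matching the model layout produced by the learner above. */
  static class LinearPredictor implements Predictor<double[]> {
    @Override
    public double predict(double[] model, double[] features) {
      return model[0] + model[1] * features[0];
    }
  }

  /** Stores the model as one weight per line in a text file. */
  static class TextFileStore implements ModelStore<double[]> {
    @Override
    public void save(double[] model, Path path) throws IOException {
      List<String> lines = new ArrayList<>();
      for (double w : model) {
        lines.add(Double.toString(w));
      }
      Files.write(path, lines);
    }

    @Override
    public double[] load(Path path) throws IOException {
      List<String> lines = Files.readAllLines(path);
      double[] model = new double[lines.size()];
      for (int i = 0; i < model.length; i++) {
        model[i] = Double.parseDouble(lines.get(i));
      }
      return model;
    }
  }

  public static void main(String[] args) throws IOException {
    List<double[]> xs = Arrays.asList(new double[] {1}, new double[] {2}, new double[] {3});
    List<Double> ys = Arrays.asList(3.0, 5.0, 7.0);
    Path modelPath = Files.createTempFile("model", ".txt");
    ModelStore<double[]> store = new TextFileStore();

    // "Job" 1: learn from the training data and persist the model.
    store.save(new SingleFeatureLeastSquares().learn(xs, ys), modelPath);

    // "Job" 2: run separately, load the stored model and predict for a new input.
    double[] model = store.load(modelPath);
    System.out.println(new LinearPredictor().predict(model, new double[] {4})); // prints 9.0

    Files.delete(modelPath);
  }
}
```

Keeping `Learner` and `Predictor` separate, with the model as the only thing passed between them, mirrors the "one job for learning, one for predicting" idea: either side could be swapped or rerun without touching the other, as long as both agree on the model format.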
