Re: [ML] - data storage and basic design approach

Tommaso Teofili Mon, 09 Jul 2012 08:48:31 -0700

OK, so let's sketch out here what these interfaces should look like.
Any proposal is more than welcome.
Regards,
Tommaso


2012/7/7 Thomas Jungblut <[email protected]>

> Looks fine to me.
> The key parts are the interfaces for learning and predicting, so we should
> define some vectors and matrices.
> It would be enough to define the algorithms via the interfaces and a
> generic BSP should just run them based on the given input.
>
> 2012/7/7 Tommaso Teofili <[email protected]>
>
> > Hi all,
> >
> > in my spare time I started writing some basic BSP-based machine learning
> > algorithms for our ml module. Now I'm wondering, from a design point of
> > view, where it'd make sense to put the training data / model. I'd assume
> > the obvious answer would be HDFS, so this makes me think we should come up
> > with (at least) two BSP jobs for each algorithm: one for learning and one
> > for "predicting", each to be run separately.
> > This would allow us to read the training data from HDFS and create a
> > model (also on HDFS); the created model could then be read (again from
> > HDFS) in order to predict an output for a new input.
> > Does that make sense?
> > I'm just wondering what a general purpose design for Hama based ML stuff
> > would look like so this is just to start the discussion, any opinion is
> > welcome.
> >
> > Cheers,
> > Tommaso
> >
>
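
To make the proposal in this thread concrete, here is a minimal sketch of what a learning interface, a predicting interface, and the two separate phases might look like. Everything in it is an assumption for illustration only: the names Learner, Predictor, LinearModel, GradientDescentLearner, LinearPredictor and TwoPhaseDemo are hypothetical and not part of Hama, plain double arrays stand in for the vector and matrix types still to be defined, a local file stands in for HDFS, and ordinary method calls stand in for the two BSP jobs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical model: a weight vector stored as one comma-separated line. */
final class LinearModel {
    final double[] weights;

    LinearModel(double[] weights) { this.weights = weights.clone(); }

    /** Writes the model to a file (a local stand-in for a model on HDFS). */
    void save(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < weights.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(weights[i]);
        }
        Files.write(path, sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    static LinearModel load(Path path) throws IOException {
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        String[] parts = text.trim().split(",");
        double[] w = new double[parts.length];
        for (int i = 0; i < parts.length; i++) w[i] = Double.parseDouble(parts[i]);
        return new LinearModel(w);
    }
}

/** Hypothetical learning interface: training data in, model out. */
interface Learner<M> { M learn(double[][] features, double[] targets); }

/** Hypothetical predicting interface: model plus one new input, output out. */
interface Predictor<M> { double predict(M model, double[] features); }

/** Example algorithm: least-squares fit by batch gradient descent, no bias term. */
final class GradientDescentLearner implements Learner<LinearModel> {
    private final double learningRate;
    private final int iterations;

    GradientDescentLearner(double learningRate, int iterations) {
        this.learningRate = learningRate;
        this.iterations = iterations;
    }

    @Override
    public LinearModel learn(double[][] x, double[] y) {
        int n = x.length, d = x[0].length;
        double[] w = new double[d];
        for (int it = 0; it < iterations; it++) {
            double[] grad = new double[d];
            for (int i = 0; i < n; i++) {
                double err = dot(w, x[i]) - y[i];
                for (int j = 0; j < d; j++) grad[j] += err * x[i][j] / n;
            }
            for (int j = 0; j < d; j++) w[j] -= learningRate * grad[j];
        }
        return new LinearModel(w);
    }

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }
}

final class LinearPredictor implements Predictor<LinearModel> {
    @Override
    public double predict(LinearModel model, double[] features) {
        return GradientDescentLearner.dot(model.weights, features);
    }
}

final class TwoPhaseDemo {
    /** Phase 1 (would be the "learning" job): train and persist the model. */
    static void train(Path modelPath) throws IOException {
        double[][] x = {{1.0}, {2.0}, {3.0}};   // toy training data, y = 2x
        double[] y = {2.0, 4.0, 6.0};
        new GradientDescentLearner(0.05, 1000).learn(x, y).save(modelPath);
    }

    /** Phase 2 (would be the "predicting" job): load the model and apply it. */
    static double predict(Path modelPath, double[] input) throws IOException {
        return new LinearPredictor().predict(LinearModel.load(modelPath), input);
    }

    public static void main(String[] args) throws IOException {
        Path modelPath = Files.createTempFile("model", ".txt");
        train(modelPath);
        System.out.println(predict(modelPath, new double[] {4.0}));
    }
}
```

The point of the split is that the two phases share nothing but the stored model, so a generic runner could drive any algorithm that implements the two interfaces. How the training data would be partitioned across BSP peers, and how partial results would be combined, is left open here.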
