Yes, that is great. I will help you with that.

2012/10/11 Panos Mandros <[email protected]>

> Hey Thomas,
>     implementing PLANET was part of my bachelor thesis. It works not only
> for single-label learning but also for multi-label learning, as this is one
> of the areas my professor is interested in. It works fine, but some things
> still need to be done. One of these is to port it to Hama. Another is to
> find a more efficient way to transfer data from the mappers to the reducer,
> because right now the output is really big. If you want, we can cooperate
> on this.
>
> 2012/10/10 Thomas Jungblut <[email protected]>
>
> > Hey Panos,
> >
> > thanks for transferring this.
> >
> > Here is the paper for the others:
> >
> > http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/de//pubs/archive/36296.pdf
> >
> > I wanted to do this, but did not have enough time :/
> > As I said on Stack Overflow, I think the graph package is the wrong
> > approach here. You can clearly translate the MapReduce algorithm to BSP
> > and make use of the faster iterations.
> >
> > Do you already have the code in MapReduce? I can simply turn this into
> > BSP. I would like to support the creation of random forests as well, by
> > training a decision tree in every task and combining them later.
> >
> >
> > 2012/10/10 Panos Mandros <[email protected]>
> >
> > > I have currently implemented Google's framework for building decision
> > > trees (also known as PLANET) in Hadoop. It is supposed to scale well to
> > > very large datasets, but it has many problems. It only scales well if
> > > the dataset has few attributes. A dataset with many attributes needs
> > > many map/reduce jobs, which means a big start-up cost for all of these
> > > jobs. Google, however, runs it with many modifications to its
> > > Hadoop-like platform rather than to the algorithm itself. PLANET starts
> > > with a single vertex, and map/reduce jobs add more and more vertices
> > > until the tree is fully built.
> > >
> > > I have often read that Apache Hama is suitable for iterative
> > > algorithms such as graph algorithms. Can you build a new graph with
> > > Hama, or can you only take a graph as input and run computations on
> > > it? Would it be easy to port my project to Hama? Thanks
> > >
> >
>

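For reference, here is a minimal sketch of the random forest idea from the thread: one decision tree trained per task, with the trees combined later. It is plain Python, not Hama BSP or Hadoop code. The one-level tree (stump) learner, the function names (train_stump, train_forest, predict_forest) and majority voting as the combining step are all assumptions made for illustration; the thread does not say how the trees would be combined.

```python
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows):
    """Fit a one-level decision tree on (features, label) rows.

    Tries every feature/threshold pair and keeps the split with the
    fewest misclassified rows. Returns (feature, threshold, left, right).
    """
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if not right:
                continue  # threshold at the maximum value splits nothing
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split: always predict the overall majority label.
        lab = majority([y for _, y in rows])
        return (0, float("inf"), lab, lab)
    return best[1:]


def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab


def train_forest(partitions):
    """Train one stump per partition; each partition stands in for the
    data seen by one task."""
    return [train_stump(part) for part in partitions]


def predict_forest(forest, x):
    """Combine the per-task trees by majority vote (an assumption)."""
    return majority([predict_stump(s, x) for s in forest])
```

Usage would be along the lines of forest = train_forest([part_a, part_b, part_c]) followed by predict_forest(forest, x), where each part is a list of (feature_tuple, label) pairs. In a BSP setting the per-partition training would run inside each task, and only the small trained trees would need to be exchanged.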