Re: [jira] Commented: (MAHOUT-537) Bring DistributedRowMatrix into compliance with Hadoop 0.20.2

Shannon Quinn Tue, 09 Nov 2010 16:50:31 -0800

Ok, silly question: how do I go about plugging in a different version of
Hadoop? I moved the 0.21 version (the tar.gz from the Hadoop site) into the
same path as HADOOP_HOME and wiped out the .m2/ repository, so on the next
Mahout build all the dependencies were downloaded again. I'm still getting
the 0.20.2 packages.


Grant has good points, just want to see if I can get this running...

On Sat, Nov 6, 2010 at 12:12 PM, Ted Dunning <[email protected]> wrote:

> Remember Flume != FlumeJava.
>
> Flume is Cloudera's semi-proprietary ETL system.
>
> FlumeJava is a high level API for creating map-reduce programs in Java.
> The level of abstraction is similar to Pig.
>
> Plume is an open source project I started to clone FlumeJava by filling in
> the details omitted from the Google paper.  As an example of how high level
> Plume is, word count in raw map-reduce is >200 lines of code.  In Plume, it
> is about 20 and you can't tell which version of Hadoop, if any, your code is
> running on.
>
> On Fri, Nov 5, 2010 at 7:09 AM, Grant Ingersoll <[email protected]> wrote:
>
> > The Plume/Flume stuff seems promising for helping with that as well as
> > giving some other benefits, but that relies on us having an open source
> > version of Flume (which Ted and others have started).  I don't know that
> > it is all that practical in the short term and I'm not proposing any
> > rewrites at this point, but we should consider it, as working at that
> > layer might allow us to plug in different backends that perform better
> > given certain setups (local, small cluster, large cluster).  Such a bit of
> > insulation might allow us to plug in other capabilities as well.  One of
> > the things Hadoop has spawned is a whole lot more interest in these kinds
> > of capabilities and I fully expect to see new/related paradigms coming
> > out.  Obviously, we aren't just going to jump on anything, but we can
> > think about ways we might be able to plug them in.  Thoughts?
>
