Ok, silly question... how do I go about plugging in a different version of Hadoop? I put the 0.21 release (the tar.gz from the Hadoop site) at the path HADOOP_HOME points to and wiped out the .m2/ repository, so the next Mahout build fetched all the dependencies again. I'm still getting the 0.20.2 packages.
Grant has good points, just want to see if I can get this running...

On Sat, Nov 6, 2010 at 12:12 PM, Ted Dunning <[email protected]> wrote:

> Remember Flume != FlumeJava.
>
> Flume is Cloudera's semi-proprietary ETL system.
>
> FlumeJava is a high level API for creating map-reduce programs in Java.
> The level of abstraction is similar to Pig.
>
> Plume is an open source project I started to clone FlumeJava by filling in
> the details omitted from the Google paper. As an example of how high level
> Plume is, word count in raw map-reduce is >200 lines of code. In Plume, it
> is about 20 and you can't tell which version of Hadoop, if any, your code
> is running on.
>
> On Fri, Nov 5, 2010 at 7:09 AM, Grant Ingersoll <[email protected]> wrote:
>
> > The Plume/Flume stuff seems promising for helping with that as well as
> > giving some other benefits, but that relies on us having an open source
> > version of Flume (which Ted and others have started). I don't know that
> > it is all that practical in short term and I'm not proposing any rewrites
> > at this point, but we should consider it as working at that layer might
> > allow the ability to plugin different backends that are better performing
> > given certain setups (local, small cluster, large cluster). Such a bit of
> > insulation might allow us to plug in other capabilities as well. One of
> > the things Hadoop has spawned is a whole lot more interest in these kind
> > of capabilities and I fully expect to see new/related paradigms coming
> > out. Obviously, we aren't just going to jump on anything, but if we can
> > think about ways we might be able to plug them in. Thoughts?
