Thanks for the suggestion on the shell, I will look into that. But I still don't understand why streaming won't work very well: it can run M/R jobs using the supplied executable, right? So do the map/reduce programs take their input/output from their own local filesystem, or from HDFS?
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jean-Daniel Cryans
Sent: Thursday, September 18, 2008 6:30 PM
To: [email protected]
Subject: Re: Running map/reduce written in Ruby on Hbase

Hui Ding,

This wouldn't work very well. Streaming is defined so that you pass programs (in any language) that take their input and output from the filesystem, not from HBase tables. You should instead try to use JRuby, like we do for the shell. It requires some more setup, but since it all runs inside the JVM it works in the end. I see that more and more users are interested in using JRuby/Jython for MR jobs, and I know that some companies already use a wrapper for that ("Happy", anyone?). I'm sure many would be interested in seeing this kind of work.

J-D

On Thu, Sep 18, 2008 at 7:57 PM, Ding, Hui <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I wanted to run some map/reduce jobs, but I'd like to do that in Ruby; is
> this possible with Hadoop Streaming?
> My understanding is that I will provide a mapper/reducer in Ruby and
> supply that to Hadoop Streaming, and since HBase can be a source/sink
> for map/reduce, I should be able to access the tables, right?
>
> And as far as setup is concerned, I just need to have a Ruby interpreter
> set up on each of the machines in the cluster?
>
> Thanks a lot!
>
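To make the streaming contract J-D describes concrete: a streaming mapper is just any executable that reads records line by line from stdin and writes tab-separated key/value pairs to stdout, which is why it speaks to files rather than to HBase tables. The word-count mapper below is an illustrative sketch, not code from this thread; the method name `map_line` is my own.

```ruby
#!/usr/bin/env ruby
# Minimal Hadoop Streaming mapper sketch (word count).
# Hadoop feeds input splits to this script's stdin and collects
# "key<TAB>value" lines from stdout; the shuffle then groups keys
# for a companion reducer that would sum the counts per word.

# Turn one input line into zero or more "word<TAB>1" pairs.
def map_line(line)
  line.split.map { |word| "#{word.downcase}\t1" }
end

# When run as a script (e.g. via `hadoop ... -mapper mapper.rb`),
# stream every stdin line through the mapper.
if __FILE__ == $PROGRAM_NAME
  STDIN.each_line do |line|
    map_line(line).each { |pair| puts pair }
  end
end
```

Note that nothing in this contract knows about HBase: the script only sees bytes on stdin and stdout, which is why J-D points at JRuby (in-JVM, with access to the HBase client classes) for table-backed jobs instead.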
