Hey Pedro,

Your shell pipeline runs on a single machine: it will not run across
all nodes, nor will it read data-local blocks. Hadoop streaming lets
you achieve both. I agree the name can be confusing. The 'streaming'
part comes from the way it 'streams' (pipes) data into and out of a
newly launched process (your script, written in any language you
prefer). Hadoop takes care of launching and terminating that process
on your TaskTrackers, giving you MR in your own language instead of
through the Java API.
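
To make the piping contract concrete, here is a minimal sketch in
Python. It assumes a word-count job, which is my own illustration and
not something taken from your scripts; the function names (mapper,
reducer, run_local) are hypothetical. It imitates in a single process
what streaming does per task: the mapper emits tab-separated key/value
lines, the framework sorts them by key, and the reducer receives the
sorted lines.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, like a streaming mapper's stdout."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def record_key(record):
    """Return the key part (text before the first tab) of a record."""
    return record.split("\t", 1)[0]


def reducer(sorted_records):
    """Sum the counts for each key; the input must already be sorted by key."""
    for key, group in groupby(sorted_records, key=record_key):
        total = sum(int(rec.split("\t", 1)[1]) for rec in group)
        yield f"{key}\t{total}"


def run_local(lines):
    """Single-process stand-in for the map -> sort -> reduce flow (assumption:
    on a cluster the sort and the distribution are done by Hadoop, not by us)."""
    return list(reducer(sorted(mapper(lines), key=record_key)))


if __name__ == "__main__":
    for out in run_local(["a b a", "b c"]):
        print(out)
```

On a cluster the mapper and reducer would be two separate scripts that
read stdin and write stdout, handed to the streaming jar through its
-mapper and -reducer options together with -input and -output. The
location of that jar depends on your installation.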

On Sat, Jun 16, 2012 at 2:53 PM, Pedro Costa <psdc1...@gmail.com> wrote:
> I still don't get why hadoop streaming is useful. If I have map and reduce
> functions defined in shell script, like the one below, why should I use
> Hadoop?
>
> cat someInputFile | shellMapper.sh | shellReducer.sh > someOutputFile
>
>
>
> On 16/06/2012, at 01:21, Ruslan Al-Fakikh <metarus...@gmail.com> wrote:
>
> Hi Pedro,
>
> You can find it here
> http://wiki.apache.org/hadoop/HadoopStreaming
>
> Thanks
>
> On Sat, Jun 16, 2012 at 2:46 AM, Pedro Costa <psdc1...@gmail.com> wrote:
>
> Hi,
>
>
> Hadoop mapreduce can be used for streaming. But what is streaming from the
> point of view of mapreduce? To me, streaming means video and audio data.
>
>
> Why does mapreduce support streaming?
>
>
> Can anyone give me an example of why to use streaming in mapreduce?
>
>
> Thanks,
>
> Pedro



-- 
Harsh J
