Thanks all. Charles, you pointed me to the Baidu slides titled
"Introduction to Hadoop C++ Extension"
<http://hic2010.hadooper.cn/dct/attach/Y2xiOmNsYjpwZGY6ODI5>,
which describe their experience; the sixth slide shows exactly what I was
looking for. With pipes it is still hard to manage memory, on top of there
being no performance gain, hence the development of HCE.
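
For anyone finding this thread later, here is a minimal sketch of the
streaming model Bobby describes in the quoted message below: a parent process
starts a separate child, sends records to its stdin, and reads
tab-separated key/value lines back from its stdout. This is not Hadoop's
code. The word-count mapper and the helper name run_external_mapper are made
up for illustration, and a plain Python subprocess stands in for the task JVM.

```python
import subprocess
import sys

# Hypothetical word-count mapper, written the way a streaming mapper would be:
# read lines from stdin, write "key<TAB>value" lines to stdout.
MAPPER_SOURCE = r'''
import sys
for line in sys.stdin:
    for word in line.split():
        sys.stdout.write(word + "\t1\n")
'''


def run_external_mapper(lines):
    """Send input lines to a child process and parse its key/value output.

    Assumption: this stands in for the framework side of streaming. Every
    record crosses the process boundary twice (out via stdin, back via stdout).
    """
    child = subprocess.run(
        [sys.executable, "-c", MAPPER_SOURCE],
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
        check=True,
    )
    pairs = []
    for out_line in child.stdout.splitlines():
        key, _, value = out_line.partition("\t")
        pairs.append((key, int(value)))
    return pairs


if __name__ == "__main__":
    for key, value in run_external_mapper(["hello hadoop", "hello pipes"]):
        print(key, value)
```

The round trip through the child's stdin and stdout is the extra copy that the
messages below blame for the overhead; no translation to bytecode is involved.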

Thanks,
Mark
On Thu, Apr 5, 2012 at 2:23 PM, Charles Earl <charles.ce...@gmail.com> wrote:

> Also bear in mind that there is a kind of detour involved, in the sense
> that a pipes map must send key,value data back to the Java process and then
> to reduce (more or less).
> I think that the Hadoop C++ Extension (HCE, there is a patch) is supposed to
> be faster.
> Would be interested to know if the community has any experience with HCE
> performance.
> C
>
> On Apr 5, 2012, at 3:49 PM, Robert Evans <ev...@yahoo-inc.com> wrote:
>
> > Both streaming and pipes do very similar things.  They will fork/exec a
> > separate process that is running whatever you want it to run.  The JVM
> > that is running Hadoop then communicates with this process to send the
> > data over and get the processing results back.  The difference between
> > streaming and pipes is that streaming uses stdin/stdout for this
> > communication so preexisting processing like grep, sed and awk can be
> > used here.  Pipes uses a custom protocol with a C++ library to
> > communicate.  The C++ library is tagged with SWIG-compatible data so that
> > it can be wrapped to have APIs in other languages like Python or Perl.
> >
> > I am not sure what the performance difference is between the two, but in
> > my own work I have seen a significant performance penalty from using
> > either of them, because there is a somewhat large overhead of sending all
> > of the data out to a separate process just to read it back in again.
> >
> > --Bobby Evans
> >
> >
> > On 4/5/12 1:54 PM, "Mark question" <markq2...@gmail.com> wrote:
> >
> > Hi guys,
> >  quick question:
> >   Are there any performance gains from Hadoop streaming or pipes over
> > Java? From what I've read, it's only to ease testing by using your
> > favorite language. So I guess it is eventually translated to bytecode
> > then executed.
> > Is that true?
> >
> > Thank you,
> > Mark
> >
>
