I would say the biggest difference between a C-based Hadoop and the Java-based Hadoop would be memory usage on the Namenode (and the CPU benefits that come with cheaper memory allocation). The rest of the nodes in the cluster would perform about the same. C is better suited to low-level memory optimizations (both in overall size and in number of allocations). My guesstimate is that the Namenode could use 30-40% less memory.

Note that memory is an issue mainly for very large clusters.
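
To put rough numbers on the allocation point, here is a back-of-envelope sketch in Java. It is not a measurement of any real Namenode: the object count, the 16-byte header, and the 32-byte payload are all assumed values picked for illustration, and the class and method names are made up. It only shows how per-object bookkeeping adds up when a process holds a very large number of small objects, which is the kind of overhead a C version could avoid by packing records into larger blocks.

```java
// Back-of-envelope sketch. Every number below is an assumption, not a measurement.
public class HeaderOverheadSketch {

    // Fraction of memory spent on per-object bookkeeping when each object
    // carries headerBytes of overhead on top of payloadBytes of useful data.
    static double headerFraction(long headerBytes, long payloadBytes) {
        return (double) headerBytes / (double) (headerBytes + payloadBytes);
    }

    // Total bytes needed for objectCount objects of bytesPerObject each.
    static long totalBytes(long objectCount, long bytesPerObject) {
        return objectCount * bytesPerObject;
    }

    public static void main(String[] args) {
        long objects = 10_000_000L; // hypothetical count of small metadata objects
        long header = 16;           // assumed per-object header size in bytes
        long payload = 32;          // hypothetical useful bytes per object

        long withHeaders = totalBytes(objects, header + payload);
        long packed = totalBytes(objects, payload);

        System.out.printf("with headers: %d MB%n", withHeaders / (1024 * 1024));
        System.out.printf("packed:       %d MB%n", packed / (1024 * 1024));
        System.out.printf("overhead:     %.0f%%%n", 100.0 * headerFraction(header, payload));
    }
}
```

With a fixed number of objects the overhead share does not change as the count grows, but the absolute number of bytes does, which is why this matters mainly on very large clusters.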

Raghu.

Steve Schlosser wrote:
Please excuse a possibly heretical question...

My colleagues and I have been working with Hadoop lately, and I keep
getting asked the same question: what is the performance impact of
having the system written in Java?  Some folks, even today, are
suspicious of the overhead of Java, especially for systems
programming.  I realize that an apples-to-apples comparison of
Java-based Hadoop to, say, C-based Hadoop is entirely out of the
question, but I was wondering if anyone has a reasonable qualitative
answer that I can pass on when people ask.

Thanks!

-steve
