On 06.09.2007, at 09:56, Pietu Pohjalainen wrote:

Jeroen Verhagen wrote:
On 9/5/07, Steve Schlosser <[EMAIL PROTECTED]> wrote:

question, but I was wondering if anyone has a reasonable qualitative
answer that I can pass on when people ask.

Is this question really relevant since Hadoop is designed to run on a
cluster of commodity hardware Google-style? If there were any
difference I'm sure it would be solved by adding 1 machine to the
cluster.



Isn't it about whether to add 30% or 50% more machines? That starts to get significant when you're deciding whether to run 1000 or 1500 machines.

A plain Java vs <some language> discussion is way too simple. I've been working on a Java project that way (!!) out-performed a similar C++ project. The design and a smart implementation will make more difference than the language alone. Long-running vs short-running processes... all that has already been said. At least that's my experience. That being said, for Hadoop the one-child-JVM-per-job model is what carries quite a bit of overhead. If you are not scared that your jobs will tear down your tasktrackers - we have an in-JVM execution patch. (not submitted yet though)
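
For context on that per-task JVM overhead: later Hadoop releases exposed a JVM-reuse knob, `mapred.job.reuse.jvm.num.tasks`, which lets a tasktracker run multiple tasks of the same job in one child JVM instead of forking a fresh one each time. A minimal sketch (assuming a Hadoop version that supports this property; the in-JVM patch mentioned above is separate and unsubmitted):

```xml
<!-- mapred-site.xml: reuse each child JVM for an unlimited number (-1)
     of tasks from the same job, instead of one fresh JVM per task.
     Availability depends on your Hadoop version. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>
```

This amortizes JVM startup (and JIT warm-up) across tasks, which matters most for jobs with many short-running tasks.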

cheers
--
Torsten
