On 11/10/08 1:30 AM, "Aaron Kimball" <[EMAIL PROTECTED]> wrote:
> It sounds like you think the 64- and 32-bit environments are effectively
> interchangeable. May I ask why you are using both? The 64-bit environment
> gives you access to more memory; do you see faster performance for the TTs
> in 32-bit mode? Do you get bit by library compatibility bugs that others
> should watch out for in running a dual-mode Hadoop environment?

Some random thoughts on our mixed environment:

A) The vast majority of user-provided (legacy) code is 32-bit.  Since you
can't mix 64- and 32-bit objects at link time or at runtime (see the sketch
after this list), it just makes sense for us to run TTs, etc., as 32-bit by
default to get the most bang for our buck.

B) In the case of the data node, the memory usage is small enough that the
64-bit JVM isn't needed.

C) Since we currently run HOD, it should be possible for users to switch
their bit-ness, and I think we have a handful of users who do.  We'll
probably lose this capability when we go back to a static job tracker. :(

D) For streaming jobs, the bit-ness of the JVM is irrelevant to the user's
code, so 32-bit wins on its smaller footprint, since streaming jobs eat
memory like it was candy. :)

E) We install both the 64-bit and the 32-bit versions of libraries on our
nodes, which lets us switch our bit-ness whenever we like.  This makes for a
fat image (so no RAM disk for the OS for us!), but given the streaming VM
issues above, it works out mostly in our favor anyway.
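
To make point A concrete, here's a minimal sketch of the failure mode you
hit when the bit-ness doesn't match.  "mylegacylib" is a made-up name
standing in for any user-provided native library; on Linux a mismatch
typically surfaces as an UnsatisfiedLinkError complaining about the wrong
ELF class.

    // Hypothetical probe: try to load a user-provided native library.
    // A 64-bit JVM cannot load a 32-bit .so (and vice versa), which is
    // why we run the TTs with the same bit-ness as the legacy code.
    public class NativeLoadProbe {
        public static void main(String[] args) {
            try {
                System.loadLibrary("mylegacylib");  // made-up library name
                System.out.println("Loaded OK: JVM and library match.");
            } catch (UnsatisfiedLinkError e) {
                // On Linux a mismatch reads like: "wrong ELF class: ELFCLASS32"
                System.err.println("Load failed, likely a bit-ness mismatch: "
                    + e.getMessage());
            }
        }
    }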

    In general, 64-bit Java code runs slower than 32-bit code: pointers
double in size, so the same object graph takes a bigger heap and puts more
pressure on the caches.  So unless one needs to access more memory or has
external dependencies (JNI, whatever), 32-bit is the way to go for your
Java environment.  The name node and maybe a static job tracker are the
potential problem children here, and those are the places where I suspect
most people will end up on the 64-bit JVM.
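
If you want to see the gap on your own boxes, here's a rough sketch: run
the same program under your 32-bit and your 64-bit java binaries and
compare.  The numbers are only approximate (GC timing, alignment), and
"sun.arch.data.model" is Sun/HotSpot-specific, but the per-object
difference should be obvious.

    // Rough footprint probe: run under a 32-bit and a 64-bit JVM and
    // compare the bytes-per-object figure printed at the end.
    public class FootprintProbe {
        public static void main(String[] args) {
            // Sun-specific property: "32" or "64" on HotSpot JVMs.
            System.out.println("Data model: "
                + System.getProperty("sun.arch.data.model", "unknown"));
            Runtime rt = Runtime.getRuntime();
            System.gc();
            long before = rt.totalMemory() - rt.freeMemory();
            Object[] objs = new Object[1000000];
            for (int i = 0; i < objs.length; i++) {
                objs[i] = new Object();
            }
            System.gc();
            long after = rt.totalMemory() - rt.freeMemory();
            System.out.println("Approx bytes per object: "
                + (after - before) / (double) objs.length);
            // Keep the array reachable so the GC can't collect it early.
            if (objs[0] == null) { System.out.println("unreachable"); }
        }
    }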
