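Also, Java serialization often keeps references to previously read objects in memory: ObjectInputStream tracks every object it has read so that it can resolve back-references. It is better to use Thrift or Avro to serialize the object.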

In my experience serialization is inefficient for large object graphs, but works fine for smaller graphs (depending on how much memory you have to work with).
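
To illustrate the point about object graphs, here is a minimal sketch (not from the original thread) of storing a map as flat key/value records with DataOutputStream rather than as a Java-serialized graph. Reading it back with DataInputStream does not build the per-stream table of previously read objects that ObjectInputStream maintains. The class name FlatMapCodec and the record layout are assumptions made up for this example, and it only covers String keys and values.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not from the thread: stores a Map<String, String> as
// a flat sequence of records instead of a Java-serialized object graph.
final class FlatMapCodec {

    // Layout: an int entry count, then one (key, value) pair per entry.
    // Note that writeUTF limits each string to 65535 encoded bytes.
    static void write(Map<String, String> map, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(map.size());
        for (Map.Entry<String, String> e : map.entrySet()) {
            data.writeUTF(e.getKey());
            data.writeUTF(e.getValue());
        }
        data.flush();
    }

    // Reads the layout produced by write() back into a HashMap.
    static Map<String, String> read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int n = data.readInt();
        // Presize the table so it does not have to rehash while loading.
        Map<String, String> map = new HashMap<>((int) (n / 0.75f) + 1);
        for (int i = 0; i < n; i++) {
            String key = data.readUTF();
            map.put(key, data.readUTF());
        }
        return map;
    }
}
```

For a file of a few hundred megabytes, the streams passed in would normally be wrapped in BufferedOutputStream and BufferedInputStream. Thrift or Avro would additionally give a schema and cross-language support, which this sketch does not attempt.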

Also, for data that small, Memcache and MongoDB may be overkill (unless the data changes frequently).

Cheers,
Matt

On Oct 13, 2010, at 11:04 AM, "Shi Yu" <[email protected]> wrote:

As a follow-up to my own question, I think launching the JVM in Hadoop requires much more memory than an ordinary JVM. I found that instead of serializing the object, I could perhaps create a MapFile as an index to permit lookups by key in Hadoop. I have also compared the performance of MongoDB and Memcache. I will let you know the results after I try the MapFile approach.

Shi

On 2010-10-12 21:59, M. C. Srivas wrote:

On Tue, Oct 12, 2010 at 4:50 AM, Shi Yu<[email protected]>  wrote:


Hi,

I want to load a serialized HashMap object in Hadoop. The file holding the stored object is 200M. I can read that object efficiently in plain Java by setting -Xmx to 1000M. However, in Hadoop I could never load it into memory. The code is very simple (it just reads from an ObjectInputStream), and no map/reduce is implemented yet. I set mapred.child.java.opts=-Xmx3000M and still get "java.lang.OutOfMemoryError: Java heap space". Could anyone explain a little how memory is allocated to the JVM in Hadoop? Why does Hadoop take up so much memory? If a program requires 1G of memory on a single node, how much memory does it (generally) require in Hadoop?


The JVM reserves swap space in advance, at the time of launching the
process. If your swap is too low (or you do not have any swap configured),
you will hit this.

Or you are on a 32-bit machine, in which case a 3G heap is not possible in
the JVM.

-Srivas.
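
One way to narrow down which of these applies is to log, from inside the task, the heap limit the child JVM actually received. The following is a minimal sketch, not from the original thread; the class name HeapReport is hypothetical, and it uses only the standard Runtime and System APIs.

```java
// Hypothetical diagnostic, not from the thread: reports the heap limits of
// the JVM it runs in, so the effect of an -Xmx setting can be checked.
final class HeapReport {

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return "os.arch=" + System.getProperty("os.arch")
            // Upper bound the heap may grow to (roughly the -Xmx value).
            + " maxHeapMB=" + rt.maxMemory() / mb
            // Heap currently reserved by the JVM.
            + " totalHeapMB=" + rt.totalMemory() / mb
            // Unused part of the currently reserved heap.
            + " freeHeapMB=" + rt.freeMemory() / mb;
    }
}
```

Printing HeapReport.describe() to stderr at the start of a map task should show whether the -Xmx value from mapred.child.java.opts reached the child JVM. If maxHeapMB is far below the requested value, the setting is probably not being applied to the task.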




Thanks.

Shi


