Thanks,
Dhruba
-----Original Message-----
From: Nguyen Kien Trung [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 16, 2007 8:13 PM
To: [email protected]
Subject: Re: How much RAMs needed...
Thanks Peter and Ted, your explanations do make some sense to me.
The out of memory error is as follows:
java.lang.OutOfMemoryError
        at sun.misc.Unsafe.allocateMemory(Native Method)
        at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:99)
        at java.nio.ByteBuffer.allocateDirect(B
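For context, ByteBuffer.allocateDirect reserves native memory outside the normal
Java heap, so raising -Xmx alone may not cure this particular error; HotSpot caps
direct buffers separately via -XX:MaxDirectMemorySize. Below is a minimal sketch of
the kind of allocation in that trace (the DirectBufferDemo class is made up for
illustration, it is not from the cluster):

import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // allocateDirect ends up in sun.misc.Unsafe.allocateMemory, the same call
        // that appears in the trace above, so an OutOfMemoryError here is about
        // direct-buffer (native) space rather than ordinary heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024); // 64 MB
        System.out.println("Allocated " + buf.capacity() + " bytes of direct memory");
    }
}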
Peter is pointing out that he was able to process the equivalent of many
small files using very modest hardware (smaller than your hardware).
This is confirmation that you need to combine your inputs into larger
chunks.
On 7/15/07 7:07 PM, "Nguyen Kien Trung" <[EMAIL PROTECTED]> wrote:
> Hi Pe
HDFS can't really do the combination into larger files, but if you can do
that, it will help quite a bit.
You might need a custom InputFormat or split to make it all sing, but you
should be much better off with fewer large input files.
One of the biggest advantages will be that your disk reading
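One way to do that combination is to pack the tiny files into a single SequenceFile,
keyed by the original file name, so the namenode tracks one big file instead of
millions of small ones. A rough sketch, not from this thread (the TinyFilePacker
class and its argument layout are invented for illustration):

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class TinyFilePacker {
    // Hypothetical usage: TinyFilePacker <hdfs output path> <local tiny files...>
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path(args[0]); // destination SequenceFile in HDFS
        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, out, Text.class, BytesWritable.class);
        for (int i = 1; i < args.length; i++) {
            File f = new File(args[i]); // tiny file on the local disk
            byte[] data = new byte[(int) f.length()];
            DataInputStream in = new DataInputStream(new FileInputStream(f));
            in.readFully(data);
            in.close();
            // key = original file name, value = raw contents
            writer.append(new Text(f.getName()), new BytesWritable(data));
        }
        writer.close();
    }
}

A job can then read the packed file with SequenceFileInputFormat, and the splits
follow the 64M blocks rather than the original per-file boundaries.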
Trung,
Someone more knowledgeable will need to help.
It's my very simple understanding that Hadoop DFS
creates multiple blocks for every file being processed.
The JVM heap being exceeded could possibly be a file
handle issue instead of being due to overall block count.
In other words, your nam
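As a rough rule of thumb (my own figure, not something measured on this cluster),
each file, directory, and block object is said to cost on the order of 150 bytes of
namenode heap. With around 2M tiny files, each occupying at least one block, that is
roughly 4M objects, or about 600MB of metadata the namenode has to hold in RAM
before it can even finish starting up, which matches the symptom Trung describes.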
Hi Peter,
I appreciate the info, but I'm afraid I'm not getting what you mean.
The issue I've encountered is that I'm not able to start up the namenode due
to an out-of-memory error, given that there is a huge number of tiny files
in the datanodes.
Cheers,
Trung
Peter W. wrote:
Trung,
Using one machine (with 2GB RAM) and 300 input files
I was able to successfully run:
INFO mapred.JobClient:
Map input records=10785802
Map output records=10785802
Map input bytes=1302175673
Map output bytes=1101864522
Reduce input groups=1244034
Reduce input records=10785802
Reduce outpu
Thanks Ted,
Unfortunately, those really are tiny files. Would it be good practice
for HDFS to combine those tiny files into a single block that fits the
standard 64M size?
Ted Dunning wrote:
Are these really tiny files, or are you really storing 2M x 100MB = 200TB of
data? Or do you have more like 2M x 10KB = 20GB of data?
Map-reduce and HDFS will generally work much better if you can arrange to
have relatively larger files.
On 7/15/07 8:04 AM, "erolagnab" <[EMAIL PROTECTED]> wrote:
http://www.nabble.com/How-much-RAMs-needed...-tf4082367.html#a11603027
Sent from the Hadoop Users mailing list archive at Nabble.com.