On 17-Dec-08, at 10:44 AM, Philip wrote:
I've been trying to troubleshoot an OOME we've been having. When we run the job over a dataset that is about 700GB (~9000 files) or larger, we get an OOME in the map tasks. However, if we run the job over a smaller set of the data, everything works fine. So my question is: what changes in Hadoop as the size of the input set increases?
We are on Hadoop 0.18.0.
I don't have a real answer, but the first thing you should do is try using 0.18.1 or 0.17.x. I had some 0.18.0 memory/filehandle problems go away when I switched; 0.18.0 was not a stable release, in any case.
Karl Anderson
[email protected]
http://monkey.org/~kra
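
Aside from the version bump, a common first check for a map-task OOME in this era of Hadoop is the heap given to each child JVM, which defaults to a fairly small -Xmx200m via mapred.child.java.opts. A minimal sketch of raising it cluster-wide in hadoop-site.xml; the 512 MB figure is only illustrative and assumes the task nodes have memory to spare:

    <!-- hadoop-site.xml: heap handed to each map/reduce child JVM -->
    <!-- default in Hadoop 0.18 is -Xmx200m; 512m is an illustrative value -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx512m</value>
    </property>

The same property can also be set per job from the driver with conf.set("mapred.child.java.opts", "-Xmx512m") on the JobConf, if only this one job needs the larger heap.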