Hadoop does provide a ulimit-based way to control the memory consumption of the tasks it spawns, via the config mapred.child.ulimit. See http://hadoop.apache.org/core/docs/r0.17.0/mapred_tutorial.html#Task+Execution+%26+Environment

What is lacking, however, is a way to track the cumulative memory consumption of all the processes spawned by a map/reduce task. For example, a streaming task could spawn hundreds of processes, and collectively they can wreak havoc.
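For reference, here is roughly how that limit would be set on a streaming job (a sketch only; the streaming jar path, input/output paths, and reducer script are hypothetical, and the -jobconf option is the 0.17-era way to pass job configs):

  # mapred.child.ulimit is in kilobytes and is applied (via ulimit)
  # to each child process the TaskTracker launches.
  # Paths and the reducer script below are placeholders.
  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.17.0-streaming.jar \
      -jobconf mapred.child.ulimit=1048576 \
      -input /user/me/input \
      -output /user/me/output \
      -mapper /bin/cat \
      -reducer ./my_reducer.sh \
      -file ./my_reducer.sh

Note that while the limit is inherited by any subprocesses the mapper/reducer forks, it applies to each process individually, not to their sum, which is exactly the gap described above.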
> -----Original Message-----
> From: Taeho Kang [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 16, 2008 7:23 AM
> To: [EMAIL PROTECTED]
> Subject: Question on HadoopStreaming and Memory Usage
>
> Dear All,
>
> I've got a question about Hadoop Streaming and its memory management.
> Does Hadoop Streaming have a mechanism to prevent over-usage of
> memory by its subprocesses (the map or reduce functions)?
>
> Say a binary used in the reduce phase allocates itself lots and
> lots of memory, to the point that it starves other important
> processes like the DataNode or TaskTracker. Does Hadoop
> Streaming prevent such cases?
>
> Thank you in advance,
>
> Taeho
