Memory limits were handled in this JIRA:
https://issues.apache.org/jira/browse/HADOOP-2765

However, in our environment we spawn all streaming tasks through a
wrapper program (the wrapper is applied by default for all users). We
can control resource usage from this wrapper (and also do some level
of job control).
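
For illustration, here's a stripped-down sketch of the kind of wrapper
we use. Our real wrapper is site-specific, so the names and values
below are made up, and a shell script using nice/ulimit would work
just as well:

    #!/usr/bin/env python
    # Illustrative wrapper: lower the task's priority and cap its
    # address space, then exec the real command that the streaming
    # jar would otherwise have run directly.
    import os
    import resource
    import sys

    NICE_INCREMENT = 10                   # lower scheduling priority
    MEM_LIMIT_BYTES = 1024 * 1024 * 1024  # cap virtual memory at ~1 GB

    def main():
        if len(sys.argv) < 2:
            sys.stderr.write("usage: wrapper.py <command> [args...]\n")
            sys.exit(1)
        os.nice(NICE_INCREMENT)
        resource.setrlimit(resource.RLIMIT_AS,
                           (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))
        # Replace this process with the real mapper/reducer command.
        os.execvp(sys.argv[1], sys.argv[1:])

    if __name__ == "__main__":
        main()

In our setup the wrapper is prepended to the streaming command by
default for every user, which is what the minor code changes below
are about.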
        
This does require some minor code changes, and we would be happy to
contribute them, but the approach doesn't fit well with the one in
HADOOP-2765, which passes each of these resource limits as a Hadoop
config param.
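
For contrast, a rough sketch of the 2765-style approach, where the
limit rides along as a config param on the streaming command line
instead of being enforced by a wrapper. The property name
(mapred.child.ulimit, in KB) is from memory, and the jar path, I/O
paths, and mapper script are placeholders:

    # Rough sketch: launch a streaming job with the memory limit passed
    # as a config param rather than enforced by a wrapper. The property
    # name and all paths here are illustrative and may differ by Hadoop
    # version and installation.
    import subprocess

    subprocess.check_call([
        "hadoop", "jar", "hadoop-streaming.jar",    # placeholder jar path
        "-jobconf", "mapred.child.ulimit=1048576",  # virtual memory cap in KB (~1 GB)
        "-input", "in", "-output", "out",           # placeholder I/O paths
        "-mapper", "parse.rb", "-file", "parse.rb", # hypothetical Ruby mapper
    ])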

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Chris
Anderson
Sent: Wednesday, June 25, 2008 7:00 PM
To: core-user@hadoop.apache.org
Subject: process limits for streaming jar

Hi there,

I'm running some streaming jobs on EC2 (Ruby parsing scripts), and in
my most recent test I managed to spike the load on my large instances
to 25 or so. As a result, I lost communication with one instance. I
think I took down sshd. Whoops.

My question is: does anyone have strategies for managing the resources
used by the processes spawned by the streaming jar? Ideally I'd like
to run my Ruby scripts under nice.

I can hack something together with wrappers, but I'm thinking there
might be a configuration option to handle this within the streaming
jar. Thanks for any suggestions!

-- 
Chris Anderson
http://jchris.mfdz.com
