Hello Hadoopers- I'm attempting to run some large-memory map tasks using Hadoop streaming, but I seem to be running afoul of the mapred.child.ulimit restriction, which is set to 2097152. I assume this is in KB, since my tasks fail when they reach about 2GB (I just need to get to about 2.3GB- almost there!).

So far, nothing I've tried has succeeded in changing this value. I've attempted to add -jobconf mapred.child.ulimit=3000000 to the streaming command line, but to no avail: the job's xml file that I find in my logs still shows the old value. Worse, my task logs contain the message "attempt to override final parameter: mapred.child.ulimit; Ignoring." which doesn't exactly inspire confidence that I'm on the right path.
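For reference, my streaming invocation looks roughly like this (the jar path, input/output paths, and mapper script here are placeholders for my actual ones):

  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.18.3-streaming.jar \
    -input /user/chris/input \
    -output /user/chris/output \
    -mapper my_mapper.py \
    -file my_mapper.py \
    -reducer NONE \
    -jobconf mapred.child.ulimit=3000000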
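From the "final parameter" message, my guess is that the cluster's hadoop-site.xml pins the property with a final tag, something like:

  <property>
    <name>mapred.child.ulimit</name>
    <value>2097152</value>
    <final>true</final>
  </property>

If that's right, presumably I'd need to raise the value on every node and restart the TaskTrackers, or at least drop the <final> tag so -jobconf can override it per job:

  <property>
    <name>mapred.child.ulimit</name>
    <value>3000000</value>
  </property>

But I'd rather not hand-edit the generated config on every instance if there's a cleaner way.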
I see there's been a fair amount of traffic on Jira about large-memory jobs, but there doesn't seem to be much in the way of examples or documentation. Can someone tell me how to run such a job, especially a streaming job?

Many thanks in advance--
Chris

P.S. I'm running a 0.18.3 cluster on Amazon EC2 (I've been using the Cloudera convenience scripts, but I can abandon those if I need more control). The instances have plenty of memory (7.5GB each).
