[ 
https://issues.apache.org/jira/browse/PIG-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011465#comment-13011465
 ] 

Michael Brauwerman commented on PIG-1838:
-----------------------------------------

Thanks, Daniel.

I found the reason why my first attempt (setting HADOOP_OPTS before calling 
hadoop) failed, and in the process found an alternate solution (which I find 
convenient because I haven't yet taken the time to investigate classpath 
management).

The version of hadoop I am running (Amazon EMR's version) resets HADOOP_OPTS in 
"conf/hadoop-env.sh", which clears out any previously set value. That script 
then sources "conf/hadoop-user.env.sh".


So, I added 
{code}
 export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=/mnt/tmp"
{code}

to conf/hadoop-user.env.sh, and now pig scripts use /mnt/tmp as desired.
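
For anyone who wants to see the ordering effect in isolation, here is a rough sketch that can run without a cluster. The two files are stand-ins written to a temp directory; their contents are my assumptions based on the behavior described above, not copies of the real EMR scripts, and /some/other/dir is just a placeholder for a value set before calling hadoop.

```shell
#!/bin/sh
# Stand-in layout in a scratch directory (not a real Hadoop install).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/conf"

# Assumed shape of conf/hadoop-env.sh: reset HADOOP_OPTS, then source the user file.
cat > "$demo_dir/conf/hadoop-env.sh" <<EOF
export HADOOP_OPTS=""
. "$demo_dir/conf/hadoop-user.env.sh"
EOF

# The user file appends to whatever HADOOP_OPTS holds when it is sourced.
cat > "$demo_dir/conf/hadoop-user.env.sh" <<'EOF'
export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=/mnt/tmp"
EOF

# First attempt: set the option in the calling shell (placeholder path).
HADOOP_OPTS="-Djava.io.tmpdir=/some/other/dir"
export HADOOP_OPTS

# Sourcing hadoop-env.sh discards that value; only the user-file setting remains.
. "$demo_dir/conf/hadoop-env.sh"
echo "HADOOP_OPTS after sourcing:$HADOOP_OPTS"

rm -rf "$demo_dir"
```

The value exported by the caller is gone after the reset, while the line in the user file is applied afterwards and survives, which matches what I saw.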



> On a large farm, some pigs die of /tmp starvation
> -------------------------------------------------
>
>                 Key: PIG-1838
>                 URL: https://issues.apache.org/jira/browse/PIG-1838
>             Project: Pig
>          Issue Type: Wish
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Allen Wittenauer
>
> We're starting to see issues where interactive/command line pig users blow up 
> due to so many large jar creations in /tmp. (In other words, pig execution 
> prior to the java.io.tmpdir fix that Hadoop makes can kick in.)  Pig should 
> probably not depend upon users being savvy enough to override java.io.tmpdir 
> on their own in these situations and/or a better steward of the space it does 
> use.  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
