[
https://issues.apache.org/jira/browse/PIG-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011465#comment-13011465
]
Michael Brauwerman commented on PIG-1838:
-----------------------------------------
Thanks, Daniel.
I found out why my first attempt (setting HADOOP_OPTS before calling
hadoop) failed, and in the process found an alternate solution (which I find
convenient because I haven't yet had a chance to investigate classpath
management).
The version of Hadoop I am running (Amazon EMR's version) resets HADOOP_OPTS in
"conf/hadoop-env.sh", which clears out any previously set value. That script
then sources "conf/hadoop-user.env.sh".
So, I added
{code}
export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=/mnt/tmp"
{code}
to conf/hadoop-user.env.sh,
and now Pig scripts use /mnt/tmp as desired.
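To illustrate why the ordering matters, here is a minimal sketch that reproduces the behavior with mock files. The two scripts written below are stand-ins I made up for this demonstration, not the actual contents of the EMR scripts, and MOCK_CONF_DIR and /some/other/dir are hypothetical names used only here.

```shell
#!/bin/sh
# Mock reproduction only: the file contents below are assumptions,
# not the real EMR conf/hadoop-env.sh or conf/hadoop-user.env.sh.
set -eu

MOCK_CONF_DIR=$(mktemp -d)   # hypothetical scratch directory
export MOCK_CONF_DIR

# Stand-in for conf/hadoop-env.sh: resets HADOOP_OPTS, then sources the user file.
cat > "$MOCK_CONF_DIR/hadoop-env.sh" <<'EOF'
export HADOOP_OPTS=""
. "$MOCK_CONF_DIR/hadoop-user.env.sh"
EOF

# Stand-in for conf/hadoop-user.env.sh, containing the line from this comment.
cat > "$MOCK_CONF_DIR/hadoop-user.env.sh" <<'EOF'
export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=/mnt/tmp"
EOF

# A value exported by the caller beforehand is discarded by the reset ...
export HADOOP_OPTS="-Djava.io.tmpdir=/some/other/dir"
. "$MOCK_CONF_DIR/hadoop-env.sh"

# ... while the value appended in the user file is applied after the reset.
echo "HADOOP_OPTS=$HADOOP_OPTS"

rm -rf "$MOCK_CONF_DIR"
```

In this mock, the printed value contains only the /mnt/tmp setting, which matches what I saw: anything set before calling hadoop is lost, while anything set in the user file sticks.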
> On a large farm, some pigs die of /tmp starvation
> -------------------------------------------------
>
> Key: PIG-1838
> URL: https://issues.apache.org/jira/browse/PIG-1838
> Project: Pig
> Issue Type: Wish
> Components: impl
> Affects Versions: 0.8.0
> Reporter: Allen Wittenauer
>
> We're starting to see issues where interactive/command line pig users blow up
> due to so many large jar creations in /tmp. (In other words, pig execution
> prior to the java.io.tmpdir fix that Hadoop makes can kick in.) Pig should
> probably not depend upon users being savvy enough to override java.io.tmpdir
> on their own in these situations, and/or it should be a better steward of the
> space it does use.