Pig 0.8 lets you specify its temp directory with the -Dpig.temp.dir=<dir path> option (PIG-103).
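As a rough sketch of how that could look in practice, the snippet below passes the property through PIG_OPTS, the same mechanism used in the original message. It assumes a Pig 0.8 (or later) launcher is on the PATH; the variable names TMP_DIR and PIG_BIN are made up for illustration, and the input path and script name are simply copied from the question.

```shell
#!/bin/sh
# Sketch only: assumes Pig 0.8 or later, where pig.temp.dir is available.
# TMP_DIR and PIG_BIN are illustrative names, not part of Pig itself.
TMP_DIR=/mnt/hadoop-tmp
PIG_BIN=${PIG_BIN:-pig}

# Hand the property to the JVM via PIG_OPTS, as the original message
# already does for hadoop.tmp.dir.
PIG_OPTS="-Dpig.temp.dir=${TMP_DIR}"
export PIG_OPTS

echo "PIG_OPTS=${PIG_OPTS}"

# Only launch the job if a pig launcher can actually be found.
if command -v "$PIG_BIN" >/dev/null 2>&1; then
    "$PIG_BIN" -param input='batch-20101216-130003/*' scripts/the_script.pig
fi
```

Keeping the setting in an environment variable keeps it visible in the calling script, with no need to edit any XML configuration file.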
On 12/16/10 8:18 AM, "David Vrensk" <[email protected]> wrote:

Hello fellow pig users,

I have told pig to use a separate disk for its temp files by setting

    PIG_OPTS=-Dhadoop.tmp.dir=/mnt/hadoop-tmp

but it still keeps a lot of its files in /tmp:

    /tmp/temp-1035677529$ find . -type f -exec ls -lh '{}' \;
    -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp82247880/.part-00000.crc
    -rwxrwxrwx 1 pig pig  39M 2010-12-16 14:13 ./tmp82247880/part-00000
    -rw-r--r-- 1 pig pig    8 2010-12-16 14:13 ./tmp-1431528563/.part-00000.crc
    -rwxrwxrwx 1 pig pig    0 2010-12-16 14:04 ./tmp-1431528563/part-00000
    -rw-r--r-- 1 pig pig 3.0M 2010-12-16 14:01 ./tmp1746442640/.part-00000.crc
    -rwxrwxrwx 1 pig pig 381M 2010-12-16 14:01 ./tmp1746442640/part-00000
    -rw-r--r-- 1 pig pig 8.8M 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/.part-00000.crc
    -rwxrwxrwx 1 pig pig 1.1G 2010-12-16 16:05 ./tmp-1936719424/_temporary/_attempt_local_0003_r_000000_0/part-00000
    -rw-r--r-- 1 pig pig  38M 2010-12-16 14:13 ./tmp1280814018/.part-00000.crc
    -rwxrwxrwx 1 pig pig 4.8G 2010-12-16 14:13 ./tmp1280814018/part-00000
    -rw-r--r-- 1 pig pig 308K 2010-12-16 14:13 ./tmp1738480876/.part-00000.crc
    -rwxrwxrwx 1 pig pig  39M 2010-12-16 14:13 ./tmp1738480876/part-00000

I don't know what these files are and my google-fu is too weak to find anything. FWIW, the command line I currently use to run pig is

    pig-0.6.0/bin/pig -param input=batch-20101216-130003/* scripts/the_script.pig

I'm looking for a way to make pig put all its files on /mnt/hadoop-tmp. Preferably, it should be a command line argument or an environment variable rather than a tweak to an XML file. Not only would that make my scripts more transparent, but the XML file I've heard about so far (hadoop-site.xml) resides within the hadoop jar, which is pre-built, and I'd rather avoid cracking it open in order to modify its contents.

Preferred solution aside, I'm glad for any help!

Thanks in advance,
David

--
David Vrensk
Systems developer, ICE House AB
Mobile: +46 703 74 69 00
