Bob Freitas created PIG-3338:
--------------------------------
Summary: Temp files not deleted when run in Web App
Key: PIG-3338
URL: https://issues.apache.org/jira/browse/PIG-3338
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.11.1
Environment: Linux CentOS 6.3, with Hadoop 1.0.4 and Jetty 6
Reporter: Bob Freitas
Priority: Critical
We are executing Pig, via PigRunner, in a web app, but the temporary files
created by Pig are not deleted until the web app is shut down. This
causes the /tmp directory to become cluttered very quickly and runs the
risk of filling it up over time.
The work started in FileLocalizer with deleteTempFiles() is an excellent
start, but it does not go far enough. If it were fully implemented, we
could use that method to delete those temp files after each Pig job has
completed.
What needs to happen is that temp files must always be created through
FileLocalizer.getTemporaryPath() instead of through direct calls to
File.createTempFile(), as is done now.
The places this needs to be implemented:
1) FileLocalizer #706 creation of localTempDir
2) JobControlCompiler.getJob() #512 creation of job jar
3) DefaultAbstractBag #388 creation of pigbag for spills
If these temp files were then recorded in the already existing
ThreadLocal<Deque<ElementDescriptor>>, we could use
FileLocalizer.deleteTempFiles() to clean up after each Pig job and would not
need to restart the web app.
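
To illustrate the idea, here is a minimal, self-contained sketch of the
pattern: every temp file is created through one method that records it in a
per-thread deque, and a single cleanup call deletes whatever that thread
created. This is not Pig's actual code. The class TempFileRegistry and its
methods are hypothetical names, it tracks plain java.io.File objects rather
than ElementDescriptor, and it assumes Java 8 or later for
ThreadLocal.withInitial.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not part of Pig): tracks temp files per thread. */
final class TempFileRegistry {

    // One deque per thread, so concurrent jobs on different threads
    // do not delete each other's files.
    private static final ThreadLocal<Deque<File>> CREATED =
            ThreadLocal.withInitial(ArrayDeque::new);

    private TempFileRegistry() {}

    /** Creates a temp file and records it for later cleanup. */
    static File createTempFile(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        CREATED.get().push(f);
        return f;
    }

    /** Deletes every file recorded by the current thread; returns the count deleted. */
    static int deleteTempFiles() {
        Deque<File> files = CREATED.get();
        int deleted = 0;
        while (!files.isEmpty()) {
            if (files.pop().delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    /** Number of files the current thread has recorded but not yet cleaned up. */
    static int pendingCount() {
        return CREATED.get().size();
    }
}
```

In a web app, the cleanup call would typically go in a finally block after
each job. One caveat of a thread-local registry is that it only sees files
created on the calling thread, so anything created on a separate thread
would need its own cleanup or a shared registry.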