Bob Freitas created PIG-3338:
--------------------------------

             Summary: Temp files not deleted when run in Web App
                 Key: PIG-3338
                 URL: https://issues.apache.org/jira/browse/PIG-3338
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.11.1
         Environment: Linux CentOS 6.3, with Hadoop 1.0.4 and Jetty 6
            Reporter: Bob Freitas
            Priority: Critical


We are executing Pig, via PigRunner, in a web app, but the temporary files 
being created by Pig are not being deleted until the web app is shutdown.  This 
causes the /tmp directory to become very cluttered very quickly, and run the 
risk of filling it up over time.  

The work started in FileLocalizer with the deleteTempFiles() is an excellent 
start, but it does not go far enough.  If it was fully implemented then we 
could use that method to delete those temp file after each Pig job has 
completed.  

What needs to happen is that the creation of temp files needs to always be 
passed thru the FileLocalizer.getTemporaryPath() instead of using the 
File.createTempFile() as it is now.

The places this needs to be implemented:

1) FileLocalizer #706 creation of localTempDir

2) JobControlCompiler.getJob() #512 creation of job jar

3) DefaultAbstractBag #388 creation of pigbag for spills

If these temp files are then stored into the already existing 
ThreadLocal<Deque<ElementDescriptor>>() then we could use the 
FileLocalizer.deleteTempFiles() to clean up after each Pig job and not need to 
restart the web app. 




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to