[ https://issues.apache.org/jira/browse/PIG-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810897#comment-13810897 ]
Bob Freitas commented on PIG-3338:
----------------------------------
Found a workaround: use AOP to expose the temp file info, save the locations
in a singleton, and then retrieve them later to delete the files after the
Pig job has finished. If anyone would like more info on this workaround, let
me know; happy to provide details.
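
For anyone who wants a starting point, here is a minimal sketch of the
singleton half of that workaround. TempFileRegistry, register() and
deleteAll() are hypothetical names, not part of Pig, and the AOP advice is
not shown; the assumption is that it would call register() with each File
returned by File.createTempFile().

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper, not part of Pig: a process-wide registry of temp files.
final class TempFileRegistry {
    private static final TempFileRegistry INSTANCE = new TempFileRegistry();

    // Thread-safe queue, since several request threads may register files.
    private final Queue<File> files = new ConcurrentLinkedQueue<File>();

    private TempFileRegistry() {}

    static TempFileRegistry getInstance() {
        return INSTANCE;
    }

    // Assumed to be called from the AOP advice whenever a temp file is created.
    void register(File f) {
        if (f != null) {
            files.add(f);
        }
    }

    // Call after the Pig job has finished. Returns the files that still
    // exist but could not be deleted.
    List<File> deleteAll() {
        List<File> failed = new ArrayList<File>();
        File f;
        while ((f = files.poll()) != null) {
            if (f.exists() && !deleteRecursively(f)) {
                failed.add(f);
            }
        }
        return failed;
    }

    // Temp locations may be directories, so remove children first.
    private static boolean deleteRecursively(File f) {
        File[] children = f.listFiles(); // null for regular files
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return f.delete();
    }
}
```

One caveat with this sketch: a single shared registry does not distinguish
between jobs, so if several Pig jobs run at the same time, deleteAll() would
also remove files belonging to jobs that are still running. Keying the
registry by job or thread would avoid that.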
> Temp files not deleted when run in Web App
> ------------------------------------------
>
> Key: PIG-3338
> URL: https://issues.apache.org/jira/browse/PIG-3338
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.11.1
> Environment: Linux CentOS 6.3, with Hadoop 1.0.4 and Jetty 6
> Reporter: Bob Freitas
> Priority: Critical
>
> We are executing Pig, via PigRunner, in a web app, but the temporary files
> created by Pig are not deleted until the web app is shut down.
> This clutters the /tmp directory very quickly and risks filling it up
> over time.
> The work started in FileLocalizer with deleteTempFiles() is an excellent
> start, but it does not go far enough. If it were fully implemented, we
> could use that method to delete those temp files after each Pig job has
> completed.
> What needs to happen is that temp file creation must always go through
> FileLocalizer.getTemporaryPath() instead of calling
> File.createTempFile() directly, as it does now.
> The places where this needs to be implemented:
> 1) FileLocalizer #706 creation of localTempDir
> 2) JobControlCompiler.getJob() #512 creation of job jar
> 3) DefaultAbstractBag #388 creation of pigbag for spills
> If these temp files are then stored in the already existing
> ThreadLocal<Deque<ElementDescriptor>>(), we could use
> FileLocalizer.deleteTempFiles() to clean up after each Pig job without
> needing to restart the web app.
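
The proposal in the description above can be sketched roughly as follows.
This is only an illustration of the pattern, under the assumption that all
temp file creation goes through one entry point. TrackedTempFiles is a
hypothetical stand-in that does not use Pig's classes, and it tracks plain
File objects rather than ElementDescriptor.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the proposal; not Pig's FileLocalizer.
final class TrackedTempFiles {
    // One deque per thread, mirroring the per-thread list named in the issue.
    private static final ThreadLocal<Deque<File>> TO_DELETE =
        new ThreadLocal<Deque<File>>() {
            @Override
            protected Deque<File> initialValue() {
                return new ArrayDeque<File>();
            }
        };

    private TrackedTempFiles() {}

    // Single entry point for temp file creation, so every file is recorded.
    static File createTempFile(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        TO_DELETE.get().push(f);
        return f;
    }

    // Deletes everything the current thread created through this class and
    // returns the number of files actually deleted.
    static int deleteTempFiles() {
        Deque<File> files = TO_DELETE.get();
        int deleted = 0;
        while (!files.isEmpty()) {
            if (files.pop().delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Because the tracking is per thread, cleanup only sees files created on the
same thread that calls deleteTempFiles(). Files created on other threads
would need a different bookkeeping scheme.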
--
This message was sent by Atlassian JIRA
(v6.1#6144)