[
https://issues.apache.org/jira/browse/PIG-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585080#action_12585080
]
Pi Song commented on PIG-166:
-----------------------------
Can we do something like this?
{code}
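private boolean alreadyClean = false;

public void cleanUpTempFiles() {
    // do clean up temp files here
    alreadyClean = true;
}

@Override
protected void finalize() throws Throwable {
    try {
        // fallback only: runs if nobody called cleanUpTempFiles() explicitly
        if (!alreadyClean) {
            cleanUpTempFiles();
        }
    } finally {
        super.finalize();
    }
}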
{code}
My observation is that with finalize() you have to wait for the GC to run it.
If you know you no longer need the bag, why not clean up right away (optional)?
1. GC is non-deterministic.
2. Your bag might live in the old generation, so it is unlikely to get cleaned
up any time soon.
3. If the clean-up in finalize() takes too much time, the GC has to wait on it,
so under memory pressure it may not free memory efficiently.
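To make the "clean up right away" path concrete, here is a minimal sketch of how a caller might use it. SpillingBag, its temp-file prefix, and SpillingBagDemo are hypothetical names for illustration, not Pig's actual classes; the finalize() fallback from the snippet above is left out to keep it short.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a bag that spills to a temp file; not Pig's real class.
class SpillingBag {
    private final File spillFile;
    private boolean alreadyClean = false;

    SpillingBag() {
        try {
            // Assumed prefix/suffix; the real spill location may differ.
            spillFile = File.createTempFile("pigbag", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    File spillFile() {
        return spillFile;
    }

    // Idempotent, so an explicit call and a later fallback call cannot conflict.
    public synchronized void cleanUpTempFiles() {
        if (alreadyClean) {
            return;
        }
        spillFile.delete();
        alreadyClean = true;
    }
}

class SpillingBagDemo {
    // Returns whether the spill file still exists after the bag has been used.
    static boolean spillFileSurvives() {
        SpillingBag bag = new SpillingBag();
        try {
            // ... work with the bag here ...
        } finally {
            // Deterministic clean-up; no waiting for the GC.
            bag.cleanUpTempFiles();
        }
        return bag.spillFile().exists();
    }
}
```

The try/finally means the temp file is removed even if the work on the bag throws, and the alreadyClean flag keeps a second call harmless.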
PS. The main issue is to prevent the disk from filling up. I think this
clean-up is still part of the solution.
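For the disk-full side, a safety net could be sketched roughly as below. This is only an illustration of a "max percentage" style check under my own assumptions: SpillGuard, its threshold, and where it would be called from are all hypothetical, not existing Pig or Hadoop APIs. Only File.getTotalSpace() and File.getUsableSpace() are real JDK calls (Java 6+).

```java
import java.io.File;

// Hypothetical guard; the name and policy are illustrative only.
class SpillGuard {
    private final double maxUsedFraction;

    SpillGuard(double maxUsedFraction) {
        this.maxUsedFraction = maxUsedFraction;
    }

    // Pure check: would writing bytesToWrite push usage above the limit?
    boolean wouldExceed(long totalBytes, long usableBytes, long bytesToWrite) {
        if (totalBytes <= 0) {
            return true; // partition size unknown: refuse to spill
        }
        long usedAfter = totalBytes - usableBytes + bytesToWrite;
        return (double) usedAfter / totalBytes > maxUsedFraction;
    }

    // Throws before writing, so the task fails instead of filling the disk.
    void checkBeforeSpill(File spillDir, long bytesToWrite) {
        if (wouldExceed(spillDir.getTotalSpace(), spillDir.getUsableSpace(), bytesToWrite)) {
            throw new IllegalStateException(
                "Refusing to spill " + bytesToWrite + " bytes to " + spillDir);
        }
    }
}
```

Something like new SpillGuard(0.9).checkBeforeSpill(dir, size) before each spill would fail the task early. The check is racy when several tasks share a disk, so it reduces the risk rather than eliminating it.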
> Disk Full
> ---------
>
> Key: PIG-166
> URL: https://issues.apache.org/jira/browse/PIG-166
> Project: Pig
> Issue Type: Bug
> Reporter: Amir Youssefi
>
> Occasionally spilling fills up (all) hard drive(s) on a Data Node and crashes
> the Task Tracker (and other processes) on that node. We need to have a safety
> net and fail the task before the crash happens (and more).
> In a Pig + Hadoop setting, Task Trackers get blacklisted, and the Pig console
> gets stuck at a percentage without returning nodes to the cluster. I talked to
> the Hadoop team to explore the Max Percentage idea. Nodes that run into this
> problem stay broken, and manual cleaning by an administrator is necessary.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.