[ 
https://issues.apache.org/jira/browse/PIG-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585080#action_12585080
 ] 

Pi Song commented on PIG-166:
-----------------------------

Can we do something like this?

{code}
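private boolean alreadyClean = false ;

// Call explicitly as soon as the bag is no longer needed.
public void cleanUpTempFiles() {
   // do clean up temp files here
   alreadyClean = true ;
}

// Safety net only: runs if the caller never cleaned up explicitly.
protected void finalize() throws Throwable {
   try {
      if (!alreadyClean) {
         cleanUpTempFiles() ;
      }
   } finally {
      super.finalize() ;
   }
}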
{code}

My observation is that with finalize() alone you have to wait for the GC to
run it. If you know you no longer need the bag, why not clean it up right away
(optionally)? Relying only on finalize() has these problems:
1. GC is non-deterministic.
2. Your bag might live in the old generation and is therefore unlikely to get
cleaned up soon.
3. If the clean-up in finalize() takes too much time, the GC has to wait too
long, so under memory pressure it may not free memory efficiently.
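
To make the idea concrete, here is a minimal sketch of explicit clean-up with
finalize() kept only as a fallback. SpillingBag, spillFile() and SpillDemo are
hypothetical names used for illustration, not Pig's actual classes, and the
sketch assumes one temp file per bag. Note that finalize() is deprecated in
recent Java versions, so this only illustrates the pattern discussed here.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-in for a bag that spills to disk (not Pig's real class).
class SpillingBag {
    private final File spillFile;
    private boolean alreadyClean = false;

    SpillingBag() throws IOException {
        // Assumption: one temp file per bag, created in the default temp dir.
        spillFile = File.createTempFile("spillbag", ".tmp");
    }

    File spillFile() {
        return spillFile;
    }

    // Idempotent: safe to call explicitly and again later from finalize().
    public synchronized void cleanUpTempFiles() {
        if (alreadyClean) {
            return;
        }
        spillFile.delete();
        alreadyClean = true;
    }

    // Fallback only: the GC may run this late or never.
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() throws Throwable {
        try {
            cleanUpTempFiles();
        } finally {
            super.finalize();
        }
    }
}

class SpillDemo {
    public static void main(String[] args) throws IOException {
        SpillingBag bag = new SpillingBag();
        try {
            // ... use the bag ...
        } finally {
            // Free the disk space right away instead of waiting for the GC.
            bag.cleanUpTempFiles();
        }
        System.out.println("spill file still exists: " + bag.spillFile().exists());
    }
}
```

With this shape the caller controls when disk space is released, and the
finalizer only covers bags that nobody cleaned up.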

PS. The main issue is to prevent the disk from filling up. I think this is
still part of the solution.

> Disk Full
> ---------
>
>                 Key: PIG-166
>                 URL: https://issues.apache.org/jira/browse/PIG-166
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Amir Youssefi
>
> Occasionally spilling fills up (all) hard drive(s) on a Data Node and crashes 
> Task Tracker (and other processes) on that node. We need to have a safety net 
> and fail the task before crashing happens (and more). 
> In a Pig + Hadoop setting, Task Trackers get blacklisted, and the Pig console
> gets stuck at a percentage without returning nodes to the cluster. I talked to
> the Hadoop team to explore the Max Percentage idea. Nodes running into this
> problem get into permanent problems and manual cleaning by an administrator
> is necessary.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
