[ https://issues.apache.org/jira/browse/HADOOP-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664385#action_12664385 ]
Hemanth Yamijala commented on HADOOP-5022:
------------------------------------------
I looked at the patch. Some comments:
- Agree with Vinod on the option's name. I think the default value can be
'false', and we can mark this as an incompatible change. This seems like
required behavior.
- The code that deletes the files seems to be incorrectly indented. Note
that the cmd variable is overwritten while iterating over the prefixes. This is
not how the code worked previously. Can you please check?
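
To illustrate the indentation concern, here is a minimal sketch. It is not the
code from the patch: the function names, the prefix values and the "delete"
command string are all hypothetical placeholders. It only shows how a statement
that falls outside the loop body causes every prefix except the last one to be
skipped, because cmd is overwritten on each iteration.

```python
# Hypothetical sketch; none of these names come from logcondense.py or the patch.

def build_delete_commands_misindented(prefixes, log_dir):
    """Mis-indented variant: cmd is overwritten on every iteration and
    only collected once, after the loop has finished."""
    commands = []
    for prefix in prefixes:
        cmd = "delete %s/%s" % (log_dir, prefix)  # placeholder command string
    commands.append(cmd)  # outside the loop: only the last prefix survives
    return commands


def build_delete_commands_fixed(prefixes, log_dir):
    """Intended variant: one command is collected per prefix."""
    commands = []
    for prefix in prefixes:
        cmd = "delete %s/%s" % (log_dir, prefix)  # placeholder command string
        commands.append(cmd)  # inside the loop: every prefix is handled
    return commands


if __name__ == "__main__":
    prefixes = ["jobtracker", "tasktracker"]  # assumed example prefixes
    print(build_delete_commands_misindented(prefixes, "/hod-logs"))
    print(build_delete_commands_fixed(prefixes, "/hod-logs"))
```

If the patch has the first shape, only the files for the last prefix would be
removed, which would be easy to miss without testing with more than one prefix.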
Also, please do test this carefully.
> [HOD] logcondense should delete all hod logs for a user, including jobtracker
> logs
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-5022
> URL: https://issues.apache.org/jira/browse/HADOOP-5022
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hod
> Reporter: Hemanth Yamijala
> Assignee: Peeyush Bishnoi
> Priority: Blocker
> Fix For: 0.18.3
>
> Attachments: hadoop-5022.txt
>
>
> Currently, logcondense.py does not delete jobtracker logs that it uploads to
> the DFS when the HOD cluster is deallocated. As a result, the hod-logs
> directory slowly accumulates jobtracker logs. Particularly for users who run
> a lot of jobs, this could fill up the namespace. Further, the logcondense
> program will keep scanning these directories repeatedly, stressing the
> namenode. So, logcondense.py should optionally also delete the jobtracker
> logs.