[ https://issues.apache.org/jira/browse/HADOOP-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624543#action_12624543 ]
Lohit Vijayarenu commented on HADOOP-2676:
------------------------------------------
bq. +1 to Runping's comments.
Should we also think about supporting this for DataNodes? We have been thinking
about blacklisting faulty datanodes. The Namenode could consider a blacklisted
datanode equivalent to a node whose decommission is in progress. There is also
the question of un-blacklisting these nodes: does rebooting them make them
clean again and remove them from the blacklist?
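
A minimal, hypothetical sketch of the analogy above. NodeAdmissionPolicy,
AdminState, and both method names are illustrative assumptions, not existing
Namenode code; it only shows the intended behavior: a blacklisted datanode,
like one whose decommission is in progress, receives no new block replicas
but keeps serving the blocks it already holds.

{code:java}
// Hypothetical sketch only -- NodeAdmissionPolicy and AdminState are
// illustrative names, not existing Namenode code. A BLACKLISTED datanode
// is handled like one whose decommission is in progress: no new replicas
// are placed on it, but the blocks it already holds stay readable until
// they are re-replicated elsewhere.
public class NodeAdmissionPolicy {

  /** Hypothetical administrative states for a datanode. */
  enum AdminState { NORMAL, DECOMMISSION_IN_PROGRESS, BLACKLISTED }

  /** New block replicas may only be placed on NORMAL nodes. */
  static boolean canPlaceNewReplica(AdminState state) {
    return state == AdminState.NORMAL;
  }

  /** Existing blocks remain readable in all three states. */
  static boolean canServeExistingBlocks(AdminState state) {
    return true;
  }

  public static void main(String[] args) {
    System.out.println(canPlaceNewReplica(AdminState.BLACKLISTED));     // false
    System.out.println(canServeExistingBlocks(AdminState.BLACKLISTED)); // true
  }
}
{code}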
> Maintaining cluster information across multiple job submissions
> ---------------------------------------------------------------
>
> Key: HADOOP-2676
> URL: https://issues.apache.org/jira/browse/HADOOP-2676
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.15.2
> Reporter: Lohit Vijayarenu
>
> Could we have a way to maintain cluster state across multiple job submissions?
> Consider a scenario where we run multiple jobs in iteration on a cluster back
> to back. The nature of the job is the same, but the input/output might differ.
> Now, if a node is blacklisted in one iteration of the job run, it would be
> useful to maintain this information and blacklist that node for the next
> iteration of the job as well.
> Another situation we saw: if a node has fewer than mapred.map.max.attempts
> failures in each iteration, it is never marked for blacklisting. But if we
> consider two or three iterations together, such nodes fail tasks in every job
> and should be taken out of the cluster; otherwise they hamper the overall
> performance of the job.
> Could we have config variables, something which matches a job type (provided
> by the user) and maintains the cluster status for that job type alone?