[
https://issues.apache.org/jira/browse/HADOOP-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602604#action_12602604
]
Karam Singh commented on HADOOP-3483:
-------------------------------------
Checked that if we provide a non-existent directory to the hod allocate command, e.g.
hod allocate -d testdir -n 5, hod creates the cluster directory.
Also checked that if the cluster directory is on a non-writable path, hod properly
throws a "Permission denied" error.
If hod allocate is given a cluster directory whose cluster status is
dead, the hod client throws this warning:
[
WARNING/30 hod:278 - Found a previously allocated cluster at cluster directory
'<cluster dir path>'. Deallocating this cluster to allocate a new one.
]
and reallocates the cluster using that directory. The results are the same for
the mapred dead clusters we normally face.
However, if the cluster status becomes "mapred dead" because the job tracker is
gone, or "hdfs dead" because the namenode is gone, while the actual torque job
is still running, hod throws the warning and reallocates the cluster with the
specified cluster directory, but it does not deallocate the previous cluster.
The old cluster therefore keeps running while the new cluster job also starts.
Also, in this case "hod list" sometimes displays a mixed status of both
clusters: the cluster id of the new cluster but the cluster state of the old cluster.
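
The behaviour expected from the cases above could be modelled roughly as follows. This is only an illustrative sketch, not HOD's actual code: the function name, the step strings and the torque_job_running flag are all hypothetical, and only the three dead states mentioned in this comment are covered.

```python
# Hypothetical model of the expected behaviour; this is NOT HOD's real code.
# The cluster states are the ones named in this comment.
DEAD_STATES = {"dead", "mapred dead", "hdfs dead"}


def expected_allocate_steps(dir_exists, cluster_state=None, torque_job_running=False):
    """Return the ordered steps 'hod allocate' is expected to perform."""
    if not dir_exists:
        # Non-existent cluster directory: create it, then allocate.
        return ["create cluster directory", "allocate new cluster"]
    if cluster_state is None:
        # Directory exists but holds no previously allocated cluster.
        return ["allocate new cluster"]
    if cluster_state in DEAD_STATES:
        steps = ["warn about previously allocated cluster"]
        if torque_job_running:
            # The step that appears to be skipped in the failing case:
            # the old torque job must be removed before reallocating.
            steps.append("delete old torque job")
        steps.append("clear old cluster state")
        steps.append("allocate new cluster")
        return steps
    # Any other state is outside the scope of this sketch.
    raise ValueError("state not covered in this sketch: %r" % (cluster_state,))


if __name__ == "__main__":
    for step in expected_allocate_steps(True, "hdfs dead", torque_job_running=True):
        print(step)
```

In the failing case reported here, the "delete old torque job" step seems to be missing, which would explain why both torque jobs stay alive and why "hod list" mixes the information of the two clusters.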
> [HOD] Improvements with cluster directory handling
> --------------------------------------------------
>
> Key: HADOOP-3483
> URL: https://issues.apache.org/jira/browse/HADOOP-3483
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/hod
> Affects Versions: 0.16.0
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
> Fix For: 0.18.0
>
> Attachments: 3483.1.patch, 3483.2.patch, 3483.patch
>
>
> Users have asked for the following improvements related to cluster
> directory handling:
> - Create a new cluster directory if one does not exist.
> - If a cluster directory points to a dead cluster, allocate currently fails
> with a message asking the user to deallocate it first. Instead, it should issue a
> warning, deallocate the cluster and automatically allocate a fresh one.