[jira] [Commented] (HDFS-1960) dfs.*.dir should not default to /tmp (or other typically volatile storage)

Harsh J (Commented) (JIRA) Sat, 07 Jan 2012 09:44:03 -0800

    [ https://issues.apache.org/jira/browse/HDFS-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182041#comment-13182041 ]


Harsh J commented on HDFS-1960:
-------------------------------

Another default that might work is the relative path "{{.}}".

I think this issue could largely be solved by packaging and better 
administrator documentation.
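
To make the relative-path idea concrete, here is a minimal Java sketch of what "." means to a JVM process: it resolves against the directory the process was started from, so the storage location would depend on how the daemon is launched. The class and method names are hypothetical, and this says nothing about how Hadoop itself resolves the setting.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativeDefault {
    /**
     * Resolve a configured directory the way the JVM resolves any path:
     * a relative value is taken against the process working directory.
     * (Hypothetical helper, not a Hadoop API.)
     */
    static Path resolve(String configured) {
        return Paths.get(configured).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // The printed location changes with the directory the JVM was started in.
        System.out.println("\".\" resolves to " + resolve("."));
    }
}
```

A default like this avoids /tmp, but it only stays stable if packaging always starts the daemon from the same directory.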
                
> dfs.*.dir should not default to /tmp (or other typically volatile storage)
> --------------------------------------------------------------------------
>
>                 Key: HDFS-1960
>                 URL: https://issues.apache.org/jira/browse/HDFS-1960
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.20.2
>         Environment: *nix systems
>            Reporter: philo vivero
>            Priority: Critical
>
> The hdfs-site.xml file might not define one or both of:
> dfs.name.dir
> dfs.data.dir
> If they are not specified, data is stored in /tmp. This is extremely 
> dangerous. Rationale: the cluster will work fine for days, possibly even 
> weeks, before blocks start to go missing. Rebooting a datanode on common 
> Linux systems will clear all the data from that node. There is no documented 
> way (that I'm aware of) to recover from this situation. The cluster must be 
> completely obliterated and rebuilt from scratch.
> Better reactions to the missing configuration parameters:
> 1. DataNode dies on startup and asks that these parameters be defined.
> 2. Default is /var/db/hadoop (or some other non-volatile storage location). 
> Naturally, if that directory is not writable, the DataNode dies on startup 
> and logs an error.
> The first solution would most likely be preferred by typical enterprise 
> sysadmins. The second solution is suboptimal (since /var/db/hadoop might not 
> be the optimal location for the data) but is still preferable to the current 
> implementation, since it will less often lead to an irretrievably corrupt 
> cluster.
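
The two reactions proposed in the description could be sketched roughly as follows. This is a standalone illustration, not DataNode code: it uses java.util.Properties as a stand-in for the Hadoop configuration, and the class and method names are hypothetical. Only the two key names and the /var/db/hadoop path come from the issue text.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class StorageDirCheck {
    // The two keys named in the issue; everything else here is hypothetical.
    static final String[] REQUIRED_KEYS = {"dfs.name.dir", "dfs.data.dir"};

    /** Option 1: list the keys that are missing or blank so the caller can refuse to start. */
    static List<String> missingKeys(Properties conf) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = conf.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    /** Option 2: a fallback directory is usable only if it exists (or can be created) and is writable. */
    static boolean isUsable(File dir) {
        return (dir.isDirectory() || dir.mkdirs()) && dir.canWrite();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("dfs.name.dir", "/var/db/hadoop"); // dfs.data.dir deliberately left unset
        List<String> missing = missingKeys(conf);
        if (!missing.isEmpty()) {
            // A real daemon would log this and exit instead of falling back to /tmp.
            System.err.println("Refusing to start; please define: " + missing);
        }
    }
}
```

Under option 2, a daemon would call something like isUsable on the fallback directory before using it and stop with a logged error if the check fails.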

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
