[ 
https://issues.apache.org/jira/browse/HDFS-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290516#comment-13290516
 ] 

Todd Lipcon commented on HDFS-2988:
-----------------------------------

Hi Miomir, thanks for the patch. A few comments:
- please make sure your editor is configured for spaces, not tabs, and 2-space 
indentation. Some of the whitespace is off in the patch you attached.
- it doesn't look like you're actually writing the locking node's identity into 
the lock file itself. The "failed to lock" error message is reporting its own 
PID/host, not the locker's PID/host
- for the fallback on host name, why not fall back to using the local hostname 
provided by InetAddress? Or, to make the whole thing a much simpler patch, I 
think it would be acceptable to just use the RuntimeMXBean.getName() in the 
message instead of attempting to parse out the pid and host. I think an error 
message saying "already locked by process <pid>@<hostname>" would be clear 
enough for most people to track down the issue, and then we wouldn't need to 
add so much code for just this simple improvement.
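
A rough sketch of the simpler approach, for illustration only. It is not taken from the patch: the class name, method names, and exact message wording are made up. Note that the format of RuntimeMXBean.getName() is implementation-specific. On common JVMs it looks like "pid@hostname", but the Javadoc does not guarantee that, which is one more reason not to parse it.

```java
import java.lang.management.ManagementFactory;

class LockOwnerName {
  // Returns the running JVM's name. Typically "pid@hostname" on common JVMs,
  // but the format is implementation-specific, so it is used as an opaque string.
  static String jvmName() {
    return ManagementFactory.getRuntimeMXBean().getName();
  }

  // Hypothetical helper: builds the error message from the name that was
  // read back out of the lock file (i.e. the locker's name, not our own).
  static String alreadyLockedMessage(String storageDir, String lockOwner) {
    return "Cannot lock storage " + storageDir
        + ": already locked by process " + lockOwner;
  }

  public static void main(String[] args) {
    // Here we just print our own name to show what would be written to the
    // lock file; the directory path is a made-up example.
    System.out.println(alreadyLockedMessage("/data/dfs/name", jvmName()));
  }
}
```

The important part is that the string passed as lockOwner comes from the lock file contents, so the message identifies the process holding the lock rather than the one that failed to get it.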
                
> Improve error message when storage directory lock fails
> -------------------------------------------------------
>
>                 Key: HDFS-2988
>                 URL: https://issues.apache.org/jira/browse/HDFS-2988
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Todd Lipcon
>            Priority: Minor
>              Labels: newbie
>         Attachments: HDFS-2988.patch
>
>
> Currently, the error message is fairly opaque to a non-developer ("Cannot 
> lock storage" or something). Instead, we should make some improvements:
> - when we create the in_use.lock file, we should write the hostname/PID that 
> locked the file
> - if the lock fails, and in_use.lock exists, the error message should say 
> something like "It appears that another namenode (pid 23423 on host 
> foo.example.com) has already locked the storage directory."
> - if the lock fails, and no lock file exists, the error message should say 
> something like "if this storage directory is mounted via NFS, ensure that the 
> appropriate nfs lock services are running."
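
The three points in the description could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual Storage code: the class and method names are hypothetical, the owner is recorded as the raw RuntimeMXBean name, and "no recorded owner" (missing or empty lock file) is treated as the NFS case, since opening the file for locking creates it. Reading a file that another process holds locked may fail on platforms with mandatory locking.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.management.ManagementFactory;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class StorageLockSketch {
  // Tries to lock the file; on success, records this JVM's name in it.
  // Returns null if the lock is held elsewhere.
  static FileLock lockAndRecordOwner(Path lockFile) throws IOException {
    RandomAccessFile file = new RandomAccessFile(lockFile.toFile(), "rws");
    FileLock lock;
    try {
      lock = file.getChannel().tryLock(); // null if another process holds it
    } catch (OverlappingFileLockException e) {
      lock = null; // already locked from within this same JVM
    }
    if (lock == null) {
      file.close();
      return null;
    }
    // Overwrite any stale owner left by a previous process.
    file.setLength(0);
    file.write(ManagementFactory.getRuntimeMXBean().getName()
        .getBytes(StandardCharsets.UTF_8));
    return lock;
  }

  // Returns the recorded owner, or null if the file is missing or empty.
  static String readOwner(Path lockFile) throws IOException {
    if (!Files.exists(lockFile)) {
      return null;
    }
    String owner =
        new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8).trim();
    return owner.isEmpty() ? null : owner;
  }

  // Picks the error message based on whether an owner was recorded.
  static String lockFailureMessage(String dir, String ownerOrNull) {
    if (ownerOrNull != null && !ownerOrNull.isEmpty()) {
      return "Cannot lock storage " + dir + ". It appears that another namenode ("
          + ownerOrNull + ") has already locked the storage directory.";
    }
    return "Cannot lock storage " + dir + ". If this storage directory is mounted"
        + " via NFS, ensure that the appropriate nfs lock services are running.";
  }

  public static void main(String[] args) throws IOException {
    // Uses a throwaway temp directory in place of a real storage directory.
    Path dir = Files.createTempDirectory("storage");
    Path lockFile = dir.resolve("in_use.lock");
    FileLock lock = lockAndRecordOwner(lockFile);
    if (lock == null) {
      System.err.println(lockFailureMessage(dir.toString(), readOwner(lockFile)));
    } else {
      System.out.println("Locked " + lockFile + " as " + readOwner(lockFile));
      lock.release();
      lock.channel().close();
    }
  }
}
```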
