[ 
https://issues.apache.org/jira/browse/HDFS-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-6753:
--------------------------------
    Assignee: J.Andreina  (was: Srikanth Upputuri)
      Status: Patch Available  (was: Open)

> When one disk is full and all configured volumes are unhealthy, the 
> Datanode does not treat this as a volume failure and the Datanode process 
> does not shut down.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6753
>                 URL: https://issues.apache.org/jira/browse/HDFS-6753
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: J.Andreina
>            Assignee: J.Andreina
>         Attachments: HDFS-6753.1.patch
>
>
> Env Details:
> ============
> Cluster has 3 Datanodes
> Cluster is installed with the "Rex" user
> dfs.datanode.failed.volumes.tolerated = 3
> dfs.blockreport.intervalMsec          = 18000
> dfs.datanode.directoryscan.interval   = 120
> DN_XX1.XX1.XX1.XX1 data dirs          = 
> /mnt/tmp_Datanode,/home/REX/data/dfs1/data,/home/REX/data/dfs2/data,/opt/REX/dfs/data
>  
> Permission is denied on /home/REX/data/dfs1/data, /home/REX/data/dfs2/data 
> and /opt/REX/dfs/data (hence the DN considers these three volumes as failed).
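>  
> As an illustration, the same settings expressed programmatically (a minimal 
> sketch assuming a test-style setup; the DFSConfigKeys constants are the 
> standard Hadoop keys, everything else here is hypothetical):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.DFSConfigKeys;
> import org.apache.hadoop.hdfs.HdfsConfiguration;
> 
> public class VolumeFailureRepro {
>   public static void main(String[] args) {
>     Configuration conf = new HdfsConfiguration();
>     // 3 failed volumes are tolerated; the 4th failure should stop the DN.
>     conf.setInt(DFSConfigKeys.DFS_DATANODE_FAILED_VOLUMES_TOLERATED_KEY, 3);
>     conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, 18000L);
>     conf.setInt(DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_INTERVAL_KEY, 120);
>     conf.set(DFSConfigKeys.DFS_DATANODE_DATA_DIR_KEY,
>         "/mnt/tmp_Datanode,/home/REX/data/dfs1/data,"
>         + "/home/REX/data/dfs2/data,/opt/REX/dfs/data");
>   }
> }
> {code}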
>  
> Expected behavior is observed when the disk is not full:
> ========================================================
>  
> Step 1: Change the permissions of /mnt/tmp_Datanode to root
>  
> Step 2: Perform write operations (the DN detects that all configured 
> volumes have failed and shuts down)
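>  
> For reference, the check that is expected to fire in this case, as a 
> simplified sketch (the method shape and the isUsable() helper are 
> illustrative, not the actual FsDatasetImpl code):
> {code:java}
> // Illustrative only: the DN should abort once the number of failed volumes
> // exceeds dfs.datanode.failed.volumes.tolerated (3 in this setup).
> void checkVolumes(List<FsVolumeSpi> volumes, int tolerated)
>     throws DiskErrorException {
>   int failed = 0;
>   for (FsVolumeSpi v : volumes) {
>     if (!isUsable(v)) {  // e.g. permission denied on the volume root
>       failed++;
>     }
>   }
>   if (failed > tolerated) {
>     // All 4 volumes unusable, tolerated = 3 -> the DN process shuts down.
>     throw new DiskErrorException("Too many failed volumes: " + failed);
>   }
> }
> {code}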
>  
> Scenario 1:
> ===========
>  
> Step 1: Fill the /mnt/tmp_Datanode disk and change its permissions to root
> Step 2: Perform client write operations (a disk-full exception is thrown, 
> but the Datanode does not shut down, even though all configured volumes 
> have failed):
>  
> {noformat}
> 2014-07-21 14:10:52,814 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XX1.XX1.XX1.XX1:50010:DataXceiver error processing WRITE_BLOCK operation  src: /XX2.XX2.XX2.XX2:10106 dst: /XX1.XX1.XX1.XX1:50010
> org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=4096 B) is less than the block size (=134217728 B).
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy.chooseVolume(RoundRobinVolumeChoosingPolicy.java:60)
> {noformat}
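>  
> The throw above comes from the round-robin volume chooser. A condensed 
> paraphrase of RoundRobinVolumeChoosingPolicy#chooseVolume (not a verbatim 
> copy of the source) shows why a full volume is only skipped, never counted 
> as failed:
> {code:java}
> public V chooseVolume(List<V> volumes, long blockSize) throws IOException {
>   int startVolume = curVolume;
>   long maxAvailable = 0;
>   while (true) {
>     V volume = volumes.get(curVolume);
>     curVolume = (curVolume + 1) % volumes.size();
>     long available = volume.getAvailable();
>     if (available > blockSize) {
>       return volume;                 // enough space: the write goes here
>     }
>     maxAvailable = Math.max(maxAvailable, available);
>     if (curVolume == startVolume) {  // wrapped around: no volume fits
>       // A full volume is reported as "out of space" only; it is never
>       // marked failed, so the failed-volume count stays at 3 of 4.
>       throw new DiskOutOfSpaceException("Out of space: The volume with the"
>           + " most available space (=" + maxAvailable + " B) is less than"
>           + " the block size (=" + blockSize + " B).");
>     }
>   }
> }
> {code}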
>  
> Observations:
> =============
> 1. Write operations do not shut down the Datanode, even though all 
> configured volumes have failed (one of the disks is full and permission is 
> denied on all of the disks).
>  
> 2. Directory scanning fails, yet the DN still does not shut down:
>  
> {noformat}
> 2014-07-21 14:13:00,180 WARN org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: Exception occured while compiling report: 
> java.io.IOException: Invalid directory or I/O error occurred for dir: /mnt/tmp_Datanode/current/BP-1384489961-XX2.XX2.XX2.XX2-845784615183/current/finalized
> 	at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1164)
> 	at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.compileReport(DirectoryScanner.java:596)
> {noformat}
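>  
> An illustrative sketch (assumed shape, not the actual DirectoryScanner 
> source) of why observation 2 leaves the process running: the scan failure 
> is logged as a WARN and dropped, with no feedback into the DN's 
> volume-failure handling:
> {code:java}
> try {
>   // Walks <volume>/current/<bpid>/current/finalized; fails with "Invalid
>   // directory or I/O error" when the volume root is unreadable.
>   report = compileReport(volume, finalizedDir);
> } catch (IOException ioe) {
>   LOG.warn("Exception occured while compiling report: ", ioe);
>   // Nothing here feeds back into the DN's disk-error handling, so the
>   // unreadable volume never counts against failed.volumes.tolerated.
> }
> {code}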



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
