[ 
https://issues.apache.org/jira/browse/HDFS-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111917#comment-17111917
 ] 

pengWei Dou edited comment on HDFS-12487 at 5/20/20, 8:24 AM:
--------------------------------------------------------------

Hi [~liumihust], when I used [^HDFS-12487.003.patch], I found the following
code in DiskBalancer#getBlockToCopy:
{code:java}
if (block != null) {
...
} else {
}
{code}
 

So, why do the null check in your patch? Can you explain it? Thanks!



> FsDatasetSpi.isValidBlock() lacks null pointer check inside and neither do 
> the callers
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-12487
>                 URL: https://issues.apache.org/jira/browse/HDFS-12487
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover, diskbalancer
>    Affects Versions: 3.0.0
>         Environment: CentOS 6.8 x64
> CPU:4 core
> Memory:16GB
> Hadoop: Release 3.0.0-alpha4
>            Reporter: liumi
>            Assignee: liumi
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: HDFS-12487.002.patch, HDFS-12487.003.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> BlockIteratorImpl.nextBlock() looks for blocks in the source volume; when no 
> blocks remain, it returns null all the way up to 
> DiskBalancer.getBlockToCopy(), which then checks whether the result is a 
> valid block.
> When I looked into FsDatasetSpi.isValidBlock(), I found that it does not 
> check for a null pointer! We first need to check whether the block is null, 
> or an exception will occur.
> This bug is hard to find, because the DiskBalancer rarely copies all the data 
> of one volume to others. Even when it does copy all the data of one volume to 
> other volumes, the copy has already finished by the time the bug occurs.
> However, when we try to copy all the data of two or more volumes to other 
> volumes in more than one step, the thread is shut down, which is caused by 
> the bug above.
> The bug can be fixed in two ways:
> 1) Before the call to FsDatasetSpi.isValidBlock(), check for the null pointer
> 2) Check for the null pointer inside the implementation of 
> FsDatasetSpi.isValidBlock()
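The two proposed fixes can be sketched as follows. This is a minimal,
self-contained illustration, not the actual Hadoop code: the real
FsDatasetSpi.isValidBlock() and DiskBalancer.getBlockToCopy() signatures live
in the HDFS source tree, and the NullCheckSketch class and its methods here
are hypothetical stand-ins.

{code:java}
// Hedged sketch of the two fix options for HDFS-12487.
// All names here are illustrative stand-ins for the real HDFS types.
public class NullCheckSketch {

    /** Stand-in for FsDatasetSpi.isValidBlock(ExtendedBlock). */
    static boolean isValidBlock(Object block) {
        // Option 2: defensive null check inside the implementation itself,
        // so callers that pass null get 'false' instead of an exception.
        if (block == null) {
            return false;
        }
        return true; // the real code would verify the block's on-disk state
    }

    /** Stand-in for the validity test in DiskBalancer.getBlockToCopy(). */
    static boolean shouldCopy(Object block) {
        // Option 1: caller-side null check before isValidBlock(), since
        // BlockIteratorImpl.nextBlock() returns null when the source
        // volume has no more blocks.
        return block != null && isValidBlock(block);
    }

    public static void main(String[] args) {
        System.out.println(shouldCopy(null));         // false, no NPE
        System.out.println(shouldCopy(new Object())); // true
    }
}
{code}

Either option alone prevents the NullPointerException that shuts down the
DiskBalancer thread; applying both makes the check robust against other
callers of isValidBlock() as well.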



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
