[ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904047#comment-16904047
 ] 

Stephen O'Donnell commented on HDFS-14311:
------------------------------------------

Thanks for the patch [~caiyicong], this is a good discovery. I suspect the 
reason this has not come up before is that it likely only happens when the 
Datanode volumes have a very small number of blocks.

The current code path iterates over each storage directory; if a directory 
needs to be upgraded, it returns a callable which is submitted to an executor, 
and then the next directory is checked.

Inside the callable, it first upgrades the storage and then updates the 
BlockPoolSliceStorage instance variables. If the storage upgrade happens very 
quickly, the first callable will change the instance variables in 
BlockPoolSliceStorage, and the later storage directories will get the error you 
mentioned. If upgrading a storage takes longer than creating all the callables, 
which is likely if there are many blocks on the storage, then this issue does 
not manifest.

If I understand correctly, your patch works around the problem by creating and 
collecting all the 'upgrade callables' and then submitting them to the executor 
only after all of them have been created. That way, it does not matter when the 
BlockPoolSliceStorage variables are updated.

With the current structure of the code, and how the layout version and ctime 
are used within BlockPoolSliceStorage, I think your patch is the best way of 
fixing this. Anything else would require a lot more refactoring.
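To make the race and the fix concrete, here is a minimal sketch of both patterns. The class and method names (`SharedStorage`, `maybeUpgrade`, `collectUpgrades`) are simplified stand-ins of my own, not the actual HDFS code or the patch:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Simplified stand-in for BlockPoolSliceStorage: one mutable layoutVersion
// shared by every storage directory's upgrade task.
class SharedStorage {
    static final int OLD_LV = -57;  // layout version found on disk
    static final int NEW_LV = -63;  // layout version of the running DataNode
    volatile int layoutVersion = OLD_LV;

    // Mirrors the current code path: check the shared state, then return a
    // callable that performs the upgrade and mutates that same shared state.
    Callable<Boolean> maybeUpgrade(String dir) throws IOException {
        if (layoutVersion == NEW_LV) {
            // A previous directory's callable already bumped layoutVersion,
            // so this directory looks "newer than the namespace" and fails.
            throw new IOException("Datanode state is newer than namespace state: " + dir);
        }
        return () -> {
            // ... real per-directory upgrade work would happen here ...
            layoutVersion = NEW_LV;  // the shared mutation behind the race
            return true;
        };
    }

    // The patch's approach, in miniature: create every callable while the
    // shared state is still untouched, and only submit them afterwards.
    List<Callable<Boolean>> collectUpgrades(List<String> dirs) throws IOException {
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (String dir : dirs) {
            tasks.add(maybeUpgrade(dir));  // all checks run before any task does
        }
        return tasks;  // caller hands the whole list to an executor at once
    }
}
```

With `maybeUpgrade` alone, a fast first upgrade can complete before the second directory is even checked, and the check then throws; collecting all callables first makes that ordering impossible.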

I have just a few comments:
 # I don't believe any of the test failures are related to this change.
 # Could you address the checkstyle issues highlighted in the last run please?
 # I wonder if we could think of a way to add a test for this, to at least 
reproduce the issue. It could be tricky due to the timing, but if we create a 
single DN with quite a few storage directories at an older layout version and 
then upgrade them, it may be possible.

 

> multi-threading conflict at layoutVersion when loading block pool storage
> -------------------------------------------------------------------------
>
>                 Key: HDFS-14311
>                 URL: https://issues.apache.org/jira/browse/HDFS-14311
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rolling upgrades
>    Affects Versions: 2.9.2
>            Reporter: Yicong Cai
>            Assignee: Yicong Cai
>            Priority: Major
>         Attachments: HDFS-14311.1.patch
>
>
> When DataNode upgrade from 2.7.3 to 2.9.2, there is a conflict at 
> StorageInfo.layoutVersion in loading block pool storage process.
> It will cause this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> The BlockPoolSliceStorage instance is shared across the recovery transitions 
> of all storage locations. In BlockPoolSliceStorage.doTransition, the old 
> layoutVersion is read from local storage and compared with the current 
> DataNode version before the upgrade is performed. doUpgrade runs the 
> transition work in a sub-thread, and that work sets the 
> BlockPoolSliceStorage's layoutVersion to the current DN version. The 
> transition check for the next storage dir therefore runs concurrently with 
> the real transition work of the previous storage dir, leaving the shared 
> layoutVersion in an inconsistent state.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
