[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054115#comment-14054115
 ] 

Colin Patrick McCabe commented on HDFS-6482:
--------------------------------------------

bq. My read is, this will not work if during upgrade the previous layout is 
changed to the new layout. During rolling upgrades hardlinks to all the blocks 
are not created, only for the ones deleted post rolling upgrade. This is done 
to keep the datanode upgrade time short to support quick restart. If rolling 
upgrades cannot be supported, this code can only go into a major release.

My understanding is that this was an optimization for the cases where the 
datanode layout hadn't changed significantly (which was most upgrades). It 
should not be interpreted as a hard limitation that prevents us from making 
*any* changes to the datanode layout in the future.

James, it would be good to see some upgrade times for a DN with a few hundred 
thousand blocks.  It seems like this should be manageable, especially if we 
parallelize it a bit.
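
To make the parallelization idea concrete, here is a rough sketch, not taken from any of the attached patches. It hard-links every regular file in one directory into another using a fixed thread pool. The class name `ParallelHardLinker`, the method `linkAll`, and the flat source/destination layout are all assumptions for illustration; a real upgrade would also have to compute each block's target directory and handle partial failures.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical helper, not part of HDFS: links files in parallel.
public class ParallelHardLinker {

    /**
     * Hard-links every regular file directly under srcDir into dstDir,
     * spreading the link calls over a fixed-size thread pool.
     * Returns the number of files linked.
     */
    static int linkAll(Path srcDir, Path dstDir, int threads)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(dstDir);

        // Collect the files first so the directory stream is closed early.
        List<Path> files = new ArrayList<>();
        try (Stream<Path> entries = Files.list(srcDir)) {
            entries.filter(Files::isRegularFile).forEach(files::add);
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path file : files) {
                // Returning null makes this lambda a Callable, so the
                // IOException from createLink can propagate via Future.get().
                pending.add(pool.submit(() -> {
                    Files.createLink(dstDir.resolve(file.getFileName()), file);
                    return null;
                }));
            }
            // Wait for every link; the first failure surfaces here.
            for (Future<?> task : pending) {
                task.get();
            }
        } finally {
            pool.shutdown();
        }
        return files.size();
    }
}
```

Timing a call such as `linkAll(oldDir, newDir, 8)` against a directory holding a few hundred thousand block files, and comparing it with a single-threaded run, would give the kind of numbers requested above.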

> Use block ID-based block layout on datanodes
> --------------------------------------------
>
>                 Key: HDFS-6482
>                 URL: https://issues.apache.org/jira/browse/HDFS-6482
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.3.patch, 
> HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, HDFS-6482.7.patch, 
> HDFS-6482.patch
>
>
> Right now blocks are placed into directories that are split into many 
> subdirectories when capacity is reached. Instead we can use a block's ID to 
> determine the path it should go in. This eliminates the need for the LDir 
> data structure that facilitates the splitting of directories when they reach 
> capacity as well as fields in ReplicaInfo that keep track of a replica's 
> location.
> An extension of the work in HDFS-3290.
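
For readers unfamiliar with the idea, the following is a minimal sketch of how a block ID could determine a replica's directory. It assumes two levels of up to 256 subdirectories each, selected by two 8-bit slices of the ID, and a `subdir` name prefix; the class name, the bit positions, and the fan-out are illustrative assumptions and may not match what the attached patches do.

```java
import java.io.File;

// Hypothetical illustration of an ID-based block layout.
public class BlockIdLayout {

    // Assumed prefix for subdirectory names.
    static final String SUBDIR_PREFIX = "subdir";

    /**
     * Derives the directory for a block purely from its ID, so no
     * per-replica location needs to be stored or looked up.
     */
    static File idToBlockDir(File root, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xff); // first-level directory index
        int d2 = (int) ((blockId >> 8) & 0xff);  // second-level directory index
        String path = SUBDIR_PREFIX + d1 + File.separator + SUBDIR_PREFIX + d2;
        return new File(root, path);
    }

    public static void main(String[] args) {
        File root = new File("finalized");
        // 0x12345 -> first level 0x01 = 1, second level 0x23 = 35
        System.out.println(idToBlockDir(root, 0x12345L));
    }
}
```

Because the path is a pure function of the ID, any component that knows the ID can locate the block without consulting a directory-splitting structure such as LDir.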



--
This message was sent by Atlassian JIRA
(v6.2#6252)
