[
https://issues.apache.org/jira/browse/HDFS-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Trezzo updated HDFS-8789:
-------------------------------
Attachment: HDFS-8789-trunk-STRAWMAN-v1.patch
Attached is a v1 strawman patch based on trunk. Note: This patch is not ready
to be committed and is just being posted as an illustration of one approach to
this problem. That being said, it does work at scale and we have used a similar
tool internally to migrate very large (multiple thousands of nodes) clusters.
Some notes about the patch:
This migrator tool is client-side. It leverages existing block placement policy
logic with the addition of a method to BlockPlacementPolicy. To use this tool
with a specific placement policy, that policy must implement this new method.
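As a rough illustration of that contract, here is a minimal sketch in Python
rather than the patch's Java. The class names and the hook name
is_placement_valid are hypothetical; this comment does not name the method the
patch adds to BlockPlacementPolicy. The idea is only that a policy opts in to
migration by overriding one extra hook:

```python
class PlacementPolicy:
    """Stand-in for a block placement policy (hypothetical, not the HDFS class)."""

    def choose_targets(self, num_replicas, racks):
        # Existing placement logic: pick racks for new replicas round-robin.
        return [racks[i % len(racks)] for i in range(num_replicas)]

    def is_placement_valid(self, replica_racks):
        # The extra hook a migrator would call (hypothetical name).
        raise NotImplementedError(
            f"{type(self).__name__} does not support migration")


class TwoRackPolicy(PlacementPolicy):
    """Toy policy: a block with 2+ replicas must span at least two racks."""

    def is_placement_valid(self, replica_racks):
        return len(replica_racks) < 2 or len(set(replica_racks)) >= 2


def supports_migration(policy):
    # A policy supports migration only if it overrides the hook.
    hook = type(policy).is_placement_valid
    return hook is not PlacementPolicy.is_placement_valid
```

A tool built this way could call supports_migration up front and refuse to run
against a policy that has not implemented the hook.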
The migrator borrows a large amount of code/inspiration from the existing
balancer. It works in three major phases. In phase 1, it builds the topology of
the cluster. In phase 2, it identifies blocks whose replica sets violate the
specified block placement policy and determines the set of moves required to
make each block compliant with the policy. In phase 3, it executes the moves on
the cluster. The migrator leverages the existing copyBlock and replaceBlock
methods in DataTransferProtocol to move blocks, along with the existing data
node implementations of these methods that are used by the balancer.
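The three phases can be sketched roughly as follows. This is a simplified
Python illustration, not the patch's Java code. All function names are
hypothetical, the policy ("replicas must span at least min_racks racks") is a
toy assumption, and phase 3 only updates an in-memory map instead of calling
copyBlock/replaceBlock on datanodes:

```python
from collections import Counter, namedtuple

Move = namedtuple("Move", ["block_id", "source", "target"])


def build_topology(node_to_rack):
    """Phase 1: group datanodes by rack."""
    topology = {}
    for node, rack in sorted(node_to_rack.items()):
        topology.setdefault(rack, []).append(node)
    return topology


def plan_moves(blocks, node_to_rack, topology, min_racks=2):
    """Phase 2: find violating blocks and the moves that would fix them.

    Toy policy (an assumption): each block's replicas must span at least
    min_racks racks, as far as replica count and rack count allow.
    """
    moves = []
    for block_id, replicas in sorted(blocks.items()):
        replicas = list(replicas)  # plan on a copy; do not mutate the input
        while True:
            per_rack = Counter(node_to_rack[n] for n in replicas)
            needed = min(min_racks, len(replicas), len(topology))
            if len(per_rack) >= needed:
                break
            # Move one replica off the most crowded rack ...
            crowded = max(sorted(per_rack), key=lambda r: per_rack[r])
            source = next(n for n in replicas if node_to_rack[n] == crowded)
            # ... onto the first node of a rack that holds no replica yet.
            free_rack = next(r for r in sorted(topology) if r not in per_rack)
            target = topology[free_rack][0]
            moves.append(Move(block_id, source, target))
            replicas[replicas.index(source)] = target
    return moves


def execute_moves(moves, blocks):
    """Phase 3: apply the planned moves to the in-memory replica map."""
    for move in moves:
        replicas = blocks[move.block_id]
        replicas[replicas.index(move.source)] = move.target
    return len(moves)
```

For example, with nodes n1, n2 and n3 on rackA and n4 on rackB, a block whose
replicas sit on n1, n2 and n3 yields a single planned move from n1 to n4, and a
second planning pass after execution yields no moves.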
A number of things are not yet implemented in the current patch; please see the
various TODOs in the code for details. One major assumption is that the cluster
only has one storage type (DISK). The tool is also not properly integrated with
the various hadoop shell commands.
The migrator tool also prints out some useful statistics during the run. These
include things like block size distribution, block rack diversity distribution
and block replication distribution. Here is a sample output from a run on one
of our test clusters with three block pools:
{noformat}
16/01/27 23:56:42 INFO migrator.Migrator: STATS for blockpoolID: BP-XXXXXXXX
====== STATS: PHASE 2 ======
Total Blocks: 3491697
Total SkippedBlocks: 0
Total ViolatingBlocks: 323
Total Ignored Replicas: 0
Moves To Make: 323
Total data movement (bytes): 35368192040
Inter-rack data movement (bytes): 0
Intra-rack data movement (bytes): 35368192040
Number of 3 (or more) rack blocks: 2448788
Number of 2 rack blocks: 1042909
Number of 1 rack blocks: 0
Replica count distribution (Number of replicas: Block count):
3: 2435309
10: 1056388
Block size distribution ([Start byte size - End byte size]: Block count):
[0 - 59652322]: 2506682
[59652323 - 119304645]: 234396
[119304646 - 178956968]: 127102
[178956969 - 238609291]: 379262
[238609292 - 298261614]: 59686
[298261615 - 357913937]: 39766
[357913938 - 417566260]: 21035
[417566261 - 477218583]: 40429
[477218584 - 536870906]: 8496
[536870907 <]: 74843
====== STATS: PHASE 3 ======
Successful Moves: 323
Timed out Moves: 0
Failed Moves: 0
16/01/27 23:56:42 INFO migrator.Migrator: STATS for blockpoolID: BP-YYYYYYYYYY
====== STATS: PHASE 2 ======
Total Blocks: 645074
Total SkippedBlocks: 0
Total ViolatingBlocks: 74
Total Ignored Replicas: 0
Moves To Make: 74
Total data movement (bytes): 24992011754
Inter-rack data movement (bytes): 0
Intra-rack data movement (bytes): 24992011754
Number of 3 (or more) rack blocks: 133328
Number of 2 rack blocks: 511746
Number of 1 rack blocks: 0
Replica count distribution (Number of replicas: Block count):
3: 644839
10: 235
Block size distribution ([Start byte size - End byte size]: Block count):
[0 - 59652322]: 236588
[59652323 - 119304645]: 16163
[119304646 - 178956968]: 12312
[178956969 - 238609291]: 13967
[238609292 - 298261614]: 15124
[298261615 - 357913937]: 13327
[357913938 - 417566260]: 20061
[417566261 - 477218583]: 15275
[477218584 - 536870906]: 18028
[536870907 <]: 284229
====== STATS: PHASE 3 ======
Successful Moves: 74
Timed out Moves: 0
Failed Moves: 0
16/01/27 23:56:42 INFO migrator.Migrator: STATS for blockpoolID: BP-ZZZZZZZZZZZZZZZ
====== STATS: PHASE 2 ======
Total Blocks: 1307498
Total SkippedBlocks: 0
Total ViolatingBlocks: 308
Total Ignored Replicas: 0
Moves To Make: 308
Total data movement (bytes): 44949218274
Inter-rack data movement (bytes): 0
Intra-rack data movement (bytes): 44949218274
Number of 3 (or more) rack blocks: 1130907
Number of 2 rack blocks: 176533
Number of 1 rack blocks: 58
Replica count distribution (Number of replicas: Block count):
1: 58
3: 1307438
4: 1
10: 1
Block size distribution ([Start byte size - End byte size]: Block count):
[0 - 59652322]: 786578
[59652323 - 119304645]: 38393
[119304646 - 178956968]: 28264
[178956969 - 238609291]: 31321
[238609292 - 298261614]: 34763
[298261615 - 357913937]: 26562
[357913938 - 417566260]: 25916
[417566261 - 477218583]: 16208
[477218584 - 536870906]: 20765
[536870907 <]: 298728
====== STATS: PHASE 3 ======
Successful Moves: 308
Timed out Moves: 0
Failed Moves: 0
Migration took 10.459516666666667 minutes
{noformat}
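The bucket boundaries in the block size distribution are consistent with nine
equal-width buckets covering 512 MiB (width 536870912 // 9 = 59652323 bytes)
plus one overflow bucket. That is inferred from the sample output above, not
confirmed from the patch. A minimal Python sketch under that assumption, with
hypothetical function names, that reproduces the labels and the rack diversity
counts:

```python
from collections import Counter

MAX_SIZE = 512 * 1024 * 1024   # assumed upper bound of the regular buckets
NUM_BUCKETS = 9                # assumed number of equal-width buckets
WIDTH = MAX_SIZE // NUM_BUCKETS


def bucket_label(size):
    """Return the histogram label for a block of `size` bytes."""
    index = size // WIDTH
    if index >= NUM_BUCKETS:
        return f"[{NUM_BUCKETS * WIDTH} <]"
    return f"[{index * WIDTH} - {(index + 1) * WIDTH - 1}]"


def size_distribution(block_sizes):
    """Count blocks per size bucket."""
    return Counter(bucket_label(size) for size in block_sizes)


def rack_diversity(block_racks):
    """Count blocks by number of distinct racks, capped at 3 as in the report."""
    return Counter(min(len(set(racks)), 3) for racks in block_racks)
```

In each block pool above, the rack diversity counts, the replica count
distribution and the block size distribution each sum to Total Blocks (for
example 2448788 + 1042909 = 3491697 in the first pool), which is a quick sanity
check on any reimplementation of these statistics.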
Ideally, I think migration between block placement policies should be handled
by the namenode. I have also discussed this with [~andrew.wang] and [~mingma].
Hopefully we can flesh out this idea a little more on this jira.
Please let me know if you have any questions/comments! Thanks.
> Block Placement policy migrator
> -------------------------------
>
> Key: HDFS-8789
> URL: https://issues.apache.org/jira/browse/HDFS-8789
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Chris Trezzo
> Assignee: Chris Trezzo
> Attachments: HDFS-8789-trunk-STRAWMAN-v1.patch
>
>
> As we start to add new block placement policies to HDFS, it will be necessary
> to have a robust tool that can migrate HDFS blocks between placement
> policies. This jira is for the design and implementation of that tool.