[
https://issues.apache.org/jira/browse/HADOOP-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672251#comment-16672251
]
Hrishikesh Gadre commented on HADOOP-11391:
-------------------------------------------
[~xiaochen] Thanks a lot for the code review.
{quote} - BlockManager: we need to break out the writelock into finer-grained
configurable batches since the input {{blocks}} could be huge. Probably ok to
reuse {{numBlocksPerIteration}}.{quote}
Good catch! Fixed.
{quote} - BlockManager: the log about {{processMisReplicatedBlocks}} should be
debug or trace{quote}
Done
{quote} - Would be good to have the test to verify changing from 'never been
multi-rack' to multi-rack, and then -replicate.{quote}
Actually for this specific scenario, we don't need to use this new option (as
NN takes care of automatically fixing the mis-replication). Here is an existing
test which verifies this scenario (also review HDFS-3256).
[https://github.com/apache/hadoop/blob/d174b916352ffe701058f3bdae433c3f24eb37c2/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlocksWithNotEnoughRacks.java#L88]
{quote}Test: trivial, let's be consistent about naming in the comment.
'datanodes' should be fine, don't need 'data nodes'
{quote}
Done.
I will upload a new patch shortly.
> Enabling HVE/node awareness does not rebalance replicas on data that existed
> prior to topology changes.
> --------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-11391
> URL: https://issues.apache.org/jira/browse/HADOOP-11391
> Project: Hadoop Common
> Issue Type: Bug
> Environment: VMWare w/ local storage
> Reporter: ellen johansen
> Assignee: Hrishikesh Gadre
> Priority: Major
> Attachments: HADOOP-11391-001.patch, HADOOP-11391-002.patch
>
>
> Enabling HVE/node awareness does not rebalance replicas on data that existed
> prior to topology changes.
> [root@vmw-d10-001 jenkins]# more /opt/cloudera/topology.data
> 10.20.xxx.161 /rack1/nodegroup1
> 10.20.xxx.162 /rack1/nodegroup1
> 10.20.xxx.163 /rack3/nodegroup1
> 10.20.xxx.164 /rack3/nodegroup1
> 172.17.xxx.71 /rack2/nodegroup1
> 172.17.xxx.72 /rack2/nodegroup1
> before HVE:
> /user/impalauser/tpcds/store_sales <dir>
> /user/impalauser/tpcds/store_sales/store_sales.dat 1180463121 bytes, 9
> block(s): OK
> 0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742xxx_1382
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 10.20.xxx.163:20002]
> 1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742213_1389
> len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002,
> 10.20.xxx.161:20002]
> 2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742214_1390
> len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002,
> 10.20.xxx.163:20002]
> 3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742215_1391
> len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002,
> 10.20.xxx.163:20002]
> 4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742216_1392
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742217_1393
> len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002,
> 10.20.xxx.163:20002]
> 6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742220_1396
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002,
> 10.20.xxx.163:20002]
> 7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742222_1398
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002,
> 10.20.xxx.161:20002]
> 8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742224_1400
> len=106721297 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002,
> 172.17.xxx.72:20002]
> ---------
> Before enabling HVE:
> Status: HEALTHY
> Total size: 1648156454 B (Total open files size: 498 B)
> Total dirs: 138
> Total files: 384
> Total symlinks: 0 (Files currently being written: 6)
> Total blocks (validated): 390 (avg. block size 4226042 B) (Total open
> file blocks (not validated): 6)
> Minimally replicated blocks: 390 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 1 (0.25641027 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 3
> Average block replication: 2.8564103
> Corrupt blocks: 0
> Missing replicas: 5 (0.44682753 %)
> Number of data-nodes: 5
> Number of racks: 1
> FSCK ended at Wed Dec 10 14:04:35 EST 2014 in 50 milliseconds
> The filesystem under path '/' is HEALTHY
> ------
> after HVE (and NN restart):
> /user/impalauser/tpcds/store_sales <dir>
> /user/impalauser/tpcds/store_sales/store_sales.dat 1180463121 bytes, 9
> block(s): OK
> 0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742xxx_1382
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002,
> 10.20.xxx.161:20002]
> 1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742213_1389
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.161:20002]
> 2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742214_1390
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.163:20002]
> 3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742215_1391
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.163:20002]
> 4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742216_1392
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.161:20002]
> 5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742217_1393
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.163:20002]
> 6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742220_1396
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002,
> 10.20.xxx.162:20002]
> 7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742222_1398
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002,
> 10.20.xxx.161:20002]
> 8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742224_1400
> len=106721297 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.162:20002]
> Status: HEALTHY
> Total size: 1659427036 B (Total open files size: 498 B)
> Total dirs: 176
> Total files: 529
> Total symlinks: 0 (Files currently being written: 6)
> Total blocks (validated): 532 (avg. block size 3119223 B) (Total open
> file blocks (not validated): 6)
> Minimally replicated blocks: 532 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 1 (0.18796992 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 3
> Average block replication: 2.8383458
> Corrupt blocks: 0
> Missing replicas: 7 (0.46143705 %)
> Number of data-nodes: 5
> Number of racks: 3
> FSCK ended at Wed Dec 10 14:29:23 EST 2014 in 115 milliseconds
> The filesystem under path '/' is HEALTHY
> ------------
> store sales pushed to hdfs after HVE was configured:
> /user/impalauser/tpcds_after_hve <dir>
> /user/impalauser/tpcds_after_hve/store_sales.dat 1180463121 bytes, 9
> block(s): OK
> 0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743406_2582
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743412_2588
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002,
> 172.17.xxx.72:20002]
> 2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743415_2591
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743416_2592
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002,
> 172.17.xxx.72:20002]
> 4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743417_2593
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743418_2594
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743419_2595
> len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
> 7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743422_2598
> len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002,
> 10.20.xxx.162:20002]
> 8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743423_2599
> len=106721297 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002,
> 172.17.xxx.72:20002]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]