Walter Su created HDFS-8501:
-------------------------------

             Summary: Erasure Coding: Improve memory efficiency of BlockInfoStriped
                 Key: HDFS-8501
                 URL: https://issues.apache.org/jira/browse/HDFS-8501
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Walter Su
            Assignee: Walter Su

Assume we have a BlockInfoStriped:
{noformat}
triplets[] = {s0, s1, s2, s3}
indices[] = {0, 1, 2, 3}
{noformat}

When we run the balancer/mover to relocate the replica on s2, it first becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, s2}
indices[] = {0, 1, 2, 3, 2}
{noformat}

Then the old replica on s2 is removed, and it finally becomes:
{noformat}
triplets[] = {s0, s1, null, s3, s2}
indices[] = {0, 1, -1, 3, 2}
{noformat}

The worst case is:
{noformat}
triplets[] = {null, null, null, null, s0, s1, s2, s3}
indices[] = {-1, -1, -1, -1, 0, 1, 2, 3}
{noformat}

We should learn from {{BlockInfoContiguous.removeStorage(..)}}. When a storage is removed, we move the last item forward into the vacated slot. With the improvement, the worst case becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, null}
indices[] = {0, 1, 2, 3, -1}
{noformat}
That leaves a single empty slot.

Note: assume we first copy 4 storages and then delete 4. Even with the improvement, the worst case could be:
{noformat}
triplets[] = {s0, s1, s2, s3, null, null, null, null}
indices[] = {0, 1, 2, 3, -1, -1, -1, -1}
{noformat}
But the Balancer strategy won't move the same block/blockGroup twice in a row, so this case is very rare.
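
A rough sketch of the idea, to make the proposal concrete. This is not the actual HDFS code: {{StripedSlotsSketch}}, {{addStorage}}, {{removeSlot}} and the {{s2_new}} label are hypothetical names, and storages are plain strings in a flat array (the real triplets[] keeps three references per storage). It only shows how moving the last occupied entry into the vacated slot keeps the empty slots at the tail.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-in for the storage bookkeeping discussed above. */
class StripedSlotsSketch {
    // Assumption: one entry per storage, instead of the real three-per-storage triplets.
    String[] storages;
    byte[] indices;

    StripedSlotsSketch(int capacity) {
        storages = new String[capacity];
        indices = new byte[capacity];
        Arrays.fill(indices, (byte) -1); // -1 marks an empty slot
    }

    /** Puts the storage in the first empty slot, growing both arrays by one if none is free. */
    void addStorage(String storage, int blockIndex) {
        int slot = 0;
        while (slot < storages.length && storages[slot] != null) {
            slot++;
        }
        if (slot == storages.length) {
            storages = Arrays.copyOf(storages, slot + 1);
            indices = Arrays.copyOf(indices, slot + 1);
        }
        storages[slot] = storage;
        indices[slot] = (byte) blockIndex;
    }

    /** Clears the given slot by moving the last occupied entry into it. */
    void removeSlot(int slot) {
        if (slot < 0 || slot >= storages.length || storages[slot] == null) {
            return; // nothing stored in this slot
        }
        // Find the last occupied slot; it exists because storages[slot] is non-null.
        int last = storages.length - 1;
        while (storages[last] == null) {
            last--;
        }
        storages[slot] = storages[last];
        indices[slot] = indices[last];
        storages[last] = null;
        indices[last] = -1;
    }

    public static void main(String[] args) {
        StripedSlotsSketch block = new StripedSlotsSketch(4);
        for (int i = 0; i < 4; i++) {
            block.addStorage("s" + i, i);
        }
        block.addStorage("s2_new", 2); // balancer copies the replica with block index 2
        block.removeSlot(2);           // the old replica is removed
        System.out.println(Arrays.toString(block.storages));
        System.out.println(Arrays.toString(block.indices));
    }
}
```

With this sketch, the relocation example above ends as {s0, s1, s2_new, s3, null} with indices {0, 1, 2, 3, -1}, so the hole is at the tail rather than in the middle.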