[ https://issues.apache.org/jira/browse/HDFS-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976993#comment-14976993 ]
Lei (Eddy) Xu commented on HDFS-9267:
-------------------------------------
[~cmccabe] thanks a lot for the suggestions.
bq. Do you think it would be better to have an Iterator<Replica> here?
{{getStoredReplicas()}} has to scan and load all on-disk replicas into a
local {{replicaMap}} in order to verify the contents on disk. Returning an
iterator over this local {{replicaMap}} has the same space complexity as
returning a Collection, because the iterator still holds a reference to the
{{replicaMap}}. Also, expressing the {{isEmpty()}} check with an iterator
(i.e., using {{it.hasNext()}}) is less readable in the following code:
{code}
while (!utils.getStoredReplicas(bpid).isEmpty()) {
  Thread.sleep(100);
}
{code}
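To illustrate the space argument, here is a minimal, self-contained sketch. It is not the HDFS code: {{StoredReplicasSketch}}, {{FakeReplica}}, {{scan()}} and {{getStoredReplicasIterator()}} are hypothetical names, and the "disk scan" is simulated by filling a list. It only shows that, when everything is loaded up front, the iterator variant keeps the whole backing collection alive, so it saves nothing over returning the Collection itself.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

public class StoredReplicasSketch {
  /** Hypothetical stand-in for a replica record; not the HDFS Replica type. */
  public static final class FakeReplica {
    final long blockId;

    FakeReplica(long blockId) {
      this.blockId = blockId;
    }
  }

  /** Simulates the on-disk scan: every replica is loaded into a local list. */
  static List<FakeReplica> scan(int count) {
    List<FakeReplica> loaded = new ArrayList<>();
    for (long id = 0; id < count; id++) {
      loaded.add(new FakeReplica(id));
    }
    return loaded;
  }

  /** Collection variant: the caller receives the fully loaded list. */
  public static Collection<FakeReplica> getStoredReplicas(int count) {
    return scan(count);
  }

  /**
   * Iterator variant: the same list is built first, and the returned
   * iterator keeps a reference to it, so peak memory is unchanged.
   */
  public static Iterator<FakeReplica> getStoredReplicasIterator(int count) {
    return scan(count).iterator();
  }

  public static void main(String[] args) {
    // The two emptiness checks are equivalent; the first reads more directly.
    System.out.println(getStoredReplicas(0).isEmpty());
    System.out.println(!getStoredReplicasIterator(0).hasNext());
  }
}
```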
bq. That collection of all the replicas in the dataset could get pretty big in theory.
{{FsDatasetTestUtils}} is only used by {{HDFS}} unit tests, which should not
have millions of blocks in a single test. Will {{HBase}} or other projects use
this function? If space is a concern, we could write a replica scanner in the
future.
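As a rough idea of what such a scanner could look like, here is a sketch under stated assumptions. It is not part of the patch: {{ReplicaScannerSketch}}, {{scan()}}, {{hasAnyReplica()}} and {{tempDirWithBlocks()}} are hypothetical names, the directory layout is simplified to a single flat directory, and any entry matching {{blk_*}} is treated as a replica file. The scanner visits entries one at a time and can stop early, so it never holds all replicas in memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

public class ReplicaScannerSketch {
  /**
   * Visits entries named blk_* in dir one at a time. Scanning stops as soon
   * as the visitor returns false. Returns the number of entries visited.
   */
  public static int scan(Path dir, Predicate<Path> visitor) {
    int visited = 0;
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "blk_*")) {
      for (Path entry : stream) {
        visited++;
        if (!visitor.test(entry)) {
          break; // early exit: the caller has seen enough
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return visited;
  }

  /** Emptiness check that stops at the first entry instead of loading all. */
  public static boolean hasAnyReplica(Path dir) {
    return scan(dir, entry -> false) > 0;
  }

  /** Demo helper: creates a temp directory holding n fake block files. */
  public static Path tempDirWithBlocks(int n) {
    try {
      Path dir = Files.createTempDirectory("replica-scan");
      for (int i = 0; i < n; i++) {
        Files.createFile(dir.resolve("blk_" + i));
      }
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dir = tempDirWithBlocks(3);
    System.out.println(hasAnyReplica(dir));
    System.out.println(scan(dir, entry -> true));
  }
}
```

With something like this, the wait loop above could poll {{hasAnyReplica()}} instead of materializing a Collection on every iteration.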
What do you think?
> TestDiskError should get stored replicas through FsDatasetTestUtils.
> --------------------------------------------------------------------
>
> Key: HDFS-9267
> URL: https://issues.apache.org/jira/browse/HDFS-9267
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: test
> Affects Versions: 2.7.1
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Priority: Minor
> Attachments: HDFS-9267.00.patch, HDFS-9267.01.patch,
> HDFS-9267.02.patch
>
>
> {{TestDiskError#testReplicationError}} scans local directories to verify
> blocks and metadata files, which leaks the details of {{FsDataset}}
> implementation.
> This JIRA will abstract the "scanning" operation to {{FsDatasetTestUtils}}.