[
https://issues.apache.org/jira/browse/HDFS-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Shvachko updated HDFS-11634:
---------------------------------------
Attachment: HDFS-11634.003.patch
* Fixed the typos. Thanks.
* Yes, this is the corner case.
Take the previous patch and suppose I have 3 storages holding \{3, 3, 1\} blocks
respectively, and I want to set the iterator to startBlock=2. Then
s=2 <= numBlocks holds for the first two storages but not for the third, so
{{index}} will increment. That is incorrect, because startBlock=2 is on
storage #0 rather than #1. My solution is to base the if condition directly on
startBlock, which requires accumulating the block counts of the storages, and
that is what sumBlocks does.
Hope this makes sense.
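
The corner case above can be checked with a small standalone sketch. This is
not the code from the attached patch: the class name, the {{locate}} helper,
and the {{blocksPerStorage}} array are hypothetical, and it assumes
{{startBlock}} is a 0-based index over the blocks of all storages taken in
order. It only illustrates comparing a running {{sumBlocks}} against
{{startBlock}} to decide whether a whole storage can be skipped.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the HDFS-11634 patch itself.
class StorageSkipSketch {

  /**
   * Returns {storageIndex, blocksToSkip}: the first storage that must be
   * iterated and how many blocks still have to be skipped inside it.
   * If startBlock is past the last block, storageIndex equals the number
   * of storages and blocksToSkip is what remains after skipping them all.
   */
  static int[] locate(int startBlock, int[] blocksPerStorage) {
    if (startBlock < 0) {
      throw new IllegalArgumentException("Illegal value startBlock = " + startBlock);
    }
    int remaining = startBlock; // blocks still to skip one by one
    int sumBlocks = 0;          // blocks accumulated over storages seen so far
    int index = 0;              // first storage that is not skipped entirely
    for (int numBlocks : blocksPerStorage) {
      sumBlocks += numBlocks;
      if (sumBlocks <= startBlock) {
        // startBlock lies beyond this storage, so skip it as a whole.
        remaining -= numBlocks;
        index++;
      } else {
        // startBlock falls inside this storage; stop skipping storages.
        break;
      }
    }
    return new int[] {index, remaining};
  }

  public static void main(String[] args) {
    int[] storages = {3, 3, 1};
    // startBlock=2 is the last block of storage #0.
    System.out.println(Arrays.toString(locate(2, storages))); // [0, 2]
    // startBlock=3 is the first block of storage #1.
    System.out.println(Arrays.toString(locate(3, storages))); // [1, 0]
  }
}
```

With \{3, 3, 1\} and startBlock=2 the sketch stays on storage #0 and leaves two
blocks to skip individually, which matches the expectation described above.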
> Optimize BlockIterator when iterating starts in the middle.
> -----------------------------------------------------------
>
> Key: HDFS-11634
> URL: https://issues.apache.org/jira/browse/HDFS-11634
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.6.5
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Attachments: HDFS-11634.001.patch, HDFS-11634.002.patch,
> HDFS-11634.003.patch
>
>
> {{BlockManager.getBlocksWithLocations()}} needs to iterate blocks from a
> randomly selected {{startBlock}} index. It creates an iterator that points
> to the first block and then skips all blocks until {{startBlock}}. This is
> inefficient when a DN has multiple storages. Instead of skipping blocks one by
> one, we can skip entire storages, which should be more efficient on average.