[ https://issues.apache.org/jira/browse/HDFS-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated HDFS-11634:
---------------------------------------
    Attachment: HDFS-11634.003.patch

* Fixed the typos. Thanks.
* Yes, this is the corner case. In the previous patch, suppose I have 3 storages holding {3, 3, 1} blocks respectively, and I want to set the iterator to startBlock=2. Then s=2 <= numBlocks for the first two storages, but not the third, and {{index}} will increment. That is incorrect, because startBlock=2 is on storage #0 rather than #1. My solution is to base the if condition directly on startBlock, and to accumulate the block counts across storages, which is what sumBlocks does. Hope this makes sense.

> Optimize BlockIterator when iterating starts in the middle.
> ------------------------------------------------------------
>
>                 Key: HDFS-11634
>                 URL: https://issues.apache.org/jira/browse/HDFS-11634
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.6.5
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: HDFS-11634.001.patch, HDFS-11634.002.patch,
> HDFS-11634.003.patch
>
>
> {{BlockManager.getBlocksWithLocations()}} needs to iterate blocks from a
> randomly selected {{startBlock}} index. It creates an iterator which points
> to the first block and then skips all blocks until {{startBlock}}. This is
> inefficient when a DN has multiple storages. Instead of skipping blocks one by
> one we can skip entire storages, which should be more efficient on average.
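
For readers following the corner case in the comment, here is a minimal Java sketch of the cumulative-sum idea. It is not the code from the attached patch: the class {{StorageSkipSketch}}, the method {{locate}}, and the use of a plain {{int[]}} of per-storage block counts are assumptions made for illustration only. The names {{startBlock}}, {{numBlocks}}, {{sumBlocks}} and {{index}} are borrowed from the comment.

```java
// Hypothetical sketch; class and method names are illustrative, not from the patch.
final class StorageSkipSketch {

  /**
   * Maps a global startBlock position to {storageIndex, offsetWithinStorage}.
   * A storage is skipped as a whole only when the running total of blocks
   * (sumBlocks) is still less than or equal to startBlock.
   */
  static int[] locate(int[] blocksPerStorage, int startBlock) {
    if (startBlock < 0) {
      throw new IllegalArgumentException("startBlock must be non-negative");
    }
    int remaining = startBlock; // blocks still to skip inside the chosen storage
    int sumBlocks = 0;          // blocks seen in all storages visited so far
    int index = 0;              // storage that will contain startBlock
    for (int numBlocks : blocksPerStorage) {
      sumBlocks += numBlocks;
      if (sumBlocks > startBlock) {
        break;                  // startBlock falls inside this storage
      }
      remaining -= numBlocks;   // skip the entire storage at once
      index++;
    }
    // If startBlock is past the last block, index equals blocksPerStorage.length.
    return new int[] {index, remaining};
  }
}
```

With block counts {3, 3, 1} and startBlock=2, {{locate}} returns storage 0 with offset 2, matching the comment's expectation that the block is on storage #0. With startBlock=3 it returns storage 1 with offset 0, so the first storage is skipped in one step instead of block by block.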