[ https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508580#comment-14508580 ]
Walter Su commented on HDFS-8033:
---------------------------------

> ... ByteBufferStrategy.doRead() ignores the len argument. It always reads
> byteBuffer.remaining, until EOF of the current block.

Correcting myself: {{ByteBufferStrategy.doRead()}} ignores the {{len}} argument. It always tries to read byteBuffer.remaining bytes, stopping at the end of the current *packet*, not the current block.

I read {{BlockSender.doSendBlock()}} and found that the packet size depends on "io.file.buffer.size" and BlockSender.MIN_BUFFER_WITH_TRANSFERTO. If we read the block locally, the data part of each packet is "io.file.buffer.size" bytes (default 4096).

HdfsConstants.BLOCK_STRIPED_CELL_SIZE = 256 * 1024;

The good news is that cellSize % packetSize == 0 with the defaults: 256 * 1024 / 4096 == 64, so we call {{ByteBufferStrategy.doRead()}} 64 times and read exactly one cell.

What if cellSize % packetSize != 0? Then the read will be wrong, because the last packet read crosses the cell boundary. Try setting "io.file.buffer.size" to 4099: the test case will fail, as it will for any other value where cellSize % packetSize != 0.

Your implementation for ByteBuffer works for now, but we have to make sure that:
cellSize % ("io.file.buffer.size") == 0 (for local read)
cellSize % (BlockSender.MIN_BUFFER_WITH_TRANSFERTO) == 0 (for remote read)

*If we choose another value for cellSize, we should be careful. Otherwise read(ByteBuffer) won't work.*

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch,
> HDFS-8033.002.patch, HDFS-8033.003.patch, hdfs8033-HDFS-7285.04.patch
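
To illustrate the divisibility concern raised in the comment, here is a minimal sketch. It is a simplified model, not the actual Hadoop code: the class {{CellPacketAlignment}} and its methods are hypothetical, and it assumes every read call consumes one whole packet regardless of the requested length. It only shows how a packet size that does not divide the cell size makes the final read overshoot the cell boundary.

```java
// Simplified model, NOT the real ByteBufferStrategy/BlockSender code.
// Assumption: each simulated read consumes one full packet and ignores
// the requested length, as described in the comment.
public class CellPacketAlignment {

    /** Bytes consumed by whole-packet reads until at least cellSize bytes are read. */
    public static long bytesConsumedForCell(int cellSize, int packetSize) {
        if (cellSize <= 0 || packetSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long consumed = 0;
        while (consumed < cellSize) {
            consumed += packetSize; // a read always runs to the end of the packet
        }
        return consumed;
    }

    /** Bytes read past the cell boundary; 0 means the read stops exactly at the cell end. */
    public static long overshoot(int cellSize, int packetSize) {
        return bytesConsumedForCell(cellSize, packetSize) - cellSize;
    }

    /** The condition the comment asks for: cellSize % packetSize == 0. */
    public static boolean isAligned(int cellSize, int packetSize) {
        return cellSize % packetSize == 0;
    }

    public static void main(String[] args) {
        int cellSize = 256 * 1024; // HdfsConstants.BLOCK_STRIPED_CELL_SIZE per the comment
        // 4096 is the default io.file.buffer.size; 4099 is the failing value from the comment.
        for (int packetSize : new int[] {4096, 4099}) {
            System.out.printf("packetSize=%d aligned=%b overshoot=%d%n",
                    packetSize,
                    isAligned(cellSize, packetSize),
                    overshoot(cellSize, packetSize));
        }
    }
}
```

A check like {{isAligned}} could, under the same assumptions, be run against the configured buffer size before relying on read(ByteBuffer) for striped files.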