[ 
https://issues.apache.org/jira/browse/HDDS-1496?focusedWorklogId=239831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239831
 ]

ASF GitHub Bot logged work on HDDS-1496:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/May/19 17:43
            Start Date: 09/May/19 17:43
    Worklog Time Spent: 10m 
      Work Description: hanishakoneru commented on issue #804: HDDS-1496. 
Support partial chunk reads and checksum verification
URL: https://github.com/apache/hadoop/pull/804#issuecomment-490999735
 
 
   This patch requires more changes (after HDDS-1491, which fixes the seek 
operation).
   1. In BlockInputStream#seek(), it is not sufficient to check only that the 
required chunkIndex matches the current buffer's chunk index and that the 
buffer has data remaining. Because the buffer might hold only part of a chunk, 
it may not cover the position being sought. 
   2. In BlockInputStream#readChunkFromContainer(), we should not blindly 
increment the chunkIndex. The last read might have covered only part of the 
chunk, in which case the next read must come from the same chunk again.
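
   The two points above can be sketched roughly as follows. This is only an 
illustration, not the actual BlockInputStream code: the class 
PartialChunkCursor, its fields, and both method names are hypothetical, and 
it assumes the buffer holds one contiguous byte range of a single chunk.

```java
// Hypothetical sketch; names and fields are assumptions, not Ozone code.
public final class PartialChunkCursor {
    private final int chunkIndex;            // chunk the buffer was read from
    private final long bufferOffsetInChunk;  // chunk offset of first buffered byte
    private final long bufferLength;         // number of buffered bytes

    public PartialChunkCursor(int chunkIndex, long bufferOffsetInChunk,
                              long bufferLength) {
        this.chunkIndex = chunkIndex;
        this.bufferOffsetInChunk = bufferOffsetInChunk;
        this.bufferLength = bufferLength;
    }

    // Point 1: a matching chunk index is not enough. The sought position
    // must also fall inside the byte range the buffer actually holds.
    public boolean bufferCovers(int targetChunkIndex, long posInChunk) {
        return targetChunkIndex == chunkIndex
            && posInChunk >= bufferOffsetInChunk
            && posInChunk < bufferOffsetInChunk + bufferLength;
    }

    // Point 2: advance to the next chunk only once the last read reached
    // the end of the current chunk; otherwise stay on the same chunk.
    public static int nextChunkIndex(int currentChunkIndex,
                                     long lastReadEndExclusive,
                                     long chunkLength) {
        return lastReadEndExclusive >= chunkLength
            ? currentChunkIndex + 1
            : currentChunkIndex;
    }
}
```

   With a buffer holding bytes 100 to 499 of chunk 2, a seek to offset 120 of 
chunk 2 can be served from the buffer, while a seek to offset 50 of the same 
chunk cannot, even though the chunk index matches.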
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 239831)
    Time Spent: 0.5h  (was: 20m)

> Support partial chunk reads and checksum verification
> -----------------------------------------------------
>
>                 Key: HDDS-1496
>                 URL: https://issues.apache.org/jira/browse/HDDS-1496
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> BlockInputStream#readChunkFromContainer() reads the whole chunk from disk 
> even if we need only a part of the chunk.
> This Jira aims to improve readChunkFromContainer() so that it reads only the 
> part of the chunk file needed by the client, plus the part of the chunk file 
> required to verify the checksum.
> For example, let's say the client is reading from index 120 to 450 in the 
> chunk, and a checksum is stored for every 100 bytes in the chunk, i.e. 
> the first checksum is for bytes from index 0 to 99, the next for bytes from 
> index 100 to 199, and so on. To verify bytes 120 to 450, we would need to 
> read bytes 100 to 499 so that checksum verification can be done.
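
The arithmetic in the example above amounts to rounding the requested range 
outward to checksum boundaries. A minimal sketch, assuming inclusive byte 
indices and a fixed number of bytes per checksum; the class and method names 
(ChecksumAlignedRange, alignedStart, alignedEnd) and the chunk length of 1000 
are hypothetical and are not taken from the patch:

```java
// Hypothetical helper; illustrates the boundary arithmetic only.
public final class ChecksumAlignedRange {
    private ChecksumAlignedRange() {}

    // Round the first requested byte down to the start of its checksum window.
    public static long alignedStart(long start, int bytesPerChecksum) {
        return (start / bytesPerChecksum) * bytesPerChecksum;
    }

    // Round the last requested byte (inclusive) up to the last byte of its
    // checksum window, capped at the last byte of the chunk.
    public static long alignedEnd(long endInclusive, int bytesPerChecksum,
                                  long chunkLength) {
        long windowEnd =
            (endInclusive / bytesPerChecksum + 1) * bytesPerChecksum - 1;
        return Math.min(windowEnd, chunkLength - 1);
    }

    public static void main(String[] args) {
        long start = alignedStart(120, 100);
        long end = alignedEnd(450, 100, 1000); // chunk length is assumed
        System.out.println(start + " to " + end); // prints "100 to 499"
    }
}
```

The cap in alignedEnd covers the case where the final checksum window of a 
chunk is shorter than bytesPerChecksum.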


