[
https://issues.apache.org/jira/browse/HDDS-1496?focusedWorklogId=255368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-255368
]
ASF GitHub Bot logged work on HDDS-1496:
----------------------------------------
Author: ASF GitHub Bot
Created on: 06/Jun/19 20:01
Start Date: 06/Jun/19 20:01
Worklog Time Spent: 10m
Work Description: hanishakoneru commented on pull request #804:
HDDS-1496. Support partial chunk reads and checksum verification
URL: https://github.com/apache/hadoop/pull/804#discussion_r291348102
##########
File path:
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
##########
@@ -53,58 +46,77 @@
private static final int EOF = -1;
- private final ArrayList<ChunkInputStreamEntry> streamEntries;
- // streamOffset[i] stores the offset at which blockInputStream i stores
- // data in the key
- private long[] streamOffset = null;
- private int currentStreamIndex;
+ private String key;
private long length = 0;
private boolean closed = false;
- private String key;
- public KeyInputStream() {
- streamEntries = new ArrayList<>();
- currentStreamIndex = 0;
- }
+ // List of BlockInputStreams, one for each block in the key
+ private final List<BlockInputStream> blockStreams;
- @VisibleForTesting
- public synchronized int getCurrentStreamIndex() {
- return currentStreamIndex;
- }
+ // blockOffsets[i] stores the index of the first data byte in
+ // blockStream i w.r.t the key data.
+ // For example, let’s say the block size is 200 bytes and block[0] stores
+ // data from indices 0 - 199, block[1] from indices 200 - 399 and so on.
+ // Then, blockOffset[0] = 0 (the offset of the first byte of data in
+ // block[0]), blockOffset[1] = 200 and so on.
+ private long[] blockOffsets = null;
- @VisibleForTesting
- public long getRemainingOfIndex(int index) throws IOException {
- return streamEntries.get(index).getRemaining();
+ // Index of the blockStream corresponding to the current position of the
+ // KeyInputStream i.e. offset of the data to be read next
+ private int blockIndex;
+
+ // Tracks the blockIndex corresponding to the last seeked position so that it
+ // can be reset if a new position is seeked.
+ private int blockIndexOfPrevPosition;
+
+ public KeyInputStream() {
+ blockStreams = new ArrayList<>();
+ blockIndex = 0;
}
/**
- * Append another stream to the end of the list.
- *
- * @param stream the stream instance.
- * @param streamLength the max number of bytes that should be written to this
- * stream.
+ * For each block in keyInfo, add a BlockInputStream to blockStreams.
*/
- @VisibleForTesting
- public synchronized void addStream(BlockInputStream stream,
- long streamLength) {
- streamEntries.add(new ChunkInputStreamEntry(stream, streamLength));
+ public static LengthInputStream getFromOmKeyInfo(OmKeyInfo keyInfo,
+ XceiverClientManager xceiverClientManager,
+ StorageContainerLocationProtocol storageContainerLocationClient,
Review comment:
Removed.
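The blockOffsets bookkeeping described in the diff above can be illustrated with a small standalone sketch. This is a hypothetical helper, not the actual KeyInputStream code: it shows how, given an array where blockOffsets[i] is the key-relative offset of the first byte of block i, the block index for an arbitrary read position can be found with a binary search.

```java
import java.util.Arrays;

// Hypothetical sketch (not the actual KeyInputStream implementation):
// locate the block that contains a given position within the key,
// using the blockOffsets layout described in the diff.
public class BlockOffsetLookup {

    // blockOffsets[i] = offset of the first data byte of block i
    // within the key. The array is sorted and starts at 0.
    static int findBlockIndex(long[] blockOffsets, long targetPos) {
        // binarySearch returns the index when targetPos is exactly a
        // block boundary, or (-(insertionPoint) - 1) when it falls
        // inside a block; in the latter case the containing block is
        // the one before the insertion point.
        int idx = Arrays.binarySearch(blockOffsets, targetPos);
        return idx >= 0 ? idx : -idx - 2;
    }

    public static void main(String[] args) {
        // The example from the diff: 200-byte blocks.
        long[] blockOffsets = {0, 200, 400};
        System.out.println(findBlockIndex(blockOffsets, 0));    // block 0
        System.out.println(findBlockIndex(blockOffsets, 199));  // block 0
        System.out.println(findBlockIndex(blockOffsets, 200));  // block 1
        System.out.println(findBlockIndex(blockOffsets, 450));  // block 2
    }
}
```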
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 255368)
Time Spent: 8.5h (was: 8h 20m)
> Support partial chunk reads and checksum verification
> -----------------------------------------------------
>
> Key: HDDS-1496
> URL: https://issues.apache.org/jira/browse/HDDS-1496
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8.5h
> Remaining Estimate: 0h
>
> BlockInputStream#readChunkFromContainer() reads the whole chunk from disk
> even if only a part of the chunk is needed.
> This Jira aims to improve readChunkFromContainer so that only the part of
> the chunk file needed by the client is read, plus the part of the chunk
> file required to verify the checksum.
> For example, let's say the client is reading from index 120 to 450 in the
> chunk, and a checksum is stored for every 100 bytes in the chunk, i.e.
> the first checksum covers bytes 0 to 99, the next covers bytes 100 to 199,
> and so on. To verify bytes 120 to 450, we would need to read bytes 100 to
> 499 so that checksum verification can be done.
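The boundary arithmetic in the example above can be sketched as a small standalone helper. This is a hypothetical illustration, not code from the patch: it widens a requested byte range to the enclosing checksum-window boundaries, assuming checksums are computed per fixed-size window (bytesPerChecksum).

```java
// Hypothetical sketch of the read-range alignment described above:
// to verify per-window checksums, the read must be widened to the
// enclosing checksum boundaries on both sides.
public class ChecksumAlignedRange {

    // Returns {alignedStart, alignedEndExclusive} for a requested
    // inclusive byte range within a chunk.
    static long[] alignToChecksumBoundary(long readStart,
            long readEndInclusive, long bytesPerChecksum) {
        // Round the start down to the previous checksum boundary.
        long alignedStart = (readStart / bytesPerChecksum) * bytesPerChecksum;
        // Round the end up to the next checksum boundary (exclusive).
        long alignedEnd =
            ((readEndInclusive / bytesPerChecksum) + 1) * bytesPerChecksum;
        return new long[]{alignedStart, alignedEnd};
    }

    public static void main(String[] args) {
        // The example from the Jira description: reading bytes 120..450
        // with a checksum per 100 bytes requires reading bytes 100..499.
        long[] range = alignToChecksumBoundary(120, 450, 100);
        System.out.println(range[0] + " to " + (range[1] - 1)); // 100 to 499
    }
}
```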
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]