[ 
https://issues.apache.org/jira/browse/HDFS-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John George updated HDFS-2034:
------------------------------

    Attachment: HDFS-2034-1.patch


> Daryn Sharp commented on HDFS-2034:
> -----------------------------------
> 
> * I'd suggest changing {{"lastblkcomplete but read past that"}} to something 
> more grammatically correct (I'm not the police!) like {{"can't read past last 
> completed block"}}.

Good point.

> 
> * There's an assertion for reading past the last *completed* block, but not 
> for reading past the last *located* block?

The reason I added the assert for *completed* blocks is that I was changing a 
previous assumption in that code, so I wanted to make sure I was not breaking 
the logic behind it. I should probably remove this assert.

The reason I am hesitant about adding an assert for reading past the last block 
is that it would have to go out of its way to calculate the total length of all 
the blocks (which the caller of this function already does), and that does not 
seem to be this function's job.

> 
> * I found the names of the booleans to not be clear at face value.  Ie. I 
> couldn't tell what they meant if I didn't see the assignment.

Reading the booleans again today, I have to agree with you on two of the three 
boolean names.

Changed readOnlyIncompleteBlk to readOffsetWithinCompleteBlk (it now means the 
opposite of what it meant before, but it seems to make the code logic clearer).

Changed readPastCompletedBlk to readLengthPastCompletedBlock.
I would like to keep using lengthOfCompleteBlk instead of 
locatedBlocks.getFileLength() because (as Todd suggested in HDFS-1907) it 
makes sense to name it after completeBlocks rather than FileLength.

> 
> For your consideration, maybe something like this with a sprinkling of 
> comments:
> {code}
> if (locatedBlocks.isLastBlockComplete())
>   assert(!readOffsetPastCompletedBlock) : "can't read past last completed block";
> 
> if (!readOffsetPastCompletedBlock) {
>   if (readLengthPastCompletedBlock) {
>     length = locatedBlocks.getFileLength() - offset;
>   }
>   blocks = getFinalizedBlockRange(offset, length);
> }
> if (readOffsetPastCompletedBlock) {
>   Block lastBlock = locatedBlocks.getLastLocatedBlock();
>   long lastBlockEndOffset = lastBlock.getStartOffset() + lastBlock.getBlockSize();
>   assert(offset < lastBlockEndOffset) : "can't read past last located block";
>   blocks.add(lastBlock);
> }
> {code}
> 

Thanks for your suggestions.

> length in getBlockRange becomes -ve when reading only from currently being 
> written blk
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-2034
>                 URL: https://issues.apache.org/jira/browse/HDFS-2034
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: John George
>            Assignee: John George
>            Priority: Minor
>         Attachments: HDFS-2034-1.patch, HDFS-2034-1.patch, HDFS-2034.patch
>
>
> This came up during HDFS-1907. Posting an example that Todd posted in 
> HDFS-1907 that brought out this issue.
> {quote}
> Here's an example sequence to describe what I mean:
> 1. open file, write one and a half blocks
> 2. call hflush
> 3. another reader asks for the first byte of the second block
> {quote}
> In this case, since the offset is greater than the completed block length, 
> the math in getBlockRange() of DFSInputStream.java sets "length" to a 
> negative value.
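
To illustrate the arithmetic described above, here is a minimal, self-contained 
sketch. The method and constant names are illustrative only (this is not the 
actual DFSInputStream code): it assumes the old logic clamped the requested 
read length against the completed-blocks length without first checking that the 
offset lay within the completed range.

```java
// Hypothetical sketch of the length math in getBlockRange(); names are
// illustrative, not taken from the real DFSInputStream implementation.
public class GetBlockRangeSketch {
    static long clampedLength(long completedLength, long offset, long requested) {
        // Old logic (assumed): clamp the request against the completed-blocks
        // length, with no prior check that offset < completedLength.
        return Math.min(completedLength - offset, requested);
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // 64 MB block size (example value)
        long completedLength = blockSize;     // one finalized block visible
        long offset = blockSize + 1;          // reader asks for a byte inside
                                              // the block under construction
        // completedLength - offset = -1, so the clamped length goes negative.
        System.out.println(clampedLength(completedLength, offset, 1));
    }
}
```

With one 64 MB completed block and a read offset one byte into the 
under-construction second block, the subtraction yields -1, matching the 
negative "length" the issue describes.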

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
