[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-11 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.014.patch

Thanks Jing for reviewing and Kai for the comments on codec!

Attaching 014 patch to address the comments.

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678-HDFS-7285.012.patch, HDFS-7678-HDFS-7285.013.patch, 
 HDFS-7678-HDFS-7285.014.patch, HDFS-7678.000.patch, HDFS-7678.001.patch


 A block group reader reads data from a BlockGroup regardless of whether it 
 uses the striping layout or the contiguous layout. Corrupt blocks may be known 
 before reading (reported by the NameNode) or discovered during reading. The 
 block group reader needs to decode when some blocks are found to be corrupt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-08 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.012.patch

The 012 patch has a more complete implementation of:
# {{Stripe}} (inherited {{StripeRange}})
# {{StripingChunk}}
# {{StripingCell}}

It passes the normal pread test but still fails {{testPreadWithDNFailure}} 
because I am somehow passing illegal parameters to {{actualGetFromOneDataNode}}. 
I suspect an error in the details of the stripe calculation. I will fix it in 
the next rev and also add unit tests for the classes and methods added to 
{{StripedBlockUtil}}.

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678-HDFS-7285.012.patch, HDFS-7678-HDFS-7285.11.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-08 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: (was: HDFS-7678-HDFS-7285.11.patch)

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678-HDFS-7285.012.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-08 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.013.patch

The 013 patch passes all existing tests.

One caveat is that, as [~hitliuyi] [found | 
https://issues.apache.org/jira/browse/HDFS-8347?focusedCommentId=14533932&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14533932]
 yesterday, there's a codec bug. So I'm filling fixed content into the decoded 
buffer, and I have also manually verified that the correct range from the 
parity blocks is fetched. I will file a separate JIRA to create a simulated 
codec algorithm for more isolated unit testing.

[~jingzhao] I added the test you suggested and it passes now (I tested failing 
each individual DN). I'm working on adding extensive unit tests for the new 
{{StripedBlockUtil}} logic. But as the patch is getting big now, please let me 
know if you think I should split the {{StripedBlockUtil}} changes to HDFS-8320. 
If we assume the new {{StripedBlockUtil}} arithmetic calculations are correct, 
the new pread logic in {{DFSStripedInputStream}} is quite simple. So maybe it's 
easier to review them separately.

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678-HDFS-7285.012.patch, HDFS-7678-HDFS-7285.013.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-07 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.11.patch

I originally planned to consolidate striping terminologies under HDFS-8320 but 
it seems necessary to do some of the consolidation now. The 011 patch updates 
the definitions of a {{Stripe}} (which covers an arbitrary range from all 
internal blocks, with the same coverage for each internal block), a 
{{StripingCell}}, and a {{StripingChunk}}. Using these new tools, 
{{DFSStripedInputStream}} has much simpler logic to read a {{Stripe}}, which in 
turn reads individual {{StripingChunk}}s. The patch is not complete yet, but I 
am posting it here for some feedback on the direction.

With this structure, all of the complexity is moved into abstract 
number-crunching in {{StripedBlockUtil}}, which can be easily and extensively 
unit-tested.
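
As an illustration of the kind of pure arithmetic that can be tested in isolation, here is a minimal sketch that maps a logical offset within a block group to a data block index and an offset inside that block. It assumes cells are laid out round-robin across the data blocks; the class and method names are hypothetical and are not the {{StripedBlockUtil}} API from the patch.

```java
/** Hypothetical helper for illustration; not the StripedBlockUtil API from the patch. */
final class StripingMathSketch {
  private StripingMathSketch() {}

  /** Index (0..dataBlkNum-1) of the data block that holds the given logical offset. */
  static int blockIndexOf(long offsetInGroup, int cellSize, int dataBlkNum) {
    long cellIdx = offsetInGroup / cellSize;   // cell number, counted across the whole group
    return (int) (cellIdx % dataBlkNum);       // assumption: cells are placed round-robin
  }

  /** Offset of the given logical offset inside its internal block. */
  static long offsetInBlock(long offsetInGroup, int cellSize, int dataBlkNum) {
    long cellIdx = offsetInGroup / cellSize;
    long stripeIdx = cellIdx / dataBlkNum;     // number of full stripes before this cell
    return stripeIdx * cellSize + offsetInGroup % cellSize;
  }
}
```

For example, with a cell size of 4 bytes and 3 data blocks, logical offset 13 falls in the fourth cell, which lands in data block 0 at offset 5.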

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678-HDFS-7285.11.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-05 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.010.patch

Thanks Jing for the review; it's very helpful. Updating the patch to address 
the comments.

bq. futures and stripedReadsService can be converted into local variables 
inside of DFSStripedInputStream.
Thanks for pointing it out. Fixed.

bq. fetchBlockByteRange needs to be split into multiple functions
Refactored. Let me know how the new structure looks.

bq. I have some concerns about the current timeout mechanism:
Very good point re. unfair timeout for parity blocks. I agree we should handle 
slow readers (not slow enough to trigger the {{BlockReader}} timeout) in a 
separate JIRA.

bq. Looks like the following code will not add r.index into fetchedBlkIndices 
for blk_0 in the java comment example?
You are right, and this is the expected logic. {{fetchedBlkIndices}} only 
contains blocks fetched at maximum portion (and therefore not requiring any 
re-fetch even if other blocks are missing). In the example, if blk_0 returns 
{{SUCCESS}}, it is not put into {{fetchedBlkIndices}}; if the blk_1 request 
then returns {{FAILED}}, we will issue another request to fetch blk_0 with 
maximum portion. If that re-fetch succeeds, blk_0 will enter 
{{fetchedBlkIndices}}.
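
The bookkeeping rule described above can be sketched roughly as follows. This is an illustration only: the class, the {{coveredMaxPortion}} flag, and the method names are hypothetical and do not come from the patch.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of the "only max-portion reads count" rule. */
final class FetchTrackerSketch {
  private final Set<Integer> fetchedBlkIndices = new HashSet<>();

  /** Record a successful read; a partial read is deliberately not recorded. */
  void onSuccess(int blkIndex, boolean coveredMaxPortion) {
    if (coveredMaxPortion) {
      fetchedBlkIndices.add(blkIndex);
    }
  }

  /** After another block fails, any block not yet fetched at max portion is re-read. */
  boolean needsRefetch(int blkIndex) {
    return !fetchedBlkIndices.contains(blkIndex);
  }
}
```

In the example from the review, blk_0 first succeeds with a partial portion, so {{needsRefetch(0)}} stays true until the max-portion re-fetch succeeds.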

bq. When the reading length is smaller than a full stripe, this if will not 
be hit.
That is right; and it is the expected logic. This part of code (separated out 
as {{processReadTaskEvents}} in the new patch) uses an event-driven flow. If 
all original read requests succeed, no recovery read future will be inserted, 
so {{futures}} will just drain. 
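
A minimal sketch of such an event-driven drain loop, using the standard {{java.util.concurrent}} completion service. The simulated reads, the class name, and the assumption that every recovery read succeeds are all illustrative and not taken from the patch.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration of an event-driven read loop; not code from the patch. */
final class DrainLoopSketch {
  /**
   * Submits one simulated read per array entry and drains the completions.
   * Returns how many recovery reads had to be scheduled.
   */
  static int run(boolean[] originalReadSucceeds) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CompletionService<Boolean> service = new ExecutorCompletionService<>(pool);
    int pending = 0;
    for (boolean ok : originalReadSucceeds) {
      service.submit(() -> ok);          // simulated original read
      pending++;
    }
    int recoveryReads = 0;
    try {
      while (pending > 0) {              // loop until nothing is in flight
        Future<Boolean> done = service.take();
        pending--;
        if (!done.get()) {
          service.submit(() -> true);    // recovery read, assumed to succeed in this sketch
          pending++;
          recoveryReads++;
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return recoveryReads;
  }
}
```

When every original read succeeds, no new future is inserted and the loop simply drains; each failure inserts exactly one more future in this simplified model.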

bq. We also need to check if the number of missing blocks is already > number 
of parity blocks. In that case we should fail the read.
Good point! Fixed.
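
The check amounts to the following; this is a sketch with hypothetical names. A real read path would more likely raise an {{IOException}}, and the unchecked exception is used here only to keep the example short.

```java
/** Hypothetical illustration of the "too many missing blocks" check. */
final class DecodabilityCheckSketch {
  private DecodabilityCheckSketch() {}

  /** A stripe can be reconstructed only if at most parityBlkNum blocks are missing. */
  static boolean canDecode(int missingBlocks, int parityBlkNum) {
    return missingBlocks <= parityBlkNum;
  }

  /** Fails the read when the missing blocks outnumber the parity blocks. */
  static void checkReadable(int missingBlocks, int parityBlkNum) {
    if (!canDecode(missingBlocks, parityBlkNum)) {
      throw new IllegalStateException(missingBlocks
          + " missing blocks exceed " + parityBlkNum + " parity blocks");
    }
  }
}
```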

bq. We also need to check if the number of missing blocks is already > number 
of parity blocks. In that case we should fail the read.
Totally agreed. In particular, we should add more seek-and-read tests and try 
to emulate more timing-related scenarios. I believe [~xinwei] is working on 
that under HDFS-8201 and HDFS-8202.

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678-HDFS-7285.010.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-05-04 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Status: Open  (was: Patch Available)

Jenkins has not been stable recently, so I will rely on the branch nightly build for this JIRA.

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-04 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Summary: Erasure coding: DFSInputStream with decode functionality (pread)  
(was: Erasure coding: DFSInputStream with decode functionality)

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality (pread)

2015-05-04 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.009.patch

Rebased again to fix the test error in {{TestDFSStripedInputStream}}. 

{code}
 int idx = (int) (((blkStartOffset - lb.getStartOffset()) / cellSize)
-    % dataBlkNum);
+    % (dataBlkNum + parityBlkNum));
{code}
[~jingzhao] I just realized this is not a bug. For contiguous blocks, the 
{{LocatedBlock#offset}} field represents the starting offset of a block in a 
file. For a striped block group, this field has the same meaning and is used 
to retrieve the group from the NN. For an internal block, however, this field 
is only used to identify its index in the group. So as long as we have a 
consistent mapping, it should work fine. We could even set {{startOffset}} to 
{{bg.getStartOffset() + idxInBlockGroup}} in {{constructInternalBlock}}; as 
long as {{getBlockAt}} applies the reverse mapping, it should work. We can 
think about how to assign the values to improve readability (maybe in 
HDFS-8320).
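
To make the "consistent mapping" point concrete, here is a small sketch of one forward/reverse pair. The reverse direction mirrors the expression in the diff above; the forward direction (group start plus index times cell size) is an assumption chosen to match it, and the class and method names are hypothetical.

```java
/** Hypothetical illustration of an index <-> startOffset mapping for internal blocks. */
final class InternalBlockOffsetSketch {
  private InternalBlockOffsetSketch() {}

  /** Forward mapping (assumed): encode the index in the group into a start offset. */
  static long toStartOffset(long groupStartOffset, int idxInBlockGroup, int cellSize) {
    return groupStartOffset + (long) idxInBlockGroup * cellSize;
  }

  /** Reverse mapping: recover the index, modulo the total number of internal blocks. */
  static int toIndex(long blkStartOffset, long groupStartOffset, int cellSize,
      int dataBlkNum, int parityBlkNum) {
    return (int) (((blkStartOffset - groupStartOffset) / cellSize)
        % (dataBlkNum + parityBlkNum));
  }
}
```

Taking the modulus over {{dataBlkNum + parityBlkNum}} rather than {{dataBlkNum}} is what lets parity block indices survive the round trip.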

 Erasure coding: DFSInputStream with decode functionality (pread)
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678-HDFS-7285.009.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-05-04 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.008.patch

Rebased after HDFS-8281. 

[~jingzhao] Since {{ReadPortion}} is just a util class, I intentionally made 
its fields public to avoid calling getters and setters. Let me know 
if it looks OK to you. 

I was working on both HDFS-7678 and HDFS-8282 at the same time, so 
{{containsReadPortion}} appeared there without any usage. I am adding it back 
in this patch.

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678-HDFS-7285.008.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-05-01 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.007.patch

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678-HDFS-7285.007.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-05-01 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.006.patch

Thanks Andrew for the review!

bq. When fetching recovery work, we exclude blocks that still have an in-flight 
read. This means we might sometimes error out when we need additional data from 
the in-flight block.
This is a very good catch. I took a closer look at it, with some additional 
test cases. I think the logic in the 005 patch is functionally correct. 
Basically we always _hope_ that all in-flight read tasks are both successful 
and cover the max read portion. If the next returned request turns out 
otherwise, that event will trigger {{scheduleRecoveryReads}}. Note that the 
loop will continue until {{futures}} is empty. That said, the suggested change 
is a performance optimization to speed up recovery. I added 
{{actualReadPortions}} to keep track of the read portion actually issued for 
each index. Using it, we can more accurately avoid some in-flight reads, but 
not all.

bq. The test logic where we kill a DN doesn't look quite right, since we need 
to make sure the killed DN has the expected missing block.
Actually since we are manually injecting simulated blocks, the block-DN 
mapping is fixed. I did extend the test quite a bit to cover more misaligned 
cases:
{code}
int delta = 10;
int done = 0;
// read a small delta, shouldn't trigger decode
// |cell_0 |
// |10     |
done += in.read(0, readBuffer, 0, delta);
assertEquals(delta, done);
// both head and trail cells are partial
// |c_0      |c_1    |c_2 |c_3 |c_4      |c_5         |
// |256K - 10|missing|256K|256K|256K - 10|not in range|
done += in.read(delta, readBuffer, delta,
    CELLSIZE * (DATA_BLK_NUM - 1) - 2 * delta);
assertEquals(CELLSIZE * (DATA_BLK_NUM - 1) - delta, done);
// read the rest
done += in.read(done, readBuffer, done, readSize - done);
{code}
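
The per-cell byte counts in the second diagram can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name is hypothetical, and it assumes a 256K cell size and six data blocks, as the diagram suggests.

```java
/** Hypothetical helper: how many bytes of a read range fall into each cell. */
final class ReadRangeSketch {
  private ReadRangeSketch() {}

  /** Bytes of [start, start + len) that overlap each of the first numCells cells. */
  static long[] bytesPerCell(long start, long len, int cellSize, int numCells) {
    long[] out = new long[numCells];
    long end = start + len;
    for (int i = 0; i < numCells; i++) {
      long cellStart = (long) i * cellSize;
      long cellEnd = cellStart + cellSize;
      long overlap = Math.min(end, cellEnd) - Math.max(start, cellStart);
      out[i] = Math.max(0L, overlap);   // cells outside the range contribute nothing
    }
    return out;
  }
}
```

Calling {{bytesPerCell(10, 256K * 5 - 20, 256K, 6)}} gives 256K - 10 bytes for c_0 and c_4, a full 256K for c_1 through c_3, and 0 for c_5, which matches the diagram.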

Please let me know if the new changes look OK. Thanks!

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-30 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.005.patch

Rebased for HDFS-8282

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678-HDFS-7285.005.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-29 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: (was: HDFS-7678-HDFS-7285.004.patch)

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-29 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.004.patch

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-29 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.003.patch

Attaching a new patch based on Andrew's comments.

# An overall timeout is enforced
# All data fetching happens in a single loop, leveraging Yi's idea under 
HDFS-7348
# It also refactors shared striped reading logic (among client and DN) to the 
util class. [~andrew.wang] / [~hitliuyi] could you take a look at the changes 
in {{StripedBlockUtil}}? If that part looks OK I'll split it to HDFS-8282 and 
get it in first, so this client decode JIRA doesn't block HDFS-7348.
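
One way to enforce an overall timeout inside a single fetch loop is to compute a deadline once and poll with the time remaining. The sketch below uses the standard {{java.util.concurrent}} completion service; the class, the method names, and the simulated reads are hypothetical and are not the patch's implementation.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of a single fetch loop bounded by one overall deadline. */
final class OverallTimeoutSketch {
  /** Collects up to `expected` completions, giving up once the total budget is spent. */
  static int collect(CompletionService<Integer> service, int expected, long totalTimeoutMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(totalTimeoutMs);
    int received = 0;
    try {
      while (received < expected) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          break;                                   // budget already spent
        }
        Future<Integer> f = service.poll(remaining, TimeUnit.NANOSECONDS);
        if (f == null) {
          break;                                   // nothing finished within the budget
        }
        received++;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return received;
  }

  /** Demo: one fast simulated read and one that outlives the 500 ms budget. */
  static int demo() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      CompletionService<Integer> service = new ExecutorCompletionService<>(pool);
      service.submit(() -> 1);
      service.submit(() -> { Thread.sleep(10_000); return 2; });
      return collect(service, 2, 500);
    } finally {
      pool.shutdownNow();                          // interrupts the slow simulated read
    }
  }
}
```

Because the deadline is fixed before the loop starts, a slow reader cannot extend the total wait by finishing just inside a per-request timeout.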

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-29 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.004.patch

Rebased after the changes of HDFS-8272.

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-28 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

 Target Version/s: HDFS-7285
Affects Version/s: HDFS-7285
   Status: Patch Available  (was: In Progress)

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678.000.patch, 
 HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-28 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678-HDFS-7285.002.patch

New patch with a functional test. Also renaming to trigger Jenkins.

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
 HDFS-7678.000.patch, HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-28 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678.001.patch

Revised patch. The {{DFSStripedInputStream}} part should be functional. The 
test needs some additional tweaks because it's not trivial to emulate parity 
block content with {{SimulatedFSDataset}}.

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678.000.patch, 
 HDFS-7678.001.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-04-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Attachment: HDFS-7678.000.patch

The patch is not functional yet but demonstrates the idea. 

[~drankye] I have a question about decoding: in a (6+3) schema, if block #2 is 
missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I 
construct the inputs to {{RawErasureDecoder#decode}}?

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: BlockGroupReader.patch, HDFS-7678.000.patch




[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality

2015-03-31 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:

Summary: Erasure coding: DFSInputStream with decode functionality  (was: 
Block group reader with decode functionality)

 Erasure coding: DFSInputStream with decode functionality
 

 Key: HDFS-7678
 URL: https://issues.apache.org/jira/browse/HDFS-7678
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: BlockGroupReader.patch

