[jira] [Commented] (HDFS-8347) Erasure Coding: whether to use the same chunkSize in decoding with the value in encoding

2015-05-11 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539115#comment-14539115
 ] 

jack liuquan commented on HDFS-8347:


Hi Kai,
bq. Do you think this design change works for you to implement the HitchHiker 
algorithm in HADOOP-11828?
Yes, it works for me.

 Erasure Coding: whether to use the same chunkSize in decoding with the value 
 in encoding
 

 Key: HDFS-8347
 URL: https://issues.apache.org/jira/browse/HDFS-8347
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng

 Currently the decode buffer size for DataNode striped block reconstruction is 
 configurable and can be smaller or larger than the chunk size. This may cause 
 issues for Hitchhiker, which may require encoding and decoding to use the same 
 buffer size.
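
The constraint described above could be enforced with a small validation step. The following is only a sketch under assumptions: `BufferSizeCheck`, `resolveDecodeBufferSize`, and the `codecNeedsSameSize` flag are hypothetical names for illustration and are not part of the Hadoop code base.

```java
/** Hypothetical helper; illustrates the check only, not a Hadoop API. */
final class BufferSizeCheck {
  private BufferSizeCheck() {}

  /**
   * Returns the buffer size to decode with. If the codec needs encoding and
   * decoding to agree (assumed here for Hitchhiker), a configured decode
   * buffer size that differs from the chunk size is rejected.
   */
  static int resolveDecodeBufferSize(int chunkSize, int configuredBufferSize,
      boolean codecNeedsSameSize) {
    if (chunkSize <= 0 || configuredBufferSize <= 0) {
      throw new IllegalArgumentException("sizes must be positive");
    }
    if (codecNeedsSameSize && configuredBufferSize != chunkSize) {
      throw new IllegalArgumentException("decode buffer size "
          + configuredBufferSize + " must equal chunk size " + chunkSize);
    }
    return configuredBufferSize;
  }
}
```

An alternative design would silently fall back to the chunk size instead of rejecting the configuration; which behavior is preferable is left open here.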



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7337) Configurable and pluggable Erasure Codec and schema

2015-04-23 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508740#comment-14508740
 ] 

jack liuquan commented on HDFS-7337:


Hi Kai,
As far as I know, the Hitchhiker code can be configured the same way as the RS 
code. Using the system-defined schemas RS(6,3) and RS(10,4) is OK.
The Hitchhiker codec can also be configured as you show in 
PluggableErasureCodec-v3.pdf.

 Configurable and pluggable Erasure Codec and schema
 ---

 Key: HDFS-7337
 URL: https://issues.apache.org/jira/browse/HDFS-7337
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Kai Zheng
 Attachments: HDFS-7337-prototype-v1.patch, 
 HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
 PluggableErasureCodec-v2.pdf, PluggableErasureCodec-v3.pdf, 
 PluggableErasureCodec.pdf


 According to HDFS-7285 and the design, this JIRA considers supporting multiple 
 erasure codecs via a pluggable approach. It allows multiple codec schemas to 
 be defined and configured with different coding algorithms and parameters. The 
 resulting codec schemas can be specified via a command-line tool for 
 different file folders. While designing and implementing such a pluggable 
 framework, we will also implement a concrete default codec (Reed-Solomon) to 
 prove the framework is useful and workable. A separate JIRA could be opened 
 for the RS codec implementation.
 Note HDFS-7353 will focus on the very low-level codec API and implementation 
 to make concrete vendor libraries transparent to the upper layer. This JIRA 
 focuses on the high-level parts that interact with configuration, schemas, etc.
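
As a rough illustration of what a configurable codec schema might look like, here is a minimal sketch. All class, field, schema, and codec names below (`CodecSchemaSketch`, `SchemaRegistrySketch`, "RS-6-3", "hitchhiker-xor") are hypothetical and are not taken from the design document or the attached prototypes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical schema: a name, the codec to plug in, and its parameters. */
final class CodecSchemaSketch {
  final String schemaName;
  final String codecName;
  final int numDataUnits;
  final int numParityUnits;

  CodecSchemaSketch(String schemaName, String codecName,
      int numDataUnits, int numParityUnits) {
    if (numDataUnits <= 0 || numParityUnits <= 0) {
      throw new IllegalArgumentException("unit counts must be positive");
    }
    this.schemaName = schemaName;
    this.codecName = codecName;
    this.numDataUnits = numDataUnits;
    this.numParityUnits = numParityUnits;
  }
}

/** Hypothetical registry that maps schema names to schemas. */
final class SchemaRegistrySketch {
  private final Map<String, CodecSchemaSketch> schemas = new LinkedHashMap<>();

  /** Registers a schema; a second schema with the same name is rejected. */
  void register(CodecSchemaSketch schema) {
    if (schemas.putIfAbsent(schema.schemaName, schema) != null) {
      throw new IllegalArgumentException(
          "duplicate schema: " + schema.schemaName);
    }
  }

  /** Looks up a schema by name, e.g. when a tool assigns it to a folder. */
  CodecSchemaSketch lookup(String schemaName) {
    CodecSchemaSketch schema = schemas.get(schemaName);
    if (schema == null) {
      throw new IllegalArgumentException("unknown schema: " + schemaName);
    }
    return schema;
  }
}
```

In this sketch the same parameters (data and parity unit counts) can be paired with different codec names, which is the kind of flexibility the description asks for.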





[jira] [Updated] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-04-07 Thread jack liuquan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack liuquan updated HDFS-7715:
---
Attachment: 7715-hitchhikerXOR-v2-testcode.patch

How about this Thursday afternoon?
bq. That's OK. Maybe we can discuss it in tomorrow's meeting.
I thought you could upload it just for me to review.
bq. I have uploaded the test code.
No, it's not in good style. How about {{pbIndex}}?
No. It's good to have the temp variables, but please use good names, 
consistently.
bq. I got it. I will modify them in the next patch.



 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2-testcode.patch, 
 7715-hitchhikerXOR-v2.patch, HDFS-7715-hhxor-decoder.patch, 
 HDFS-7715-hhxor-encoder.patch


 [Hitchhiker | 
 http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf] is 
 a new erasure coding algorithm developed as a research project at UC 
 Berkeley. It has been shown to reduce network traffic and disk I/O by 25%-45% 
 during data reconstruction. This JIRA aims to introduce Hitchhiker to the 
 HDFS-EC framework, as one of the pluggable codec algorithms.
 The existing implementation is based on HDFS-RAID. 





[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-04-06 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482497#comment-14482497
 ] 

jack liuquan commented on HDFS-7715:


High level:
1. Please write comprehensive class header comments about the new code and 
coder, also acknowledging the original author's effort.
bq. OK, sure.
2. For now, we need to figure out how to map these raw HH coders to 
corresponding high level {{ErasureCoder}}s, if we decide to implement them as 
raw coders directly;
bq. When do you have time for a phone call? I would like to discuss this 
with you by phone. Thanks. :)
3. Do we have tests for the new coders?
bq. Yes, I have tested the new coders and they work correctly. But because of 
the 30K limit, I didn't upload the test code. Can I upload the test code 
separately?

1. Any better name for variable *pb_vec*?
bq. It was named by Rashmi. Maybe Rashmi can give a suggestion. I think 
*pb_vec* is an index for storing the piggybacks of the first sub-stripe, so 
maybe pb_index is OK.
2. Move the code that computes the generating polynomial to HHUtil?
bq. Sounds good, I will do it in the new patch.
3. The following variables are not good. Please use numDataUnits, 
numParityUnits instead for consistency in all places.
bq. If we use numDataUnits and numParityUnits instead for consistency, we need 
to change numDataUnits and numParityUnits in {{AbstractRawErasureCoder}} from 
{{private}} to {{protected}}.

4. In HHUtil.getPiggyBacksFromInput, the parameter encoder isn't used.
bq. The parameter encoder is used in line 62:
{code}
+encoder.encode(tempInput, tempOutput);
{code}
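
Regarding item 3 above, one alternative to widening the fields to {{protected}} would be to keep them private and expose protected accessors. The sketch below only illustrates that option; `RawCoderBaseSketch` and `HHCoderSketch` are hypothetical stand-ins and not the actual {{AbstractRawErasureCoder}} or the patch code.

```java
/** Hypothetical stand-in for a raw coder base class; not the Hadoop class. */
abstract class RawCoderBaseSketch {
  private final int numDataUnits;
  private final int numParityUnits;

  protected RawCoderBaseSketch(int numDataUnits, int numParityUnits) {
    this.numDataUnits = numDataUnits;
    this.numParityUnits = numParityUnits;
  }

  // Subclasses read the counts through accessors, so the fields stay private.
  protected int getNumDataUnits() {
    return numDataUnits;
  }

  protected int getNumParityUnits() {
    return numParityUnits;
  }
}

/** Hypothetical subclass that uses the accessors instead of its own copies. */
final class HHCoderSketch extends RawCoderBaseSketch {
  HHCoderSketch(int numDataUnits, int numParityUnits) {
    super(numDataUnits, numParityUnits);
  }

  int totalUnits() {
    return getNumDataUnits() + getNumParityUnits();
  }
}
```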


 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2.patch, 
 HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch




[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-04-03 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394113#comment-14394113
 ] 

jack liuquan commented on HDFS-7715:


Hi Kai,
Thanks for your suggestions.
But I think in HH we can also read chunk by chunk; we just need to divide each 
chunk into two sub-stripes and process them.
Do you think that would be OK?
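
A minimal sketch of the idea of dividing one chunk into two sub-stripes is shown below. It assumes, purely for illustration, that the two sub-stripes are the two contiguous halves of the chunk; the thread does not specify the layout, and `SubStripeSplitSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; assumes sub-stripes are contiguous halves. */
final class SubStripeSplitSketch {
  private SubStripeSplitSketch() {}

  /**
   * Returns two views over the first and second half of the chunk.
   * The views share the chunk's backing array, so no bytes are copied.
   */
  static ByteBuffer[] split(byte[] chunk) {
    if (chunk.length % 2 != 0) {
      throw new IllegalArgumentException("chunk length must be even");
    }
    int half = chunk.length / 2;
    // wrap(...) positions the buffer at the offset; slice() rebases it to 0.
    ByteBuffer first = ByteBuffer.wrap(chunk, 0, half).slice();
    ByteBuffer second = ByteBuffer.wrap(chunk, half, half).slice();
    return new ByteBuffer[] {first, second};
  }
}
```

With this shape a reader can keep fetching whole chunks and hand each half to the coder separately.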

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2.patch, 
 HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch




[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-04-02 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393991#comment-14393991
 ] 

jack liuquan commented on HDFS-7715:


Hi Kai,
As we all know, the advantage of Hitchhiker is that it reduces the data 
required to reconstruct a single missing data unit.
To implement the Hitchhiker encoder/decoder for a block group in the coder 
path, I think the only difference is the ECChunk[] inputChunks in 
performCoding() of the decoder.
Can you tell me how the inputBlocks point to the real HDFS blocks and how the 
data is fetched into the inputChunks?
Because Hitchhiker divides one block into two sub-stripes to encode/decode, I 
think maybe we can add a new HitchhikerBlock class that extends ECBlock to 
indicate which sub-stripe of the block to fetch for Hitchhiker decoding.

e.g.:
public class HitchhikerBlock extends ECBlock {

  // Selects which part of the HDFS block this ECBlock points to is fetched:
  //   0: the whole HDFS block,
  //   1: the first sub-stripe of the block,
  //   2: the second sub-stripe of the block.
  private int substripe;
}

I would appreciate any suggestions. :)
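
To make the proposed selector concrete, the sketch below maps the three substripe values to a byte range within a block. It is only an illustration: `SubStripeRangeSketch` and its constants are hypothetical, and it assumes the two sub-stripes are the two contiguous halves of the block, which the comment above does not state.

```java
/** Hypothetical mapping from a substripe selector to a byte range. */
final class SubStripeRangeSketch {
  static final int WHOLE_BLOCK = 0;
  static final int FIRST_SUBSTRIPE = 1;
  static final int SECOND_SUBSTRIPE = 2;

  private SubStripeRangeSketch() {}

  /** Returns {offset, length} of the bytes to fetch from the block. */
  static long[] rangeFor(int substripe, long blockLength) {
    if (blockLength < 0) {
      throw new IllegalArgumentException("negative block length");
    }
    long half = blockLength / 2;
    switch (substripe) {
      case WHOLE_BLOCK:
        return new long[] {0, blockLength};
      case FIRST_SUBSTRIPE:
        return new long[] {0, half};
      case SECOND_SUBSTRIPE:
        // For odd lengths the second half takes the extra byte.
        return new long[] {half, blockLength - half};
      default:
        throw new IllegalArgumentException("unknown substripe: " + substripe);
    }
  }
}
```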

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2.patch, 
 HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch




[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-04-02 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392629#comment-14392629
 ] 

jack liuquan commented on HDFS-7715:


Hi Kai,
OK, I have seen your test results in HADOOP-11540; they look very good. I will 
also run some tests using the test case in HADOOP-11588.
Thanks!

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2.patch, 
 HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch




[jira] [Updated] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-28 Thread jack liuquan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack liuquan updated HDFS-7715:
---
Attachment: 7715-hitchhikerXOR-v2.patch

Hi Kai,
I have uploaded a new implementation of the Hitchhiker-XOR version. It 
separates the RS and HH-specific math calculations, reusing the existing RS 
code as much as possible. I hope the native XOR and RS raw coders can be used 
to benefit from the performance improvement.
Please review the code. Thanks!

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: 7715-hitchhikerXOR-v2.patch, 
 HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch




[jira] [Updated] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-25 Thread jack liuquan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack liuquan updated HDFS-7715:
---
Attachment: (was: HDFS-7715.zip)

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: HDFS-7715-hhxor-decoder.patch, 
 HDFS-7715-hhxor-encoder.patch




[jira] [Updated] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-25 Thread jack liuquan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack liuquan updated HDFS-7715:
---
Attachment: HDFS-7715-hhxor-decoder.patch
HDFS-7715-hhxor-encoder.patch

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: HDFS-7715-hhxor-decoder.patch, 
 HDFS-7715-hhxor-encoder.patch




[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-25 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14379445#comment-14379445
 ] 

jack liuquan commented on HDFS-7715:


Hi Kai,
Thanks for your comments!
1. The pom file doesn't need to change. I only modified it to build on my 
machine. Sorry about that.
2. I have refined my code following your suggestions, using the Google coding 
style.
3. The current HH implementation adds some functions to the GaloisField class 
for performance reasons, so the RS and piggyback calculations are always done 
together. We can separate the RS and piggyback calculations and use the 
existing RS raw coders for HH, but that will sacrifice a little performance. I 
think we should weigh the gain from using native RS raw coders against the 
performance lost by separating the calculations.
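
The separation discussed in item 3 can be sketched roughly as follows: both sub-stripes are encoded by an unmodified raw encoder, and the piggybacks are XORed in afterwards. This follows my reading of the Hitchhiker-XOR scheme (piggybacks from the first sub-stripe added to all but the first parity of the second sub-stripe) and is not the patch code. `RawEncoderSketch`, `PiggybackEncodeSketch`, and the `groups` parameter are hypothetical names, and the grouping of data units is left to the caller.

```java
/** Hypothetical raw-encoder interface; stands in for an existing RS coder. */
interface RawEncoderSketch {
  /** data[i] is one data unit; returns parity units of the same length. */
  byte[][] encode(byte[][] data);
}

/** Hypothetical sketch of piggybacking done outside the raw encoder. */
final class PiggybackEncodeSketch {
  private PiggybackEncodeSketch() {}

  /**
   * Encodes sub-stripes a and b with the same raw encoder, then XORs one
   * piggyback per group into parities 1..groups.length of sub-stripe b.
   * groups[g] lists the indices of the units of a that form piggyback g.
   * Parity 0 of b and all parities of a are left untouched.
   * Returns {parities of a, piggybacked parities of b}.
   */
  static byte[][][] encode(byte[][] a, byte[][] b, RawEncoderSketch encoder,
      int[][] groups) {
    byte[][] parityA = encoder.encode(a);
    byte[][] parityB = encoder.encode(b);
    if (groups.length != parityB.length - 1) {
      throw new IllegalArgumentException(
          "need one group per parity except the first");
    }
    for (int g = 0; g < groups.length; g++) {
      byte[] target = parityB[g + 1];
      for (int unit : groups[g]) {
        for (int k = 0; k < target.length; k++) {
          target[k] ^= a[unit][k];
        }
      }
    }
    return new byte[][][] {parityA, parityB};
  }
}
```

Because the raw encoder is only called through its interface, a native implementation could be swapped in without touching the piggyback step; the cost is the extra pass over the first sub-stripe's data.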

For review:
1. I have only attached the code of the Hitchhiker-XOR version.
2. Last time I used the zip format because I couldn't upload a patch larger 
than 30 KB. 
Thanks again!

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: HDFS-7715-hhxor-decoder.patch, 
 HDFS-7715-hhxor-encoder.patch






[jira] [Updated] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-23 Thread jack liuquan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack liuquan updated HDFS-7715:
---
Attachment: HDFS-7715.zip

Hi all, I have uploaded the core code of Hitchhiker. Please review it and 
tell me if you find anything wrong. Thanks!

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan
 Attachments: HDFS-7715.zip




[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-15 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362660#comment-14362660
 ] 

jack liuquan commented on HDFS-7715:


Hi Rashmi,
Thanks for your response.
I modified remainderWithPBCalcXOR() in GaloisField.java
so that Hitchhiker-XOR+ no longer relies on the all-XOR-parity property of the 
underlying RS code.
Can you review my code below and check whether it is OK?
Thank you very much!

===
  public void remainderWithPBCalcXOR(byte[][] dividend, int[] divisor,
      byte[][] piggys, int[] piggyBackSetSizes, int pb_vec, int stripeSize,
      int paritySize) {
...
...

    // Added for Hitchhiker-XOR+: the all-XOR-parity property is not needed.
    // Compute the XOR of all data units of the first sub-stripe.
    int piggys_pbvec = 0;
    for (int m = paritySize; m < dividend.length; ++m) {
      piggys_pbvec = (piggys_pbvec ^ dividend[m][k]);
    }

    for (int i = dividend.length - divisor.length; i >= 0; i--) {
      for (int j = 0; j < divisor.length; j++) {
        // calculate the parities
        int ratio = divTable[dividend[i + divisor.length - 1][k] & 0x00FF]
            [divisor[divisor.length - 1]];
        dividend[j + i][k] = (byte) ((dividend[j + i][k] & 0x00FF)
            ^ mulTable[ratio][divisor[j]]);
      }
    }

    dividend[pb_vec][k] = (byte) piggys_pbvec;
}

This modification may sacrifice a little performance.
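
A side note on the 0x00FF constants in the snippet above: they are byte masks. Java's byte is signed, so a stored value such as -74 has to be masked before it can index a lookup table like divTable or mulTable. The sketch below shows just that conversion; `UnsignedIndexSketch` is a hypothetical helper, not code from GaloisField.java.

```java
/** Hypothetical helper showing why bytes are masked before table lookups. */
final class UnsignedIndexSketch {
  private UnsignedIndexSketch() {}

  /** Converts a signed Java byte (-128..127) to a table index (0..255). */
  static int toIndex(byte b) {
    return b & 0x00FF;
  }

  /** Looks up table[row][col], treating both bytes as unsigned indices. */
  static int lookup(int[][] table, byte row, byte col) {
    return table[toIndex(row)][toIndex(col)];
  }
}
```

Without the mask, a negative byte would be widened to a negative int and the array access would throw ArrayIndexOutOfBoundsException.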

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan



[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-15 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362706#comment-14362706
 ] 

jack liuquan commented on HDFS-7715:


Oh, I realize there is a problem with my modification. When the underlying 
RS code satisfies the 'all-XOR-parity' property, my modification cannot 
reconstruct the data when four units are missing.

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan



[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-13 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14360119#comment-14360119
 ] 

jack liuquan commented on HDFS-7715:


Hi Rashmi,
As you mentioned in your paper, Hitchhiker-XOR+ needs the underlying RS code to 
possess the all-XOR-parity property,
and the matrix 'generatorMatrixForParity' is the generator matrix of the 
underlying RS code employed in HDFS-RAID. 
I found that 'generatorMatrixForParity' does not have the all-XOR-parity 
property.
Does that mean Hitchhiker-XOR+ can't be used with the current HDFS-RAID? 
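
A check like the one described above could be sketched as follows. It assumes, as I read the paper, that the all-XOR-parity property means one parity is the plain XOR of all data units, and it assumes the matrix is given as one row of coefficients per parity with one coefficient per data unit. The actual layout of 'generatorMatrixForParity' in HDFS-RAID may differ, and `AllXorParityCheckSketch` is a hypothetical name.

```java
/** Hypothetical check for the all-XOR-parity property. */
final class AllXorParityCheckSketch {
  private AllXorParityCheckSketch() {}

  /**
   * Returns true if some parity row has every coefficient equal to 1, i.e.
   * that parity is the XOR of all data units in a field of characteristic 2.
   */
  static boolean hasAllXorParity(int[][] parityRows) {
    for (int[] row : parityRows) {
      boolean allOnes = row.length > 0;
      for (int coefficient : row) {
        if (coefficient != 1) {
          allOnes = false;
          break;
        }
      }
      if (allOnes) {
        return true;
      }
    }
    return false;
  }
}
```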


P.S.
I tested reconstructing the tenth data unit with Hitchhiker-XOR+ using the 
test code in TestErasureCodes.java,
and the result was wrong. The Hitchhiker-nonXOR result, however, was 
correct.

Hitchhiker-XOR+ test run logs:
===stripe data===
0:1,|0:2,|0:3,|0:4,|0:5,|0:6,|0:7,|0:8,|0:9,|0:10,|
0:1,|0:2,|0:3,|0:4,|0:5,|0:6,|0:7,|0:8,|0:9,|0:10,|
==parity data after encode===
94,-74,44,6
94,-74,43,-55
=readBufs===
0:0,|0:0,|0:0,|0:6,|0:0,|0:0,|0:0,|0:0,|0:0,|0:0,|0:0,|0:0,|0:0,|0:0,|
0:94,|0:-74,|0:43,|0:0,|0:1,|0:2,|0:3,|0:4,|0:5,|0:6,|0:7,|0:8,|0:9,|0:0,|
=writeBufs===
0:-50,|0:0,|0:0,|0:0,|
0:10,|0:-74,|0:44,|0:-49,|

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan



[jira] [Commented] (HDFS-7715) Implement the Hitchhiker erasure coding algorithm

2015-03-10 Thread jack liuquan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14356105#comment-14356105
 ] 

jack liuquan commented on HDFS-7715:


OK, I see. Thanks for your responses.

 Implement the Hitchhiker erasure coding algorithm
 -

 Key: HDFS-7715
 URL: https://issues.apache.org/jira/browse/HDFS-7715
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: jack liuquan

 [Hitchhiker | 
 http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf] is 
 a new erasure coding algorithm developed as a research project at UC 
 Berkeley. It has been shown to reduce network traffic and disk I/O by 25% and 
 45% during data reconstruction. This JIRA aims to introduce Hitchhiker to the 
 HDFS-EC framework, as one of the pluggable codec algorithms.
 The existing implementation is based on HDFS-RAID. 




