[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148230#comment-15148230
 ] 

Rui Li commented on HADOOP-12041:
-

Thanks Zhe and Kai for the explanations. I just filed HADOOP-12808 as the 
follow-on task.

> Implement another Reed-Solomon coder in pure Java
> -
>
> Key: HADOOP-12041
> URL: https://issues.apache.org/jira/browse/HADOOP-12041
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-12041-v1.patch, HADOOP-12041-v2.patch, 
> HADOOP-12041-v3.patch, HADOOP-12041-v4.patch, HADOOP-12041-v5.patch, 
> HADOOP-12041-v6.patch, HADOOP-12041-v7.patch, HADOOP-12041-v8.patch
>
>
> Currently existing Java RS coders based on {{GaloisField}} implementation 
> have some drawbacks or limitations:
> * The decoder unnecessarily computes units that are not actually erased (HADOOP-11871);
> * The decoder requires parity units + data units order for the inputs in the 
> decode API (HADOOP-12040);
> * Need to support or align with native erasure coders regarding concrete 
> coding algorithms and matrix, so Java coders and native coders can be easily 
> swapped in/out and transparent to HDFS (HADOOP-12010);
> * It's unnecessarily flexible, which incurs some overhead: HDFS erasure 
> coding is entirely a byte-based data system, so we don't need to consider 
> symbol sizes other than 256.
> This calls for implementing another RS coder in pure Java, in addition to the 
> existing {{GaloisField}} coder from HDFS-RAID. The new Java RS coder will be 
> favored and used by default to resolve the related issues. The old HDFS-RAID 
> originated coder will remain for comparison, and for converting old data 
> from HDFS-RAID systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-15 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148174#comment-15148174
 ] 

Kai Zheng commented on HADOOP-12041:


bq. So the follow-on task should cover the coder renaming and adding 
package-info, right?
Yeah. One thing to add: we should also change the default RS coder for HDFS to this new one.
bq. I meant we should have a way to iterate through all combinations of failed 
indexes etc. 
Great idea. We might do it as another follow-on task.





[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133403#comment-15133403
 ] 

Zhe Zhang commented on HADOOP-12041:


Sorry about the confusion. I meant we should have a way to iterate through all 
combinations of failed indexes etc. Similar to {{getDnIndexSuite}} in 
{{TestDFSStripedOutputStreamWithFailure}}
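The kind of exhaustive iteration suggested here could be sketched as follows. This is a hypothetical standalone helper for illustration only; the actual {{getDnIndexSuite}} lives in the HDFS test code and differs in detail:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of enumerating every combination of erased unit indexes for an
 * RS(numDataUnits, numParityUnits) schema, so a test can exercise each
 * failure pattern. Hypothetical helper, not actual Hadoop test code.
 */
public class ErasureCombinations {
  /** Return all size-k subsets of {0, ..., n-1} in lexicographic order. */
  public static List<int[]> choose(int n, int k) {
    List<int[]> result = new ArrayList<>();
    collect(n, k, 0, new int[k], 0, result);
    return result;
  }

  private static void collect(int n, int k, int start, int[] cur, int pos,
      List<int[]> out) {
    if (pos == k) {
      out.add(Arrays.copyOf(cur, k));
      return;
    }
    for (int i = start; i <= n - (k - pos); i++) {
      cur[pos] = i;
      collect(n, k, i + 1, cur, pos + 1, out);
    }
  }

  public static void main(String[] args) {
    // RS(6, 3): any set of up to 3 of the 9 units may be erased.
    int totalUnits = 9;
    int maxErased = 3;
    int count = 0;
    for (int k = 1; k <= maxErased; k++) {
      count += choose(totalUnits, k).size();
    }
    // C(9,1) + C(9,2) + C(9,3) = 9 + 36 + 84 = 129 failure patterns.
    System.out.println(count);
  }
}
```

A test suite could then feed each pattern to the decoder instead of sampling a few hand-picked cases.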



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131408#comment-15131408
 ] 

Kai Zheng commented on HADOOP-12041:


Thanks [~zhz] and [~walter.k.su] for the review and commit!! I hope it will be 
very useful for EC users and worth the time I spent on it. :)
bq. I think we should also file a follow-on JIRA to make the tests more 
systematic, like what Rui did for TestDFSStripedOutputStreamWithFailure.
I guess you meant [~demongaorui]. Sounds good to have systematic tests for the 
coders. Compatibility tests with the ISA-L coder will be provided in HADOOP-12540. For 
this follow-on and the other follow-on work discussed in this issue, [~lirui], would 
you help with these? Thanks.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131415#comment-15131415
 ] 

Kai Zheng commented on HADOOP-12041:


I meant HADOOP-11540.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131557#comment-15131557
 ] 

Rui Li commented on HADOOP-12041:
-

So the follow-on task should cover the coder renaming and adding package-info, 
right?
[~zhz], could you please elaborate a bit on how we should make tests 
systematic? Thanks.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131369#comment-15131369
 ] 

Hudson commented on HADOOP-12041:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9242 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9242/])
HADOOP-12041. Implement another Reed-Solomon coder in pure Java. (zhezhang: rev 
c89a14a8a4fe58f01f0cba643f2bc203e1a8701e)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/RSRawEncoder2.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/TestRSRawCoder.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/AbstractRawErasureDecoder.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/RawErasureCoder.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/RSRawDecoder.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/TestRSRawCoder2.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/util/DumpUtil.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/TestRSRawCoderBase.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/util/RSUtil2.java
* hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/TestXORRawCoder.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/RSRawErasureCoderFactory2.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/util/RSUtil.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/RSRawDecoder2.java
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/util/GF256.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/erasurecode/rawcoder/util/CoderUtil.java




[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-01 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127239#comment-15127239
 ] 

Zhe Zhang commented on HADOOP-12041:


Thanks for the great work Kai. Patch LGTM overall. A few minor issues:
# {{makeValidIndexes}} should be static and maybe moved to {{CodecUtil}}
# The reported {{checkstyle}} issue looks valid.
# When you say the new coder is compatible with ISA-L coder, you mean I can use 
new Java coder to decode data encoded with ISA-L, right?
# {{gftbls}} is hard to understand by name, does it mean {{gfTables}}?
# Any reason to allocate {{encodeMatrix}}, {{decodeMatrix}}, and 
{{invertMatrix}} as 1D arrays but use them as matrices? Can we use 2D arrays?
# The below is not easy to understand. Why don't we need to prepare 
{{decodeMatrix}} if the cached indexes haven't changed?
{code}
if (Arrays.equals(this.cachedErasedIndexes, erasedIndexes) &&
    Arrays.equals(this.validIndexes, tmpValidIndexes)) {
  return; // Optimization. Nothing to do
}
{code}
# {{RSUtil2#genReedSolomonMatrix}} is unused
# Patch is already very large, I think we should add {{package-info}} 
separately.
# So about the incompatibility between HDFS-RAID coder and the new Java coder: 
is it because they use different GF matrices?



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-01 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127327#comment-15127327
 ] 

Kai Zheng commented on HADOOP-12041:


Thanks Zhe for the careful review. The comments are very helpful.
bq. makeValidIndexes should be static and maybe moved to CodecUtil
How about adding {{CoderUtil}} for the coder implementations? This is something I 
wanted to do but missed. As we're going to have various raw erasure coders, 
we'll need a general utility class to hold such helpers; {{CodecUtil}} can then 
focus on the caller-facing side.
bq. The reported checkstyle issue looks valid.
Sure, pending the fixup for your comments. :)
bq. When you say the new coder is compatible with ISA-L coder, you mean I can 
use new Java coder to decode data encoded with ISA-L, right?
Exactly. That's probably why we're here. As you may have noted, there are thin 
HDFS clients that aren't equipped with the native Hadoop library, so a pure 
Java solution with good performance is needed in case the ISA-L coder isn't 
available.
bq. gftbls is hard to understand by name, does it mean gfTables?
Ah, yes. Will rename it.
bq. Any reason to allocate encodeMatrix, decodeMatrix, and invertMatrix as 1D 
arrays but use them as matrices? Can we use 2D arrays?
I borrowed the approach wholesale from the ISA-L library. I believe it's done 
that way for better performance: the mentioned matrices are accessed very 
frequently during encoding/decoding, so it's better to keep them compact so 
they can stay in the cache.
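The flat-array layout described here can be shown in isolation: an `r x c` matrix stored row-major in one array and indexed as `m[i * cols + j]`, giving a single contiguous, cache-friendly allocation. The names below are illustrative, not the coder's actual fields:

```java
/**
 * Sketch of the flat-array matrix convention: a rows x cols matrix stored
 * row-major in one byte[], indexed as m[i * cols + j]. Illustrative only.
 */
public class FlatMatrix {
  public static int get(byte[] m, int cols, int i, int j) {
    return m[i * cols + j] & 0xFF; // mask to treat the byte as unsigned
  }

  public static void main(String[] args) {
    // A 2x3 matrix {{1,2,3},{4,5,6}} flattened row by row.
    byte[] m = {1, 2, 3, 4, 5, 6};
    System.out.println(get(m, 3, 1, 2)); // row 1, col 2 -> 6
  }
}
```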
bq. The below is not easy to understand. Why don't we need to prepare 
decodeMatrix if the cached indexes haven't changed? 
I'd better add a comment there for easier understanding. The decodeMatrix 
depends only on the schema and the erased indexes. The schema is bound during 
the initialization phase and won't change; the erased indexes can change 
across calls, but if they haven't changed, the decodeMatrix doesn't need to be 
recomputed. As seen on the HDFS side, we're very likely to call a decoder 
repeatedly for a corrupt block group with a fixed set of erased indexes, so 
this optimization makes sense.
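The caching behavior described here can be sketched in a minimal standalone form. The class, field, and method names below are illustrative stand-ins, not the actual members of the real decoder:

```java
import java.util.Arrays;

/**
 * Minimal sketch of the decode-matrix caching discussed above: the matrix is
 * recomputed only when the erased/valid index pattern changes between decode
 * calls. Illustrative names; the real logic lives in the RS decoder.
 */
public class DecodeMatrixCache {
  private int[] cachedErasedIndexes;
  private int[] cachedValidIndexes;
  public int rebuildCount = 0; // exposed for the demo below

  public void prepareDecoding(int[] erasedIndexes, int[] validIndexes) {
    if (Arrays.equals(cachedErasedIndexes, erasedIndexes)
        && Arrays.equals(cachedValidIndexes, validIndexes)) {
      return; // Same failure pattern as last call: reuse the decode matrix.
    }
    cachedErasedIndexes = erasedIndexes.clone();
    cachedValidIndexes = validIndexes.clone();
    rebuildCount++; // Stand-in for the expensive matrix inversion.
  }

  public static void main(String[] args) {
    DecodeMatrixCache c = new DecodeMatrixCache();
    // Decoding the same block group repeatedly: one rebuild, then cache hits.
    c.prepareDecoding(new int[]{2, 5}, new int[]{0, 1, 3, 4});
    c.prepareDecoding(new int[]{2, 5}, new int[]{0, 1, 3, 4});
    c.prepareDecoding(new int[]{1}, new int[]{0, 2, 3, 4});
    System.out.println(c.rebuildCount); // 2: one rebuild per distinct pattern
  }
}
```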
bq. RSUtil2#genReedSolomonMatrix is unused
OK, let me remove it for now and add it back later when we support that mode 
of encode-matrix generation.
bq. Patch is already very large, I think we should add package-info separately.
Yeah, sure. I think we can add them when doing the coder rename, if we can 
tolerate the checkstyle warning about their absence.
bq. So about the incompatibility between HDFS-RAID coder and the new Java 
coder: is it because they use different GF matrices?
I think so. The implementation approach is also quite different. As you see, in 
this new Java coder (same as in the ISA-L library), the encode/decode matrix is 
calculated in advance, and encoding and decoding are unified: both purely 
perform matrix operations against the input data bytes. The benefit is that the 
decoder and encoder can be optimized together and achieve the same high 
performance.
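As background for the unified matrix approach: in GF(2^8) every symbol is one byte, addition is XOR, and producing one output byte is a dot product of a matrix row with the input bytes. The sketch below assumes the reduction polynomial 0x11d (common in ISA-L-style coders); it is an illustration, not the patch's actual GF256 code:

```java
/**
 * Sketch of GF(2^8) arithmetic underlying matrix-based RS coders.
 * Reduction polynomial 0x11d is an assumption here, not taken from the patch.
 */
public class Gf256Sketch {
  /** Carry-less "Russian peasant" multiply in GF(2^8). */
  public static int gfMul(int a, int b) {
    int p = 0;
    for (int i = 0; i < 8; i++) {
      if ((b & 1) != 0) {
        p ^= a; // addition in GF(2^8) is XOR
      }
      boolean carry = (a & 0x80) != 0;
      a = (a << 1) & 0xFF;
      if (carry) {
        a ^= 0x1D; // low byte of the reduction polynomial 0x11d
      }
      b >>= 1;
    }
    return p;
  }

  /** Encode one output byte: dot product of a matrix row with data bytes. */
  public static int encodeByte(int[] matrixRow, int[] dataBytes) {
    int out = 0;
    for (int j = 0; j < dataBytes.length; j++) {
      out ^= gfMul(matrixRow[j], dataBytes[j]);
    }
    return out;
  }

  public static void main(String[] args) {
    // 2 * 128 overflows the byte: x^8 reduces to x^4+x^3+x^2+1 = 0x1d = 29.
    System.out.println(gfMul(2, 128));
    // An all-ones matrix row degenerates to plain XOR parity: 7^3^5 = 1.
    System.out.println(encodeByte(new int[]{1, 1, 1}, new int[]{7, 3, 5}));
  }
}
```

Decoding works the same way, just with an inverted matrix, which is why the encoder and decoder share the same optimized core.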



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-01 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127569#comment-15127569
 ] 

Kai Zheng commented on HADOOP-12041:


Patch updated addressing an unused import issue.
bq. Missing package-info.java file.
This will be addressed separately as discussed above.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127548#comment-15127548
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 59s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 22s 
{color} | {color:red} hadoop-common-project/hadoop-common: patch generated 2 
new + 8 unchanged - 1 fixed = 10 total (was 9) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 17s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 15s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 48s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12785663/HADOOP-12041-v7.patch 
|
| JIRA Issue | HADOOP-12041 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-02-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127674#comment-15127674
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
49s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
38s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 0s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
39s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 30s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 50s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
32s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 111m 35s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.security.ssl.TestReloadingX509TrustManager |
|   | hadoop.fs.shell.find.TestAnd |
|   | hadoop.io.compress.TestCodecPool |
|   | hadoop.fs.shell.find.TestPrint |
|   | hadoop.fs.shell.find.TestIname |
|   | hadoop.fs.shell.find.TestName |
|   | hadoop.ipc.TestIPC |
| JDK v1.7.0_91 Failed junit tests | 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101570#comment-15101570
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
19s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 5m 31s 
{color} | {color:red} root in trunk failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 5m 59s 
{color} | {color:red} root in trunk failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 12m 48s 
{color} | {color:red} root-jdk1.8.0_66 with JDK v1.8.0_66 generated 157 new 
issues (was 576, now 733). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 48s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 47s 
{color} | {color:red} root in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 47s {color} 
| {color:red} root in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-common-project/hadoop-common (total was 9, now 9). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 30s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 32s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 44s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12782488/HADOOP-12041-v6.patch 
|
| JIRA Issue | HADOOP-12041 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 31b6eacb7ea9 3.13.0-36-lowlatency 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-15 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101407#comment-15101407
 ] 

Walter Su commented on HADOOP-12041:


The new coder passes the tests and is stable locally.

bq. What did you mean by "still possible GF256 be inited twice"? 
{code}
156   public static void init() {
157 if (inited) {
158   return;
159 }
160 
161 synchronized (GF256.class) {
162   theGfMulTab = new byte[256][256];
163   for (int i = 0; i < 256; i++) {
164 for (int j = 0; j < 256; j++) {
165   theGfMulTab[i][j] = gfMul((byte) i, (byte) j);
166 }
167   }
168   inited = true;
169 }
170   }
{code}
{{inited}} is initially {{false}}, so two threads may reach line 157 at the same 
time and then both proceed to line 161.
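For illustration, a minimal sketch of one common fix for this race: a second check inside the synchronized block plus a volatile flag, so the table is published before the flag flips. The class name and the placeholder table contents below are assumptions for illustration, not the patch's actual code:
{code}
public final class GfInitDemo {
    // 'volatile' ensures that once a thread reads inited == true, it also sees
    // the fully populated table that was published before the write to 'inited'.
    private static volatile boolean inited = false;
    private static byte[][] theGfMulTab;

    public static void init() {
        if (inited) {
            return;                      // fast path: already initialized
        }
        synchronized (GfInitDemo.class) {
            if (inited) {
                return;                  // second check: another thread won the race
            }
            byte[][] tab = new byte[256][256];
            for (int i = 0; i < 256; i++) {
                for (int j = 0; j < 256; j++) {
                    tab[i][j] = (byte) (i ^ j); // placeholder; real code uses gfMul
                }
            }
            theGfMulTab = tab;           // publish the table first...
            inited = true;               // ...then flip the flag
        }
    }

    public static byte lookup(int i, int j) {
        init();
        return theGfMulTab[i][j];
    }
}
{code}
With the second check under the lock, a thread that loses the race returns without rebuilding the table.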

bq. The old HDFS-RAID originated coder will still be there for comparing, and 
converting old data from HDFS-RAID systems.
HDFS-RAID is no longer in the latest release, so an HDFS-RAID system is an old 
cluster and we can use DistCp or similar tools to migrate its data. I guess it's 
OK to remove the old coder?

> Implement another Reed-Solomon coder in pure Java
> -
>
> Key: HADOOP-12041
> URL: https://issues.apache.org/jira/browse/HADOOP-12041
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HADOOP-12041-v1.patch, HADOOP-12041-v2.patch, 
> HADOOP-12041-v3.patch, HADOOP-12041-v4.patch, HADOOP-12041-v5.patch
>
>
> Currently the existing Java RS coders, based on the {{GaloisField}} 
> implementation, have some drawbacks or limitations:
> * The decoder unnecessarily computes units that are not really erased 
> (HADOOP-11871);
> * The decoder requires parity units + data units order for the inputs in the 
> decode API (HADOOP-12040);
> * We need to support or align with the native erasure coders regarding the 
> concrete coding algorithms and matrix, so Java coders and native coders can be 
> easily swapped in/out, transparently to HDFS (HADOOP-12010);
> * It's unnecessarily flexible, which incurs some overhead; as HDFS erasure 
> coding is entirely a byte-based data system, we don't need to consider any 
> symbol size other than 256.
> This proposes implementing another RS coder in pure Java, in addition to the 
> existing {{GaloisField}} one from HDFS-RAID. The new Java RS coder will be 
> favored and used by default to resolve the related issues. The old HDFS-RAID 
> originated coder will still be there for comparison, and for converting old 
> data from HDFS-RAID systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-15 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101434#comment-15101434
 ] 

Kai Zheng commented on HADOOP-12041:


Thanks [~walter.k.su] for the nice comments!

bq. inited is initially false, and 2 threads may reach line 157 at the same 
time, then both goto line 161.
Ah, you're right, I got it. Even though it wouldn't hurt to do the init twice, 
it's better to avoid it. How about making the init() call in a static block in 
{{GF256}}? You know, when I wrote the code I didn't want it to be complicated. :)
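As a hedged sketch of the static-block idea: the JVM runs a class's static initializer exactly once, under the class-initialization lock, so no explicit init() call or double-checked locking is needed. The class name, method names, and the reducing polynomial 0x11d below are assumptions for illustration, not the patch's actual code:
{code}
public final class Gf256StaticDemo {
    // Built once, when the class is first used; the JVM serializes class
    // initialization, so concurrent first uses cannot build the table twice.
    private static final byte[][] MUL_TAB = buildMulTab();

    private static byte[][] buildMulTab() {
        byte[][] tab = new byte[256][256];
        for (int i = 0; i < 256; i++) {
            for (int j = 0; j < 256; j++) {
                tab[i][j] = gfMul((byte) i, (byte) j);
            }
        }
        return tab;
    }

    // Carry-less "Russian peasant" multiply over GF(2^8); the reducing
    // polynomial 0x11d is an assumption here, chosen only for illustration.
    static byte gfMul(byte a, byte b) {
        int x = a & 0xff, y = b & 0xff, p = 0;
        while (y != 0) {
            if ((y & 1) != 0) {
                p ^= x;
            }
            x <<= 1;
            if ((x & 0x100) != 0) {
                x ^= 0x11d;              // reduce modulo the field polynomial
            }
            y >>= 1;
        }
        return (byte) p;
    }

    public static byte mul(int a, int b) {
        return MUL_TAB[a & 0xff][b & 0xff];
    }
}
{code}
Callers just use {{mul()}}; the first touch of the class triggers the one-time table build.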

bq. HDFS-RAID is no longer in latest release ...
Yeah, actually the related code comes from rather old history. But I do know 
some companies are still using the old coder (or new ones of their own derived 
from it); {{DistCp}} is a good option for them. I may have been considering too 
many situations where the coder could be used outside of HDFS; note the 
coder/codec framework resides on the Hadoop common side and can potentially be 
used in other contexts. Another reason we might still need the code is that 
some new codecs/coders build on it, like the Hitchhiker one [~jack_liuquan] is 
implementing in HADOOP-11828. On the other hand, the old coder is hard to 
maintain in alignment with the new Java coder and the ISA-L coder, so I think 
eventually we had better get rid of it, as you said, once we're assured. 
Marking it as _legacy_, which [~zhz] mentioned, is also an option.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-08 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090120#comment-15090120
 ] 

Zhe Zhang commented on HADOOP-12041:


Agreed, let's do the rename separately.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-07 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15088720#comment-15088720
 ] 

Kai Zheng commented on HADOOP-12041:


Hi [~zhz],

bq. we should rename the existing coder as {{RSRawEncoderLegacy}} and name the 
new one as RSRawEncoder
I'm afraid we'd better do the renaming separately. If we do it here, the patch 
becomes rather messy and hard to review. In the current patch revision, most of 
the code is new, with no changes to the existing coders; but doing the rename 
(RSRawEncoder->RSRawEncoderLegacy, RSRawEncoder2->RSRawEncoder) would make the 
patch look like mostly changes to the existing RS coders.

So how about we do this separately, together with some other things like 
applying it on the HDFS side?



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-06 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086391#comment-15086391
 ] 

Zhe Zhang commented on HADOOP-12041:


Thanks Kai for the work. I'm still reviewing the patch but would like to 
discuss a high level question. 

bq. The old HDFS-RAID originated coder will still be there for comparing, and 
converting old data from HDFS-RAID systems.
The old and new coders should generate the same results, right? I don't think 
we need the old coder just to port data. I guess it depends on whether 
{{GaloisField}} has the same matrix at size 256 as the new {{GF256}}?

bq. The new Java RS coder will be favored and used by default
If that's our position, we should rename the existing coder as 
{{RSRawEncoderLegacy}} and name the new one as {{RSRawEncoder}} (same for the 
decoder). Alternatively, if we think the stability of the new coder needs more 
testing, we can keep the current naming in the v5 patch, implying that the new 
coder is in "beta" mode.

Some unused methods: {{genReedSolomonMatrix}}, {{gfBase}}, {{gfLogBase}}





[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-06 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086713#comment-15086713
 ] 

Kai Zheng commented on HADOOP-12041:


Thanks Zhe for the good questions.
bq. The old and new coders should generate the same results, right?
Unfortunately not, and that's why I'm proposing and working on another pure 
Java coder here. Even though both are so-called {{Reed-Solomon}} coders, the 
HDFS-RAID one and the ISA-L one use different coding forms internally. Both use 
GF256, as the targeted HDFS is a byte-based data system. The existing 
GaloisField facility used by HDFS-RAID also supports symbol sizes other than 
256, but since that isn't needed in practice, the new GF256 facility is much 
simplified. The new Java coder is developed to be compatible with the ISA-L 
coder, so it can stand in when the native library isn't available in 
development or experimental environments. The HDFS-RAID one isn't compatible, 
but can be used to port existing data from a legacy system if needed.

bq. we should rename the existing coder as RSRawEncoderLegacy and name the new 
one as RSRawEncoder
Excellent idea! Thanks.

bq. Some unused methods: genReedSolomonMatrix, gfBase, gfLogBase
Since {{GF256}} serves as a complete basic GF facility class, I would suggest 
we keep them even though they're unused for now. {{genReedSolomonMatrix}} will 
be needed because people may want to support that coding-matrix generation in 
the algorithm.
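For illustration only, a minimal sketch of the kind of matrix a {{genReedSolomonMatrix}}-style helper typically produces: a (k+m) x k Vandermonde matrix over GF(2^8), whose top k rows can later be transformed to the identity to make the code systematic. The class and method names and the polynomial 0x11d are assumptions, not the patch's actual code:
{code}
public final class RsMatrixDemo {
    // Minimal GF(2^8) multiply; the reducing polynomial 0x11d is assumed.
    static byte gfMul(byte a, byte b) {
        int x = a & 0xff, y = b & 0xff, p = 0;
        while (y != 0) {
            if ((y & 1) != 0) {
                p ^= x;
            }
            x <<= 1;
            if ((x & 0x100) != 0) {
                x ^= 0x11d;
            }
            y >>= 1;
        }
        return (byte) p;
    }

    // Row i is [i^0, i^1, ..., i^(k-1)] evaluated in GF(2^8): any k of the
    // k+m rows of a Vandermonde matrix are linearly independent, which is
    // what lets any k surviving units reconstruct the data.
    static byte[][] genVandermonde(int k, int m) {
        byte[][] matrix = new byte[k + m][k];
        for (int i = 0; i < k + m; i++) {
            byte p = 1;
            for (int j = 0; j < k; j++) {
                matrix[i][j] = p;
                p = gfMul(p, (byte) i);
            }
        }
        return matrix;
    }
}
{code}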

Look forward to your more review comments. :)



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15084623#comment-15084623
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 29s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 3 new checkstyle issues in 
hadoop-common-project/hadoop-common (total was 9, now 11). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 45s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 41s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 9s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_91 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.ha.TestZKFailoverController |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12780653/HADOOP-12041-v5.patch 
|
| JIRA Issue | HADOOP-12041 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 591e53311eef 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2016-01-05 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15084503#comment-15084503
 ] 

Kai Zheng commented on HADOOP-12041:


Hi [~walter.k.su],

I'm going to update this. Would you help with this question? In 
{{GF256.init()}}, the main work is initializing {{theGfMulTab}}. What did you 
mean by "still possible GF256 be inited twice"? Thanks.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-12-01 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035407#comment-15035407
 ] 

Kai Zheng commented on HADOOP-12041:


Thanks [~walter.k.su] for the review. I worked on this together with the 
ISA-L coder, so I guess the unused function came from there. I will address 
your comments when updating the patch.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-12-01 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035229#comment-15035229
 ] 

Walter Su commented on HADOOP-12041:


It looks good. Thanks [~drankye].
1. Is erasures2erased(int[] erasures) unused?
2. In GF256.init(), it is still possible for GF256 to be inited twice.
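Walter's second point is the classic check-then-act race: a guard like 
{{if (!inited) { init(); inited = true; }}} lets two threads both pass the 
check before either sets the flag, so init() can run twice. One idiom that 
makes double initialization impossible is the class-holder pattern sketched 
below; the names are illustrative, not the actual GF256 fields.

```java
// Hedged sketch: the initialization-on-demand holder idiom. The JVM runs
// a class's static initializer at most once, under an internal lock, so
// the table is built exactly once no matter how many threads race here.
public final class GfTables {
    private static final class Holder {
        static final byte[][] MUL_TAB = buildMulTab();
    }

    // First call triggers Holder's class init; later calls just read it.
    static byte[][] table() {
        return Holder.MUL_TAB;
    }

    private static byte[][] buildMulTab() {
        byte[][] tab = new byte[256][256];
        // ... fill with GF(256) products; omitted for brevity
        return tab;
    }
}
```

This also sidesteps the later FindBugs complaint about synchronizing on a 
Boolean, since no explicit lock object is needed at all.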



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-11-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993347#comment-14993347
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s 
{color} | {color:red} Patch generated 2 new checkstyle issues in 
hadoop-common-project/hadoop-common (total was 9, now 10). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 4s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 15s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 7s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
| JDK v1.7.0_79 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.security.ssl.TestReloadingX509TrustManager |
|   | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.metrics2.impl.TestGangliaMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-11-06 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12770964/HADOOP-12041-v4.patch 
|
| JIRA Issue | HADOOP-12041 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  
checkstyle  compile  |
| uname | Linux 90591839b0d6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986717#comment-14986717
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 49s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 35s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} Patch generated 5 new checkstyle issues in 
hadoop-common-project/hadoop-common (total was 9, now 13). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 22s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 8s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
27s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 40s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_79 Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.ipc.TestDecayRpcScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 
Image:test-patch-base-hadoop-date2015-11-03 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12770237/HADOOP-12041-v3.patch 
|
| JIRA Issue | HADOOP-12041 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  
checkstyle  compile  |
| uname | Linux d4e37e2e8cac 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/patchprocess/apache-yetus-1a9afee/precommit/personality/hadoop.sh
 |
| git revision | trunk / d565480 |
| Default Java | 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-11-02 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986355#comment-14986355
 ] 

Kai Zheng commented on HADOOP-12041:


Looks like the results aren't too bad for such a large portion of new code :). 
Will refine and update the patch today.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984725#comment-14984725
 ] 

Hadoop QA commented on HADOOP-12041:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s 
{color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 10s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 50s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} Patch generated 17 new checkstyle issues in 
hadoop-common-project/hadoop-common (total was 4, now 21). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 4 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 7s 
{color} | {color:red} hadoop-common-project/hadoop-common introduced 1 new 
FindBugs issues. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 3m 31s 
{color} | {color:red} hadoop-common-project_hadoop-common-jdk1.8.0_60 with JDK 
v1.8.0_60 has problems. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 5m 8s 
{color} | {color:red} hadoop-common-project_hadoop-common-jdk1.7.0_79 with JDK 
v1.7.0_79 has problems. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 31s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_60. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 51s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 51s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common-project/hadoop-common |
|  |  Synchronization on Boolean in 
org.apache.hadoop.io.erasurecode.rawcoder.util.GF256.init()  At GF256.java: At 
GF256.java:[line 145] |
| JDK v1.7.0_79 Failed 

[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-10-27 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977547#comment-14977547
 ] 

Kai Zheng commented on HADOOP-12041:


I will rebase the patch on trunk once HADOOP-12040 is in, but I guess there 
won't be much to change.
[~zhz], could you help review this if you're interested? Thanks! Comments 
from others are welcome too.



[jira] [Commented] (HADOOP-12041) Implement another Reed-Solomon coder in pure Java

2015-10-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977720#comment-14977720
 ] 

Zhe Zhang commented on HADOOP-12041:


Thanks Kai! Sure, I will review the patch soon.
