[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-05-05 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529866#comment-14529866
 ] 

Zhe Zhang commented on HDFS-7859:
-

Cool! I just registered. Thanks for organizing it, Allen.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
  Labels: BB2015-05-TBR
 Attachments: HDFS-7859-HDFS-7285.002.patch, 
 HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
 HDFS-7859.001.patch, HDFS-7859.002.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-05-05 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529858#comment-14529858
 ] 

Zhe Zhang commented on HDFS-7859:
-

[~aw] Could you explain a bit what this {{BB2015-05-TBR}} label means?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-05-05 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529562#comment-14529562
 ] 

Kai Zheng commented on HDFS-7859:
-

Nicholas,

This was already moved out of HDFS-7285 by you, and AFAIK there was no plan to 
commit it in phase I. I thought the updated patch here would be good to have 
ready for follow-on work once we get the merge done.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859-HDFS-7285.002.patch, 
 HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
 HDFS-7859.001.patch, HDFS-7859.002.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-05-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529859#comment-14529859
 ] 

Allen Wittenauer commented on HDFS-7859:


It means you haven't been paying attention to the bug bash emails. :)



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-05-05 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529040#comment-14529040
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

Please do not commit this JIRA to the HDFS-7285 branch since we won't support 
multiple schemas for the moment.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-30 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521029#comment-14521029
 ] 

Xinwei Qin  commented on HDFS-7859:
---

The 003 patch removes the MODIFY and REMOVE ECSchema editlog operations. They 
will be added later by another JIRA (HDFS-8295), once they are supported.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521376#comment-14521376
 ] 

Hadoop QA commented on HDFS-7859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 48s | The applied patch generated  
10  additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 11s | The patch appears to introduce 9 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 239m 34s | Tests failed in hadoop-hdfs. |
| | | 288m  5s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 90% of time  Unsynchronized access at DFSOutputStream.java:[line 142] |
|  |  Class org.apache.hadoop.hdfs.DataStreamer$LastException is not derived from an Exception, even though it is named as such  At DataStreamer.java:[lines 177-201] |
|  |  Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()  At StripedDataStreamer.java:[line 105] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()  At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)  Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes()  At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[])  At ErasureCodingZoneManager.java:[line 81] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int)  At StripedBlockUtil.java:[line 85] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.planReadPortions(int, int, long, int, int)  At StripedBlockUtil.java:[line 167] |
| Failed unit tests | hadoop.hdfs.server.namenode.TestMetadataVersionOutput |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestDFSRollback |
|   | hadoop.hdfs.server.namenode.TestCreateEditsLog |
|   | hadoop.hdfs.protocol.TestLayoutVersion |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | 

[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-29 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518754#comment-14518754
 ] 

Zhe Zhang commented on HDFS-7859:
-

[~aw] Thanks again for bringing in the feature-branch pre-commit Jenkins 
functionality! It's really helpful. We just saw another successful run under 
HDFS-7678.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859-HDFS-7285.002.patch, 
 HDFS-7859-HDFS-7285.002.patch, HDFS-7859.001.patch, HDFS-7859.002.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-28 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517414#comment-14517414
 ] 

Allen Wittenauer commented on HDFS-7859:


P.S., thanks for letting me use this issue as a guinea pig. :D



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-28 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517393#comment-14517393
 ] 

Allen Wittenauer commented on HDFS-7859:


bq. 263m 35s 

Youch.  Just under the wire.

bq. git revision HDFS-7285 / bc3091b 

 (y) So yes, it switched to the feature branch to run the tests, as was 
expected.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516609#comment-14516609
 ] 

Hadoop QA commented on HDFS-7859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  2s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 49s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 14s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 48s | The applied patch generated  
10  additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 18s | The patch appears to introduce 
11 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 216m 12s | Tests failed in hadoop-hdfs. |
| | | 263m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 89% of time  Unsynchronized access at DFSOutputStream.java:[line 142] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int)  At DFSStripedInputStream.java:[line 95] |
|  |  Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()  At StripedDataStreamer.java:[line 104] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()  At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)  Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes()  At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[])  At ErasureCodingZoneManager.java:[line 81] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator  At FSEditLogOp.java:[line 4552] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$ModifyECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator  At FSEditLogOp.java:[line 4624] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeECSchema(DataOutputStream, ECSchema) makes inefficient use of keySet iterator instead of entrySet iterator  At FSImageSerialization.java:[line 792] |
|  |  Result of integer multiplication cast to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
 int, int, int, int)  At StripedBlockUtil.java:to long in 

[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14515030#comment-14515030
 ] 

Allen Wittenauer commented on HDFS-7859:


test-patch.sh reads the name of the patch, not any of the JIRA metadata.  So if 
the patch is named something generic, it thinks it is trunk.  See 
HowToContribute for the official rules, but as you can see from the name of the 
patch above, it knows about a few different methods to name them.
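
As a rough illustration of the name-based branch detection described above, here is a minimal sketch. It is not the actual test-patch.sh logic: the regular expression and the function name {{guess_branch}} are hypothetical, and they cover only the two naming styles visible in this issue's attachment list.

```python
import re

# Hypothetical pattern; handles only the two styles seen in this thread:
#   HDFS-7859.001.patch            -> no branch in the name, assume trunk
#   HDFS-7859-HDFS-7285.002.patch  -> branch HDFS-7285
PATCH_NAME = re.compile(
    r"^(?P<issue>[A-Z]+-\d+)"      # JIRA issue key, e.g. HDFS-7859
    r"(?:-(?P<branch>[^.]+))?"     # optional "-<branch>" suffix
    r"\.(?P<rev>\d+)\.patch$"      # patch revision, e.g. .002.patch
)


def guess_branch(patch_name, default="trunk"):
    """Return the branch encoded in a patch file name, or the default."""
    match = PATCH_NAME.match(patch_name)
    if match is None:
        # A generic name carries no branch information at all.
        return default
    return match.group("branch") or default


if __name__ == "__main__":
    for name in ("HDFS-7859.001.patch", "HDFS-7859-HDFS-7285.002.patch"):
        print(name, "->", guess_branch(name))
```

The point of the sketch is only that the branch comes from the file name, so a patch with a generic name falls back to trunk regardless of what the JIRA fields say.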

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859-HDFS-7285.002.patch, HDFS-7859.001.patch, 
 HDFS-7859.002.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514890#comment-14514890
 ] 

Zhe Zhang commented on HDFS-7859:
-

[~aw] I quickly went through HDFS-7285 sub tasks. If you'd like you can try 
with HDFS-8236. 

I actually tried with HDFS-8033 earlier but it still tried to apply the patch 
against trunk. Maybe it's because I didn't set target version to HDFS-7285 
_when submitting patch_.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514233#comment-14514233
 ] 

Allen Wittenauer commented on HDFS-7859:


(now we just need a submit button. lol)



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516309#comment-14516309
 ] 

Allen Wittenauer commented on HDFS-7859:


FYI, there are now two of these running:

https://builds.apache.org/job/PreCommit-HDFS-Build/10424/console
https://builds.apache.org/job/PreCommit-HDFS-Build/10425/console

It's still churning through hadoop-hdfs unit tests on the one that [~xinwei] 
submitted earlier.  hadoop-hdfs is one of the slowest sets of unit tests we 
have. I have a hunch that you folks have added code in this branch which has 
made it even slower ... to the point that Jenkins will likely kill the test 
patch job before it finishes. 



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514718#comment-14514718
 ] 

Allen Wittenauer commented on HDFS-7859:


I did some playing with a test jira this morning.  IIRC, it looks like submit 
patch is only available to the requester and the assignee when the jira is in 
the 'in progress' status.  The 'in progress' status can only be changed by the 
assignee and/or the requester.  I then thought well, I'll force it through 
jenkins... but test-patch.sh is smart in that it will only process jiras that 
are in patch available status.  So while I could have changed the meta info in 
the JIRA to force it to kick off, I didn't want to freak anyone out more than I 
already had by popping up in here.  I thought it was going to be an easy/quick 
test. :(

Running test-patch.sh as a developer against this JIRA # *does* run it against 
the HDFS-7285 branch though, as expected.  :D  (I had tested patches against 
branch-2, but hadn't had a chance to test against a dev branch... so when this 
one updated last night, I thought it'd be a good guinea pig.)
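
The status rules recalled above can be summarized in a small sketch. This is only an assumed model of the behaviour as described in this comment, not code from JIRA or test-patch.sh; the dictionary layout and the names {{can_submit_patch}} and {{will_run_precommit}} are hypothetical.

```python
# Hypothetical model of the workflow rules described in the comment above.
IN_PROGRESS = "In Progress"
PATCH_AVAILABLE = "Patch Available"


def can_submit_patch(user, issue):
    """Submit Patch is offered only to the reporter or assignee,
    and only while the issue is In Progress (as recalled above)."""
    is_owner = user in (issue["reporter"], issue["assignee"])
    return issue["status"] == IN_PROGRESS and is_owner


def will_run_precommit(issue):
    """The precommit job only processes issues in Patch Available status."""
    return issue["status"] == PATCH_AVAILABLE


if __name__ == "__main__":
    issue = {"status": IN_PROGRESS, "reporter": "reporter", "assignee": "assignee"}
    print(can_submit_patch("someone-else", issue))  # an outsider cannot submit
    print(will_run_precommit(issue))                # not yet Patch Available
```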



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514629#comment-14514629
 ] 

Zhe Zhang commented on HDFS-7859:
-

Thanks Allen. Do you know why Submit Patch isn't available here?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516433#comment-14516433
 ] 

Hadoop QA commented on HDFS-7859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 23s | Pre-patch HDFS-7285 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 41s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:red}-1{color} | checkstyle |   4m  1s | The applied patch generated  9 
 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 50s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 13s | The patch appears to introduce 
11 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 217m 55s | Tests failed in hadoop-hdfs. |
| | | 263m 53s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
|  |  Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 89% of time  Unsynchronized access at DFSOutputStream.java:[line 142] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int)  At DFSStripedInputStream.java:[line 95] |
|  |  Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock()  At StripedDataStreamer.java:[line 104] |
|  |  Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed()  At BlockInfoStriped.java:[line 208] |
|  |  Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long)  Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes()  At ErasureCodingZoneManager.java:[line 116] |
|  |  Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[])  At ErasureCodingZoneManager.java:[line 81] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator  At FSEditLogOp.java:[line 4552] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$ModifyECSchemaOp.toString() makes inefficient use of keySet iterator instead of entrySet iterator  At FSEditLogOp.java:[line 4624] |
|  |  org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeECSchema(DataOutputStream, ECSchema) makes inefficient use of keySet iterator instead of entrySet iterator  At FSImageSerialization.java:[line 792] |
|  |  Result of integer multiplication cast to long in 
org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock,
 int, int, int, int)  At StripedBlockUtil.java:to long in 
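
The three warning classes listed in the Findbugs rows above (reliance on default encoding, keySet iteration, and an int multiplication cast to long) each have a conventional fix. The following is only a minimal sketch in plain Java; the class and method names are hypothetical and are not taken from the HDFS-7859 patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the patch.
class FindbugsFixesSketch {

  // Default-encoding warning: name the charset explicitly in both directions.
  static byte[] encodeName(String schemaName) {
    return schemaName.getBytes(StandardCharsets.UTF_8);
  }

  static String decodeName(byte[] raw) {
    return new String(raw, StandardCharsets.UTF_8);
  }

  // keySet warning: iterate entrySet so each value is read without a second lookup.
  static String describe(Map<String, String> options) {
    StringBuilder sb = new StringBuilder("[");
    for (Map.Entry<String, String> e : options.entrySet()) {
      if (sb.length() > 1) {
        sb.append(", ");
      }
      sb.append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.append(']').toString();
  }

  // Multiplication warning: widen one operand to long before multiplying,
  // otherwise the product is computed in int arithmetic and may overflow
  // before it is widened.
  static long stripeBytes(int cellSize, int cellCount) {
    return (long) cellSize * cellCount;
  }

  public static void main(String[] args) {
    Map<String, String> options = new LinkedHashMap<>();
    options.put("codec", "rs");
    options.put("k", "6");
    System.out.println(describe(options));
    System.out.println(decodeName(encodeName("RS-6-3")));
    System.out.println(stripeBytes(1 << 20, 4096));
  }
}
```

With `cellSize = 1 << 20` and `cellCount = 4096`, the product is 2^32, which does not fit in an int; the cast on the first operand is what keeps the result correct.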

[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-16 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498895#comment-14498895
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

HDFS-8062 does not require this since the default schema can be hard-coded.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859.001.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-16 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498847#comment-14498847
 ] 

Kai Zheng commented on HDFS-7859:
-

[~szetszwo] I haven't had time to sort out the complete list yet, but I think 
HDFS-8062 would be one.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-15 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497306#comment-14497306
 ] 

Kai Zheng commented on HDFS-7859:
-

[~szetszwo],
bq.Since we do not yet support add/delete/update/rename schema operations, 
we don't need to persist anything in NN at this moment. We will support some of 
these schema operations down the road. We may persist schemas at that time. 
Sound good?
Please note that it's not true that we don't need to persist anything in NN at 
this moment. We have already persisted some hard-coded values that should be 
covered by a schema in the image. Without this, we will definitely need to 
revisit the image format some time later. As I said above, the schema 
definition is flexible enough that, if we persist the whole schema object in the 
image, we are unlikely to need to change the image later. Please also note that 
this issue blocks many subsequent issues, and I think we still have enough time 
for them before the merge happens. 



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497447#comment-14497447
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

 ... We had already persisted some hard-coded values that should be covered by 
 a schema ...

What do you mean?  Could you give an example?

 ... Please note this issue blocks many subsequent issues and I thought we 
 still have enough time for them right before the merge happening.

What are the subsequent issues?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-15 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497485#comment-14497485
 ] 

Kai Zheng commented on HDFS-7859:
-

bq.What do you mean? Could you give an example?
Well, what I said last was wrong and inaccurate. After double-checking the 
related code, I see that only striped blocks derived from the following 
hard-coded values are persisted in the image. So please ignore that statement. 
bq.What are the subsequent issues?
We do have some and will sort them out later. I have opened HDFS-8156 to 
resolve some dependencies caused by HDFS-7866, which were originally planned to 
be done here.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-15 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495853#comment-14495853
 ] 

Kai Zheng commented on HDFS-7859:
-

Hi [~szetszwo],

Per your request, I updated the doc in HDFS-7337 accordingly. I entirely 
rewrote the schema section, which now mainly reflects the existing related 
discussions and even implementations. I hope it addresses your questions here. 
Further comments and questions are very welcome. Thanks in advance!



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493702#comment-14493702
 ] 

Xinwei Qin  commented on HDFS-7859:
---

OK, I will track it.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493695#comment-14493695
 ] 

Kai Zheng commented on HDFS-7859:
-

Note that I have updated the patch in HDFS-7866 to align with this. Once it 
gets in, this one can be rebased and go in as well.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493699#comment-14493699
 ] 

Xinwei Qin  commented on HDFS-7859:
---

[~drankye], thanks for your comments.
{quote}
1. Looks like this couples with HDFS-7866. Maybe I could commit HDFS-7866 first 
and then this gets all the remaining work done. Will it work for you this way?
{quote}
Yes, committing HDFS-7866 first is better.
bq. 2. What methods can ECSchemaManager call to make it happen? 
Some methods like {{logAddECSchema()}} in {{FSEditLog.java}} are missing; I 
will add them in the next patch.
bq. 3. In ECSchemaManager, new methods like addECSchema are not necessarily 
public.
I will change them to package-private ("friendly") access.
bq. 4. Are we supporting the two formats? Please add Javadoc to explain them, 
thanks.
Yes, two formats are supported. These methods are only called during NameNode 
startup or checkpointing, and which method is called depends on the FSImage 
format. I will add detailed Javadoc for them.
bq.  5. Would you have separate issue(s) for the following?
I will create a new issue for it.




[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493703#comment-14493703
 ] 

Xinwei Qin  commented on HDFS-7859:
---

OK, I will track it.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493704#comment-14493704
 ] 

Xinwei Qin  commented on HDFS-7859:
---

OK, I will track it.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495498#comment-14495498
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

 The patch under this JIRA handles saving / loading these default schemas in 
 fsimage. I think this is necessary even without loading custom schemas from 
 XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has 
 the same default schemas as the NameNode which saved it. It is obviously even 
 more necessary when we add custom schemas ...

I think we should not persist anything to NN before we have a clear design 
since we don't know what to persist.  For example, should we persist schema ID? 
 We are not able to answer this question since we don't even know if a schema 
should have an ID.

If we change the layout later on, it requires cluster upgrade for the new 
layout and we have to support the old layout for backward compatibility.

For now, I suggest just hard-coding the only (6,3)-Reed-Solomon schema.  We 
don't even need the xml file.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495546#comment-14495546
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

 ... schema name for the ID purpose. ...

There are a few choices:
# Using schema name as ID
# A schema name and a separate numeric ID
# Multiple schema names and a numeric ID

Why use #1?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495553#comment-14495553
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

 ... We would persist the whole schema object ...

How can we be sure that the schema object format won't change?

Since we do not yet support add/delete/update/rename schema operations, we 
don't need to persist anything in NN at this moment.  We will support some of 
these schema operations down the road.  We may persist schemas at that time.  
Sound good?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495442#comment-14495442
 ] 

Zhe Zhang commented on HDFS-7859:
-

[~szetszwo] / [~drankye]: The [phasing plan | 
https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14391207page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391207]
 I posted might be a little confusing with regard to schemas. My apologies.

In the offline meetup on 03/31, we didn't reach a clear conclusion on how much 
of schema work to include before merging. Therefore I left it in phase I, but 
marked it as optional. My thought was that we could make a better decision 
after observing how fast the work could proceed. Up to this point I think this 
thread is going pretty well and it seems we can have a multi-schema 
implementation when other HDFS-7285 tasks are done (see details below).

Good [questions | 
https://issues.apache.org/jira/browse/HDFS-7859?focusedCommentId=14494933page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494933]
 on schema design. I think we eventually need to answer them in the broader 
scope of HDFS-7337. IIUC HDFS-7859 / HDFS-7866 are not touching most of the 
tricky scenarios. Based on Kai's latest [comment | 
https://issues.apache.org/jira/browse/HDFS-7866?focusedCommentId=14494050page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14494050],
 HDFS-7866 will mostly handle _default_ schemas embedded in the 
{{ECSchemaManager}} code. 

The patch under this JIRA handles saving / loading these default schemas in 
fsimage. I think this is necessary even without loading custom schemas from 
XML. Otherwise we cannot guarantee the NameNode which loads the fsimage has the 
same default schemas as the NameNode which saved it. It is obviously even more 
necessary when we add custom schemas. The logic in the patch is quite 
straightforward; it's mostly about serializing / deserializing schemas.
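
The save / load step described above amounts to a round trip through a byte stream. Below is a minimal sketch using `java.io.DataOutputStream` and `java.io.DataInputStream`. The `SchemaRecord` and `SchemaImageSketch` classes, the field order, and the method names are assumptions made for illustration; they are not the format used by the patch or by the real `FSImageSerialization`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type; it only mirrors fields mentioned in this thread.
class SchemaRecord {
  final String name;
  final String codec;
  final int dataUnits;
  final int parityUnits;

  SchemaRecord(String name, String codec, int dataUnits, int parityUnits) {
    this.name = name;
    this.codec = codec;
    this.dataUnits = dataUnits;
    this.parityUnits = parityUnits;
  }
}

class SchemaImageSketch {

  // Write a count followed by each schema's fields in a fixed order.
  static byte[] save(List<SchemaRecord> schemas) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(schemas.size());
      for (SchemaRecord s : schemas) {
        out.writeUTF(s.name);
        out.writeUTF(s.codec);
        out.writeInt(s.dataUnits);
        out.writeInt(s.parityUnits);
      }
    }
    return bytes.toByteArray();
  }

  // Read the fields back in exactly the order they were written.
  static List<SchemaRecord> load(byte[] image) throws IOException {
    List<SchemaRecord> schemas = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
      int count = in.readInt();
      for (int i = 0; i < count; i++) {
        String name = in.readUTF();
        String codec = in.readUTF();
        int dataUnits = in.readInt();
        int parityUnits = in.readInt();
        schemas.add(new SchemaRecord(name, codec, dataUnits, parityUnits));
      }
    }
    return schemas;
  }
}
```

Because the loader depends on the writer's field order, any later change to the record would need a version check on the read path, which is the compatibility concern raised elsewhere in this thread.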

So here's my proposal:
# Shrink this patch to get rid of the logic for modifying and removing schemas 
({{ECSchemaManager#modifyECSchema}} and {{OP_MODIFY_EC_SCHEMA}}). 
# Repurpose HDFS-7866 to focus on loading custom schemas from site xml files.

[~szetszwo], [~drankye], [~vinayrpet]: let me know if you agree with the above. 
If we are all synced on this, how about moving this JIRA back to HDFS-7285 and 
keeping HDFS-7866 under HDFS-8031?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495556#comment-14495556
 ] 

Kai Zheng commented on HDFS-7859:
-

bq.Using schema name as ID
Since we want to keep {{ECSchema}} lightweight and therefore have no field like 
{{description}}, a friendly name like {{RS-6-3}} makes more sense than a 
numeric ID. Users should clearly understand a schema before using it to create 
any zone, and the name helps with that. 
bq.We don't even need the xml file.
Yeah, defining a schema through a command that specifies the schema parameters 
would also be OK. I don't have a strong preference about that. Any file format, 
or even not using a file at all, would also work, I guess. We talked about this 
in the meetup, and it looks like we agreed on the XML file.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495493#comment-14495493
 ] 

Kai Zheng commented on HDFS-7859:
-

Hi [~zhz],

Thanks for taking care of this and for your good suggestion. It looks reasonable 
to me, and it sounds like a more solid base for the merge. 

To summarize further:
1. This issue, HDFS-7859, would provide two system-defined schemas in Java code: 
one is the system default schema (rs-6-3), which is already there; the other is 
a new one, for which I suggest rs-10-4. It also ensures the two schemas are 
persisted in the image/editlog for later querying.
2. The remaining gaps will be handled as follow-on work in HDFS-7866, mainly 
about how to customize site-specific schemas through an XML file. The design 
will also be updated.
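
As an illustration of item 1, system-defined schemas that are referenced by name could be kept in a small fixed registry. This is only a sketch under assumptions: the `SystemSchemasSketch` class, its nested `Schema` type, and the method names are hypothetical and do not reflect the real `ECSchemaManager`; only the names rs-6-3 and rs-10-4 come from the comment above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the system-defined schemas discussed above.
class SystemSchemasSketch {

  static final class Schema {
    final String name;
    final int dataUnits;
    final int parityUnits;

    Schema(String name, int dataUnits, int parityUnits) {
      this.name = name;
      this.dataUnits = dataUnits;
      this.parityUnits = parityUnits;
    }

    // Raw bytes stored per byte of user data,
    // e.g. (6 + 3) / 6 = 1.5 for 6 data units and 3 parity units.
    double storageOverhead() {
      return (double) (dataUnits + parityUnits) / dataUnits;
    }
  }

  static final String DEFAULT_NAME = "rs-6-3";

  private static final Map<String, Schema> BY_NAME;

  static {
    Map<String, Schema> m = new LinkedHashMap<>();
    m.put("rs-6-3", new Schema("rs-6-3", 6, 3));
    m.put("rs-10-4", new Schema("rs-10-4", 10, 4));
    BY_NAME = Collections.unmodifiableMap(m);
  }

  // The name doubles as the identifier, so lookup is a plain map access.
  static Schema get(String name) {
    Schema s = BY_NAME.get(name);
    if (s == null) {
      throw new IllegalArgumentException("Unknown EC schema: " + name);
    }
    return s;
  }

  static Schema getDefault() {
    return get(DEFAULT_NAME);
  }
}
```

A zone would then only need to store the schema name, and a lookup such as `SystemSchemasSketch.get("rs-10-4")` resolves it to the full definition.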



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495517#comment-14495517
 ] 

Kai Zheng commented on HDFS-7859:
-

bq.we don't know what to persist. For example, should we persist schema ID? We 
are not able to answer this question since we don't even know if a schema 
should have an ID.
It's not true. We have {{ECSchema}} defined, and it uses the schema name for the 
ID purpose. We would persist the whole schema object. Although the ongoing work 
isn't reflected in the design doc, we did it following our related 
discussions. In the meetup with [~zhz] and [~jingzhao], we covered this aspect 
and even your questions already. It's my mistake that I didn't write it down 
clearly and update the doc accordingly. 



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495560#comment-14495560
 ] 

Kai Zheng commented on HDFS-7859:
-

bq.How can we be sure that the schema object format won't change?
Good question. In the {{ECSchema}} class, in addition to the common parameters 
widely used by typical erasure codecs, an {{options}} map is also included, so 
potentially any complex codec can use it to hold its own specific parameters 
or key-value pairs. Such parameters are left to the corresponding erasure 
coder to interpret. We try to make it flexible enough to avoid such a change, 
but in case a change is needed anyway, I think it's supported through the image 
layout version.
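
The shape described above, a few common parameters plus an open-ended options map, can be sketched as follows. This is a hypothetical simplification and not the real `ECSchema` class; the class name, the accessor names, and the `localGroupSize` key used in the usage note are assumptions for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of a schema with common fields plus free-form options.
class FlexibleSchemaSketch {
  private final String name;
  private final String codec;
  private final int dataUnits;
  private final int parityUnits;
  private final Map<String, String> options;

  FlexibleSchemaSketch(String name, String codec, int dataUnits, int parityUnits,
                       Map<String, String> options) {
    this.name = name;
    this.codec = codec;
    this.dataUnits = dataUnits;
    this.parityUnits = parityUnits;
    // Defensive copy so later changes to the caller's map cannot alter the schema.
    this.options = Collections.unmodifiableMap(new HashMap<>(options));
  }

  String getName() { return name; }
  String getCodec() { return codec; }
  int getDataUnits() { return dataUnits; }
  int getParityUnits() { return parityUnits; }

  // Codec-specific parameter, or the fallback if the schema does not define it.
  String option(String key, String fallback) {
    String value = options.get(key);
    return value != null ? value : fallback;
  }

  // Same lookup, parsed as an int; interpretation is left to the coder that asks.
  int intOption(String key, int fallback) {
    String value = options.get(key);
    return value != null ? Integer.parseInt(value) : fallback;
  }
}
```

A coder for some more complex codec could then read its own setting with a call like `schema.intOption("localGroupSize", 5)`, while the persisted layout stays the same: the common fields followed by a list of key-value pairs.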



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14494933#comment-14494933
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

I have the following questions:
- How to add a schema?  Using a command?
- Then, is it possible to delete a schema?  It seems that we have to support 
deletion since a schema may be created by mistake or there could be typos when 
creating a schema.
- If deletion is supported, what to do with the existing files with that schema?
- Do we support renaming schema?
- Does an EC schema have a schema ID?

I think we need a design for EC schema to answer all these questions and 
specify what operations are supported.

BTW, we only support one schema (6,3)-Reed-Solomon in the first phase 
HDFS-7285.  I think we should focus on finishing a complete, working basic EC 
feature and get HDFS-7285 merged to trunk.  How about moving this JIRA and 
related JIRAs to HDFS-8031 and deferring the work?  Sorry for commenting on this 
late and thanks for all the good work.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495040#comment-14495040
 ] 

Kai Zheng commented on HDFS-7859:
-

Hi [~szetszwo], thanks for taking care of this.
These questions were considered in the related work. The overall design and 
discussion are in HDFS-7337, would you take a look at it. Let's discuss further 
there. I will sort out the latest discussions and clearly answer your questions 
there. Thanks. 



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495049#comment-14495049
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

 ... The overall design and discussion are in HDFS-7337, would you take a look 
 at it. ...

Yes, I looked at it earlier but it did not answer my questions.  Since 
HDFS-7337 is already under HDFS-8031, let's move all the related work to 
HDFS-8031.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495059#comment-14495059
 ] 

Kai Zheng commented on HDFS-7859:
-

HDFS-7337 is rather large, so we're implementing its related tasks 
incrementally. In your view, what makes it difficult to include this sub-task 
in the merge?



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495089#comment-14495089
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7859:
---

I just want to make sure we have the right design before adding code.  
Persisting schema to fsimage is not an obvious task and it is not required in 
HDFS-7285.



[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493484#comment-14493484
 ] 

Kai Zheng commented on HDFS-7859:
-

Thanks [~xinwei] for the great work! Some comments and questions after a quick 
look:
1. Looks like this couples with HDFS-7866. Maybe I could commit HDFS-7866 first 
and then this gets all the remaining work done. Will it work for you this way?
2. I guess when reloading of the predefined XML happens, some schemas need to be 
updated/removed/added in the editlog. What methods can ECSchemaManager call to 
make it happen? I did notice some OPs like OP_REMOVE_EC_SCHEMA are added, but 
where are the OPs triggered?
3. In {{ECSchemaManager}}, new methods like {{addECSchema}} are not necessarily 
public.
4. A question I'd like your help with: it's not clear to me when to call 
{{loadECSchemas}} and when to call {{loadState}}. Are we supporting the two 
formats? Please add Javadoc to explain them, thanks.
5. Would you have separate issue(s) for the following?
{code}
@Override
protected void toXml(ContentHandler contentHandler) throws SAXException {
  // TODO Support for offline EditsVistor over an OEV XML file
}

@Override
void fromXml(Stanza st) throws InvalidXmlException {
  // TODO Support for offline EditsVistor over an OEV XML file
}
{code}
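
To make question 2 concrete, here is a minimal sketch of how a reload could be turned into edit-log operations by diffing the active schemas against the reloaded ones. It is an illustration only, not the patch's code: the class {{SchemaReloadSketch}}, the {{diff}} method, the names OP_ADD_EC_SCHEMA and OP_UPDATE_EC_SCHEMA, and the string form of a schema are all assumptions; only OP_REMOVE_EC_SCHEMA is mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SchemaReloadSketch {
  // Hypothetical op names, modelled on the OP_REMOVE_EC_SCHEMA mentioned above.
  static final String ADD = "OP_ADD_EC_SCHEMA";
  static final String UPDATE = "OP_UPDATE_EC_SCHEMA";
  static final String REMOVE = "OP_REMOVE_EC_SCHEMA";

  /**
   * Compares the active schemas (name -> definition) with the reloaded ones
   * and returns the operations that would need to be logged, in a stable order.
   */
  public static List<String> diff(Map<String, String> active,
                                  Map<String, String> reloaded) {
    List<String> ops = new ArrayList<>();
    // Schemas present after the reload: either new or possibly changed.
    for (Map.Entry<String, String> e : new TreeMap<>(reloaded).entrySet()) {
      String old = active.get(e.getKey());
      if (old == null) {
        ops.add(ADD + " " + e.getKey());
      } else if (!old.equals(e.getValue())) {
        ops.add(UPDATE + " " + e.getKey());
      }
    }
    // Schemas that disappeared from the predefined file.
    for (String name : new TreeMap<>(active).keySet()) {
      if (!reloaded.containsKey(name)) {
        ops.add(REMOVE + " " + name);
      }
    }
    return ops;
  }

  public static void main(String[] args) {
    Map<String, String> active = new TreeMap<>();
    active.put("a", "k=6,m=3");
    active.put("b", "k=10,m=4");
    active.put("old", "k=3,m=2");

    Map<String, String> reloaded = new TreeMap<>();
    reloaded.put("a", "k=6,m=3");   // unchanged
    reloaded.put("b", "k=6,m=2");   // changed definition
    reloaded.put("c", "k=3,m=2");   // newly added

    System.out.println(diff(active, reloaded));
  }
}
```

In a design like this, the manager would be the single place that both applies the change in memory and logs the corresponding op, which would answer the "where are the OPs triggered" question.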

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 
 Attachments: HDFS-7859.001.patch


 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-13 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492394#comment-14492394
 ] 

Xinwei Qin  commented on HDFS-7859:
---

Hi [~drankye],
The patch is complete, but it is a little big. I will post it in about half 
an hour, once I am home.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-12 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14491942#comment-14491942
 ] 

Kai Zheng commented on HDFS-7859:
-

Any update or question? Thanks.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-09 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487291#comment-14487291
 ] 

Xinwei Qin  commented on HDFS-7859:
---

Hi [~drankye],
Thanks for your clarification and suggestion. I now understand this issue more 
clearly, and will post the patch ASAP.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-09 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487072#comment-14487072
 ] 

Kai Zheng commented on HDFS-7859:
-

Hi [~xinwei],

About persisting the schema object, I guess the work in HDFS-8023 may be a 
helpful reference for you, as pointed out by [~vinayrpet] above.

Also, would you take a look at the initial code attached to HDFS-7866 and 
HDFS-8062, so that you have a clearer idea of the scope of this issue?

Thanks.



 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-07 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482766#comment-14482766
 ] 

Vinayakumar B commented on HDFS-7859:
-

bq. When we have this work done, we may also have the idea about how to 
serialize/deserialize an EC schema in RPC between NameNode and client/DataNode
This is included in the latest patch provided in HDFS-8023.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395186#comment-14395186
 ] 

Kai Zheng commented on HDFS-7859:
-

When we have this work done, we may also have an idea of how to 
serialize/deserialize an EC schema in RPC between NameNode and client/DataNode. 
If necessary, we may create another issue to handle that aspect.
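
As a rough illustration of what serializing and deserializing a schema involves, here is a minimal round-trip sketch. It assumes a schema is just a name plus string options and uses plain {{java.io}} data streams; the class {{SchemaWireSketch}}, its methods, and the option keys are hypothetical, and the real wire format used for the RPC may be entirely different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class SchemaWireSketch {
  /** Encodes a schema as: name, option count, then key/value pairs. */
  public static byte[] write(String name, Map<String, String> options) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeUTF(name);
      out.writeInt(options.size());
      for (Map.Entry<String, String> e : options.entrySet()) {
        out.writeUTF(e.getKey());
        out.writeUTF(e.getValue());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Decodes the bytes produced by write() back into (name, options). */
  public static Map.Entry<String, Map<String, String>> read(byte[] data) {
    try (DataInputStream in =
             new DataInputStream(new ByteArrayInputStream(data))) {
      String name = in.readUTF();
      int count = in.readInt();
      Map<String, String> options = new LinkedHashMap<>();
      for (int i = 0; i < count; i++) {
        String key = in.readUTF();
        String value = in.readUTF();
        options.put(key, value);
      }
      return new AbstractMap.SimpleEntry<>(name, options);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> options = new LinkedHashMap<>();
    options.put("k", "6");   // hypothetical option keys
    options.put("m", "3");
    Map.Entry<String, Map<String, String>> decoded =
        read(write("example-schema", options));
    System.out.println(decoded.getKey() + " " + decoded.getValue());
  }
}
```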

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395184#comment-14395184
 ] 

Kai Zheng commented on HDFS-7859:
-

I would like to clarify that this issue focuses on how to persist EC schemas 
in NameNode, as already agreed, in the fsimage and editlog. In other words, 
this issue is to change the fsimage and editlog to persist EC schemas, and it 
should consider relevant concerns such as the image version and how to 
upgrade/downgrade.
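
To illustrate how the two pieces fit together, here is a minimal sketch of recovery from a checkpoint plus a log: the image holds the schemas as of some transaction id, and only later edits are replayed on top of it. The class {{SchemaPersistenceSketch}}, the {{Edit}} record, and the op kinds are assumptions made for this example, not the actual fsimage or editlog formats, and version handling is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SchemaPersistenceSketch {
  enum Kind { PUT, REMOVE }   // hypothetical op kinds (PUT covers add and update)

  /** One hypothetical edit-log entry for a schema change. */
  static final class Edit {
    final long txid;
    final Kind kind;
    final String name;
    final String definition;  // unused for REMOVE

    Edit(long txid, Kind kind, String name, String definition) {
      this.txid = txid;
      this.kind = kind;
      this.name = name;
      this.definition = definition;
    }
  }

  /**
   * Rebuilds the schema table: start from the image (state as of imageTxid)
   * and replay only the edits that were logged after it.
   */
  public static Map<String, String> recover(Map<String, String> image,
                                            long imageTxid,
                                            List<Edit> log) {
    Map<String, String> schemas = new TreeMap<>(image);
    for (Edit e : log) {
      if (e.txid <= imageTxid) {
        continue;  // already reflected in the image
      }
      if (e.kind == Kind.REMOVE) {
        schemas.remove(e.name);
      } else {
        schemas.put(e.name, e.definition);
      }
    }
    return schemas;
  }

  public static void main(String[] args) {
    Map<String, String> image = new TreeMap<>();
    image.put("a", "k=6,m=3");
    image.put("b", "k=10,m=4");

    List<Edit> log = Arrays.asList(
        new Edit(5, Kind.REMOVE, "a", null),      // older than the image
        new Edit(11, Kind.PUT, "c", "k=3,m=2"),
        new Edit(12, Kind.REMOVE, "b", null));

    System.out.println(recover(image, 10, log));
  }
}
```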

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Xinwei Qin 

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-02 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393895#comment-14393895
 ] 

Xinwei Qin  commented on HDFS-7859:
---

Hi [~drankye], I'm interested in this issue. If you have no time to work on 
it, could you reassign it to me? Thanks.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.





[jira] [Commented] (HDFS-7859) Erasure Coding: Persist EC schemas in NameNode

2015-04-02 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393901#comment-14393901
 ] 

Kai Zheng commented on HDFS-7859:
-

Great, please take it. Thanks.

 Erasure Coding: Persist EC schemas in NameNode
 --

 Key: HDFS-7859
 URL: https://issues.apache.org/jira/browse/HDFS-7859
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Zheng

 In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
 persist EC schemas in NameNode centrally and reliably, so that EC zones can 
 reference them by name efficiently.


