[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-08-23 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138036#comment-16138036
 ] 

Ewan Higgs commented on HDFS-11639:
---

{quote}Hi Ewan Higgs, As this change is required only for writes, we can move 
this to HDFS-12090. Are you OK with that?
{quote}

Sure. It needs a rebase as well, I see.

> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch, HDFS-11639-HDFS-9806.003.patch, 
> HDFS-11639-HDFS-9806.004.patch, HDFS-11639-HDFS-9806.005.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from, i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the {{PROVIDED}} storage type.
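
The issue description lists what a {{BlockAlias}} carries (URI, offset, length, nonce). A minimal sketch of round-tripping such a record through an opaque byte array might look like the following. It is only an illustration: {{BlockAliasSketch}} and its field names are hypothetical, and it uses plain {{DataOutputStream}} framing rather than the protobuf messages the patch works with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.URI;

/** Hypothetical stand-in for a block alias; this is not the HDFS class. */
final class BlockAliasSketch {
  final URI uri;       // where the external data lives
  final long offset;   // start of the block within that resource
  final long length;   // number of bytes in the block
  final byte[] nonce;  // opaque value used to detect a changed resource

  BlockAliasSketch(URI uri, long offset, long length, byte[] nonce) {
    this.uri = uri;
    this.offset = offset;
    this.length = length;
    this.nonce = nonce.clone();
  }

  /** Encode the alias as bytes that a client can carry without interpreting. */
  byte[] toBytes() throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeUTF(uri.toString());
      out.writeLong(offset);
      out.writeLong(length);
      out.writeInt(nonce.length);
      out.write(nonce);
    }
    return buf.toByteArray();
  }

  /** Decode on the receiving side (the Datanode in the discussed design). */
  static BlockAliasSketch fromBytes(byte[] bytes) throws IOException {
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes))) {
      URI uri = URI.create(in.readUTF());
      long offset = in.readLong();
      long length = in.readLong();
      byte[] nonce = new byte[in.readInt()];
      in.readFully(nonce);
      return new BlockAliasSketch(uri, offset, length, nonce);
    }
  }

  public static void main(String[] args) throws IOException {
    BlockAliasSketch alias = new BlockAliasSketch(
        URI.create("s3a://bucket/data/part-0"), 0L, 1024L, new byte[] {1, 2, 3});
    BlockAliasSketch copy = fromBytes(alias.toBytes());
    System.out.println(copy.uri + " offset=" + copy.offset
        + " length=" + copy.length);
  }
}
```

The point of the byte-array form is that intermediate parties only forward the bytes; only the side that actually reads the external data needs to decode them.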



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137939#comment-16137939
 ] 

Hadoop QA commented on HDFS-11639:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} | {color:red} HDFS-11639 does not apply to HDFS-9806. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11639 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12869427/HDFS-11639-HDFS-9806.005.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/20816/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-08-22 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137937#comment-16137937
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

Hi [~ehiggs], As this change is required only for writes, we can move this to  
HDFS-12090. Are you OK with that?




[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021142#comment-16021142
 ] 

Hadoop QA commented on HDFS-11639:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m  4s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 37s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 57s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 33s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 30s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 30s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-9806 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} HDFS-9806 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 55s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new + 1188 unchanged - 7 fixed = 1202 total (was 1195) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 29s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 51s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m  2s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11639 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12869427/HDFS-11639-HDFS-9806.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  cc  |
| uname | Linux 760e177c73e7 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-9806 / 5d021f3 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 

[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-18 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015479#comment-16015479
 ] 

Ewan Higgs commented on HDFS-11639:
---

{quote}Any particular reason for changing BlockProvider to implement 
Iterable<BlockAlias> instead of Iterable<Block>?{quote}
Yes, as the purpose is to put the {{BlockAlias}} into the client protocol, the 
{{ProvidedStorageMap}} needs to get more than just the {{Block}}. This was done 
by changing the {{BlockProvider}} to return a {{BlockAlias}} instead of the 
{{Block}}. 

{quote}Was blockId intentionally left out of FileRegionProto even though 
FileRegion contains it?{quote}
Yes, this was done for two reasons:

1. The blockid is already in the message. Having it in two locations is a bug 
vector and more wasteful than it needs to be.
2. The FileRegion is really the value in the key value store. The blockid is 
the key. I was going to investigate whether the blockid could be pulled out and 
mapping a {{FileRegion}} to a blockid would be done by association rather than 
embedding the value in the structure, but it's very low priority and well 
beyond the scope of this PR.

If you think the blockid should be in the {{FileRegionProto}} so it maps 
exactly onto the {{FileRegion}} as it exists today, I'm fine with putting it in.
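
The key/value argument above can be sketched as follows. This is only an illustration under assumptions: {{BlockMapSketch}} and {{Region}} are hypothetical names, and an in-memory {{HashMap}} stands in for whatever store actually backs the block map. The block ID appears once, as the key, and the stored value carries only the location data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical block map: the block ID is the key, the region is the value. */
final class BlockMapSketch {

  /** Value type with no block ID field, mirroring the argument above. */
  static final class Region {
    final String path;
    final long offset;
    final long length;

    Region(String path, long offset, long length) {
      this.path = path;
      this.offset = offset;
      this.length = length;
    }
  }

  private final Map<Long, Region> regions = new HashMap<>();

  /** Associate a block ID with its region; the ID is not duplicated in the value. */
  void put(long blockId, Region region) {
    regions.put(blockId, region);
  }

  /** Look up the region for a block ID, or empty if the block is unknown. */
  Optional<Region> resolve(long blockId) {
    return Optional.ofNullable(regions.get(blockId));
  }

  public static void main(String[] args) {
    BlockMapSketch map = new BlockMapSketch();
    map.put(1001L, new Region("/external/file.dat", 0L, 4096L));
    map.resolve(1001L).ifPresent(r ->
        System.out.println(r.path + " offset=" + r.offset + " length=" + r.length));
  }
}
```

Because the ID lives only in the key, a value can never disagree with the key it is stored under, which is the "bug vector" concern raised in point 1.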




[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014860#comment-16014860
 ] 

Hadoop QA commented on HDFS-11639:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 31s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 30s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 57s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 27s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 29s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 33s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-9806 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} HDFS-9806 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 53s{color} | {color:orange} hadoop-hdfs-project: The patch generated 31 new + 1271 unchanged - 18 fixed = 1302 total (was 1289) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 33s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 16s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 29s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}107m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.setBlockAlias(byte[]) may expose internal representation by storing an externally mutable object into BlockReaderFactory.blockAlias  At BlockReaderFactory.java:[line 311] |
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
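
The FindBugs warning on {{setBlockAlias(byte[])}} is about keeping a reference to an array that the caller can still modify. One common way to address this kind of warning is a defensive copy, sketched below. {{AliasHolderSketch}} is a hypothetical stand-in, not the actual {{BlockReaderFactory}}, and the patch may well resolve the warning differently (for example by suppressing it).

```java
/** Hypothetical holder showing a defensive copy of a mutable byte array. */
final class AliasHolderSketch {
  private byte[] blockAlias;

  /** Store a private copy so later changes by the caller are not visible here. */
  void setBlockAlias(byte[] alias) {
    this.blockAlias = (alias == null) ? null : alias.clone();
  }

  /** Return a copy so callers cannot modify the internal array either. */
  byte[] getBlockAlias() {
    return (blockAlias == null) ? null : blockAlias.clone();
  }

  public static void main(String[] args) {
    byte[] callerBytes = {1, 2, 3};
    AliasHolderSketch holder = new AliasHolderSketch();
    holder.setBlockAlias(callerBytes);
    callerBytes[0] = 99; // mutate the caller's array after handing it over
    System.out.println("first stored byte: " + holder.getBlockAlias()[0]);
  }
}
```

The copy costs one allocation per call, which is usually acceptable for a small alias but is a trade-off worth weighing on a hot path.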
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11639 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12868593/HDFS-11639-HDFS-9806.004.patch |
| Optional Tests |  asflicense  compile  

[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-17 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014695#comment-16014695
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

Thanks for the updated patch [~ehiggs]! A couple of questions:

# Any particular reason for changing {{BlockProvider}} to implement 
{{Iterable<BlockAlias>}} instead of {{Iterable<Block>}}? 
# Was {{blockId}} intentionally left out of {{FileRegionProto}} even though 
{{FileRegion}} contains it?

I fixed the following issues (along with some checkstyle fixes) and am posting 
a modified patch.

- In {{ProvidedBlocksBuilder#newLocatedBlock}}, the following code: 

{code}
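  FileRegion fileRegion;
  try {
    fileRegion = (FileRegion) blockProvider.resolve(eb.getLocalBlock());
  } catch (IOException e) {
    // Pass the block for the {} placeholder; the trailing exception is logged with its stack trace.
    LOG.error("Could not resolve PROVIDED block: {}", eb, e);
    fileRegion = null;
  }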
{code}

is moved inside the {{if(hasProvidedLocations)}} block.

- Modified {{Sender#transferBlock}} to add a null check for {{blockAlias}}

- Removed a redundant {{proto.build()}} in {{Sender#readBlock}}
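
The first two fixes above (resolve only inside the {{hasProvidedLocations}} branch, and null-check the alias before using it) can be sketched in isolation as follows. Everything here is hypothetical: {{LazyResolveSketch}}, {{Resolver}}, {{aliasFor}} and {{describe}} are illustration names, not methods from the patch.

```java
import java.io.IOException;

/** Hypothetical sketch: resolve an alias only for provided blocks. */
final class LazyResolveSketch {

  /** Stand-in for a provider lookup that may fail with an IOException. */
  interface Resolver {
    byte[] resolve(long blockId) throws IOException;
  }

  /** Returns the alias bytes, or null for non-provided or unresolvable blocks. */
  static byte[] aliasFor(long blockId, boolean hasProvidedLocations,
      Resolver resolver) {
    if (!hasProvidedLocations) {
      return null; // skip the lookup entirely for ordinary blocks
    }
    try {
      return resolver.resolve(blockId);
    } catch (IOException e) {
      System.err.println("Could not resolve PROVIDED block " + blockId
          + ": " + e.getMessage());
      return null;
    }
  }

  /** Consumer side: check for null before touching the alias. */
  static String describe(byte[] blockAlias) {
    return (blockAlias == null)
        ? "no alias"
        : blockAlias.length + " alias bytes";
  }

  public static void main(String[] args) {
    Resolver resolver = id -> new byte[] {1, 2, 3, 4};
    System.out.println(describe(aliasFor(1L, false, resolver)));
    System.out.println(describe(aliasFor(2L, true, resolver)));
  }
}
```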






[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014174#comment-16014174
 ] 

Hadoop QA commented on HDFS-11639:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 38s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 27s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  0s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 32s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 33s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 51s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-9806 has 10 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 28s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-9806 has 3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} HDFS-9806 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 25s{color} | {color:red} hadoop-hdfs-project generated 1 new + 55 unchanged - 0 fixed = 56 total (was 55) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 57s{color} | {color:orange} hadoop-hdfs-project: The patch generated 48 new + 1270 unchanged - 18 fixed = 1318 total (was 1288) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 41s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 16s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 37s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client |
|  |  org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.setBlockAlias(byte[]) may expose internal representation by storing an externally mutable object into BlockReaderFactory.blockAlias  At BlockReaderFactory.java:[line 311] |
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestFileAppend4 |
|   | hadoop.hdfs.crypto.TestHdfsCryptoStreams |
|   | 

[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-15 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011095#comment-16011095
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

Also, how about creating a new sub-task for the refactoring in the 
FsDatasetImpl? That seems separable from this task, and will keep the patch 
sizes reasonable.




[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-15 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011091#comment-16011091
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

bq. If this entails a protocol change, I think it makes the most sense to do it 
at this point so all the protocol changes happen up front in one change if we 
need this to get in for 3.0. 
Agree

bq. Does it make sense to have the BlockAlias in transferBlock? If we know the 
targetStorageTypes and targetStorageIDs then we can know that nothing needs to 
be transferred. Or is this an issue if we want to transfer from PROVIDED to 
DISK?
We would need a non-null {{BlockAlias}} in {{transferBlock}} whenever 
{{transferBlock}} is called for a PROVIDED replica. This will happen when (a) a 
data write pipeline fails mid-way and a new datanode is added for the PROVIDED 
replica, and (b) a provided replica has to be created from a Finalized (local) 
replica.

bq. With the pending refactoring of the FsDatasetImpl which won't have replicas 
a priori, I wonder if it makes sense for the Datanode to have a 
FileRegionProvider or BlockProvider at all. They are given the appropriate 
block ID and block alias in the readBlock or writeBlock message. Maybe I'm 
overlooking what's still being provided.
I was trying to reconcile the existing design (FsDatasetImpl knows about 
provided blocks a priori) with the new design, where FsDatasetImpl does not 
know about them beforehand but constructs them on the fly using the 
{{BlockAlias}} from {{readBlock}} or {{writeBlock}}. Using 
{{BlockProvider#resolve()}} allows both designs to exist in parallel. I was 
wondering whether we should still retain the earlier design given the latter.
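
One way to picture the two designs coexisting is a Datanode-side lookup that prefers the alias carried in the request and falls back to a locally known block map when none was sent. The sketch below is only an illustration of that idea: {{AliasLookupSketch}} and {{locate}} are hypothetical, and aliases are reduced to plain strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of supporting both lookup designs side by side. */
final class AliasLookupSketch {

  /**
   * Prefer the alias propagated with the request (new design); otherwise
   * consult a block map known ahead of time (existing design).
   */
  static Optional<String> locate(long blockId, String aliasFromRequest,
      Map<Long, String> localBlockMap) {
    if (aliasFromRequest != null) {
      return Optional.of(aliasFromRequest);
    }
    return Optional.ofNullable(localBlockMap.get(blockId));
  }

  public static void main(String[] args) {
    Map<Long, String> local = new HashMap<>();
    local.put(1L, "file:///local/map/entry");

    // Alias sent with the request: the local map is not consulted.
    System.out.println(locate(1L, "s3a://bucket/object", local));
    // No alias sent: fall back to the map the Datanode already has.
    System.out.println(locate(1L, null, local));
    // Neither source knows the block.
    System.out.println(locate(2L, null, local));
  }
}
```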






[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-15 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010483#comment-16010483
 ] 

Ewan Higgs commented on HDFS-11639:
---

{quote}
Btw, I rebased the HDFS-9806 branch on the most recent version of trunk (hash 
83dd14aa84ad697ad32c51007ac31ad39feb4288).{quote}
Thanks!
{quote}In DataTransfer#run(), the blockAlias should be null unless it is for a 
provided block. I think this will entail adding BlockAlias to transferBlock() 
and also to BlockCommand for the DatanodeProtocol.DNA_TRANSFER command. 
However, this will only be relevant for writing provided blocks (and in 
particular recovery).{quote}
If this entails a protocol change, I think it makes the most sense to do it at 
this point so all the protocol changes happen up front in one change if we need 
this to get in for 3.0. 
Does it make sense to have the BlockAlias in transferBlock? If we know the 
targetStorageTypes and targetStorageIDs then we can know that nothing needs to 
be transferred. Or is this an issue if we want to transfer from PROVIDED to 
DISK?

{quote}Looking over this patch, one thing that occurred to me is if it makes 
sense to unify FileRegionProvider with BlockProvider? They both have very close 
functionality.{quote}
I think this makes a lot of sense.
{quote}I like the use of BlockProvider#resolve(). If we unify FileRegionProvider 
with BlockProvider, then resolve can return null if the block map is accessible 
from the Datanodes also. If it is accessible only from the Namenode, then a 
non-null value can be propagated to the Datanode.{quote}
With the pending refactoring of the FsDatasetImpl which won't have replicas a 
priori, I wonder if it makes sense for the Datanode to have a 
FileRegionProvider or BlockProvider at all. They are given the appropriate 
block ID and block alias in the readBlock or writeBlock message. Maybe I'm 
overlooking what's still being provided.




[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-11 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007254#comment-16007254
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

Btw, I rebased the HDFS-9806 branch on the most recent version of trunk (hash 
83dd14aa84ad697ad32c51007ac31ad39feb4288).

> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Ewan Higgs
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from. i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.






[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-11 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007247#comment-16007247
 ] 

Virajith Jalaparti commented on HDFS-11639:
---

Thanks for working on this [~ehiggs]. I think the general direction of the 
patch (using {{BlockProvider#resolve()}} in {{ProvidedBlocksBuilder}} to get 
the {{BlockAlias}}, passing it along as a set of bytes to the client, and 
resolving it only on the Datanode) looks good. 
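
To make the "opaque bytes" idea concrete, here is a minimal sketch of round-tripping an alias (URI, offset, length, nonce, as listed in the issue description) through a byte array. The class {{AliasBytesSketch}} and its wire layout are hypothetical and use plain {{java.io}} streams purely for illustration; the patch itself encodes the alias through the protobuf messages, not this format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a block alias; not the real BlockAlias type. */
final class AliasBytesSketch {
  final String uri;
  final long offset;
  final long length;
  final byte[] nonce;

  AliasBytesSketch(String uri, long offset, long length, byte[] nonce) {
    this.uri = uri;
    this.offset = offset;
    this.length = length;
    this.nonce = nonce.clone();
  }

  /** Namenode side: flatten the alias so the client can carry it unread. */
  byte[] toBytes() {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeUTF(uri);
      out.writeLong(offset);
      out.writeLong(length);
      out.writeInt(nonce.length);
      out.write(nonce);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.toByteArray();
  }

  /** Datanode side: the only place where the bytes are interpreted. */
  static AliasBytesSketch fromBytes(byte[] bytes) {
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes))) {
      String uri = in.readUTF();
      long offset = in.readLong();
      long length = in.readLong();
      byte[] nonce = new byte[in.readInt()];
      in.readFully(nonce);
      return new AliasBytesSketch(uri, offset, length, nonce);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The client never calls {{fromBytes}} in this sketch; it only forwards the array, which is what keeps the alias format private to the Namenode and Datanode.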

A few comments:

* In {{ProvidedBlocksBuilder#newLocatedBlock}}, the {{fileRegion}} should be 
resolved only if the block has {{PROVIDED}} locations (i.e., 
{{hasProvidedLocations}} is true). When {{dfs.namenode.provided.enabled}} is 
set to true, every {{LocatedBlock}} is created in this method, and 
non-provided blocks do not need a {{BlockAlias}} resolution.

* {{PBHelperClient#convertLocatedBlockProto()}} and 
{{PBHelperClient#convertLocatedBlock()}} should be modified to decode/encode 
the {{BlockAlias}} bytes.

* How about decoding the {{blockAlias}} bytes in {{DataXceiver#readBlock}} 
using a new {{DataTransferProtoUtil#blockAliasFromProto(byte[] blockAlias)}} 
method instead of using the {{BlockAlias#builder()}}? The former will be 
in line with the way the protobufs are decoded in {{DataXceiver}}. Further, if 
a different {{BlockAlias}} is used in the future, the current implementation, 
which uses the {{FileRegion#Builder}} in {{DataXceiver#readBlock}}, will be 
hard to extend: it would end up trying {{FileRegion#Builder}}, then 
{{BlockAliasXX#Builder}} if that returns null, and so on. 

* Similar to passing on {{BlockAlias}} from {{DataXceiver#readBlock}} to 
{{BlockSender}}, it should be passed along from {{DataXceiver#writeBlock}} to 
{{BlockReceiver}}. However, we would not need it till we have writes 
implemented.

* {{DataStreamer#blockAlias}} is always null at present. I think it should be 
initialized in {{DFSOutputStream}}.

* The current implementation of {{TextReader#resolve}} can be expensive, as it 
has to scan through the whole list of blocks. I don't have a very good solution 
for this; one alternative would be to maintain an in-memory map to store this 
information. I think it would be good to have both implementations, although 
this is not a blocker.

* In {{DataTransfer#run()}}, the {{blockAlias}} should be null unless it is for 
a provided block. I think this will entail adding BlockAlias to 
{{transferBlock()}} and also to {{BlockCommand}} for the 
{{DatanodeProtocol.DNA_TRANSFER}} command. However, this will only be relevant 
for writing provided blocks (and in particular recovery).

* I don't think 
{{TestDataTransferProtocol#testDataTransferProtocolWithBlockAlias}} is 
currently correct or tests something useful. Shouldn't the block alias be 
relevant only for provided blocks?
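
On the {{TextReader#resolve}} point above, the trade-off between the two implementations could look roughly like the following. This is only a sketch under assumed names ({{RegionSketch}}, {{ScanningResolver}} and {{IndexedResolver}} are all hypothetical and stand in for the real classes); it compares a per-lookup scan with an up-front in-memory index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical alias record: where the bytes of a provided block live. */
final class RegionSketch {
  final long blockId;
  final String uri;
  final long offset;
  final long length;

  RegionSketch(long blockId, String uri, long offset, long length) {
    this.blockId = blockId;
    this.uri = uri;
    this.offset = offset;
    this.length = length;
  }
}

/** Scans the whole list on every lookup: O(n) time, no extra memory. */
final class ScanningResolver {
  private final List<RegionSketch> regions;

  ScanningResolver(List<RegionSketch> regions) {
    this.regions = new ArrayList<>(regions);
  }

  Optional<RegionSketch> resolve(long blockId) {
    for (RegionSketch r : regions) {
      if (r.blockId == blockId) {
        return Optional.of(r);
      }
    }
    return Optional.empty();
  }
}

/** Builds an index once: O(1) expected lookups, one map entry per block. */
final class IndexedResolver {
  private final Map<Long, RegionSketch> byBlockId = new HashMap<>();

  IndexedResolver(List<RegionSketch> regions) {
    for (RegionSketch r : regions) {
      byBlockId.put(r.blockId, r);
    }
  }

  Optional<RegionSketch> resolve(long blockId) {
    return Optional.ofNullable(byBlockId.get(blockId));
  }
}
```

The indexed variant pays for its speed with memory proportional to the number of provided blocks, which is why keeping both behind the same interface seems reasonable.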

Looking over this patch, one thing that occurred to me is if it makes sense to 
unify {{FileRegionProvider}} with {{BlockProvider}}? They both have very close 
functionality.

I like the use of {{BlockProvider#resolve()}}. If we unify 
{{FileRegionProvider}} with {{BlockProvider}}, then {{resolve}} can return 
{{null}} if the block map is accessible from the Datanodes also. If it is 
accessible only from the Namenode, then a non-null value can be propagated to 
the Datanode.

One of the motivations for adding the {{BlockAlias}} to the client protocol was 
to have the blocks map only on the Namenode. In this scenario, the 
{{ReplicaMap}} in {{FsDatasetImpl}} will not have any replicas a priori. 
Thus, one way to ensure that the {{FsDatasetImpl}} interface continues to 
function as it does today is to create a {{FinalizedProvidedReplica}} in 
{{FsDatasetImpl#getBlockInputStream}} when {{BlockAlias}} is not null.
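
A rough sketch of that last idea, with hypothetical stand-ins ({{DatasetSketch}} and {{ReplicaSketch}} are not the real {{FsDatasetImpl}} or replica classes, and the alias is reduced to a URI string): when the request carries an alias, a provided replica is built on the spot; otherwise the usual replica-map lookup applies. Whether the on-demand replica should also be registered in the map is left open here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replica descriptor; not the real HDFS replica classes. */
final class ReplicaSketch {
  final long blockId;
  final String location;
  final boolean provided;

  ReplicaSketch(long blockId, String location, boolean provided) {
    this.blockId = blockId;
    this.location = location;
    this.provided = provided;
  }
}

/** Hypothetical dataset whose map only knows about locally stored replicas. */
final class DatasetSketch {
  private final Map<Long, ReplicaSketch> replicaMap = new HashMap<>();

  void addLocal(long blockId, String path) {
    replicaMap.put(blockId, new ReplicaSketch(blockId, path, false));
  }

  /**
   * If the request carries an alias, build a provided replica from it.
   * Otherwise fall back to the map, as for ordinary local blocks.
   */
  ReplicaSketch replicaFor(long blockId, String aliasUri) {
    if (aliasUri != null) {
      return new ReplicaSketch(blockId, aliasUri, true);
    }
    ReplicaSketch local = replicaMap.get(blockId);
    if (local == null) {
      throw new IllegalArgumentException("no replica for block " + blockId);
    }
    return local;
  }
}
```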



> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Ewan Higgs
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from. i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.






[jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-11 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006131#comment-16006131
 ] 

Ewan Higgs commented on HDFS-11639:
---

I forgot to mention that the current patch requires 
6e4c5539c50cdea1f4d307c4f4273dc42af5601c and 
6fcaf3097156aa83e83def29006198fb4a632163 to be cherry-picked into the branch.

> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from. i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.


