[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-12 Thread genericqa (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324270#comment-16324270 ]

genericqa commented on HDFS-12911:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 19s{color} | {color:red} root in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  1s{color} | {color:green} HDFS-10285 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 20s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 21 new + 334 unchanged - 0 fixed = 355 total (was 334) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  1s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m  9s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  5m 59s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12911 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12905895/HDFS-12911-HDFS-10285-03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4f5b3d18093b 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / ed60479 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/22659/artifact/out/branch-mvninstall-root.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/22659/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt |
| mvnsite | 

[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-04 Thread Rakesh R (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311247#comment-16311247 ]

Rakesh R commented on HDFS-12911:
-

FYI, as per discussion with Uma, a few tasks have been re-prioritized to make the 
refactoring easier: (1) push the lock-optimization HDFS-12982 patch to the branch 
first, and (2) after that, the [HDFS-12911 
patch|https://issues.apache.org/jira/secure/attachment/12903981/HDFS-12911-HDFS-10285-02.patch].
 Please excuse any confusion caused by shifting focus to another jira from here.

> [SPS]: Fix review comments from discussions in HDFS-10285
> -
>
> Key: HDFS-12911
> URL: https://issues.apache.org/jira/browse/HDFS-12911
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-12911-HDFS-10285-01.patch, 
> HDFS-12911-HDFS-10285-02.patch, HDFS-12911.00.patch
>
>
> This is the JIRA for tracking the possible improvements or issues discussed 
> in the main JIRA.
> Comments to handle so far:
> Daryn:
>  # The lock should not be kept while executing the placement policy.
>  # While starting up the NN, the SPS Xattr checks happen even if the feature 
> is disabled. This could potentially impact startup speed.
> UMA:
> # I am adding one more possible improvement, to reduce Xattr objects 
> significantly. The SPS Xattr is a constant object, so we can create one 
> deduplicated Xattr object statically and use the same object reference 
> whenever the SPS Xattr needs to be added to an Inode. The additional bytes 
> required for storing the SPS Xattr then shrink to a single object reference 
> (i.e., 4 bytes on 32-bit), so the Xattr overhead should come down 
> significantly IMO. Let's explore the feasibility of this option.
> The Xattr list Feature will not be created specially for SPS; that list 
> would already have been created by SetStoragePolicy on the same directory, 
> so there is no extra Feature creation because of SPS alone.
> # Currently, SPS puts Long id objects in a Q to track the Inodes on which 
> SPS was called. Each is additionally created, and its size is (obj ref + 
> value) = (8 + 8) bytes [ignoring alignment for the time being].
> The possible improvement here is to keep the existing Inode object for 
> tracking instead of creating a new Long object. The advantage is that the 
> Inode object is already maintained in the NN, so no new object creation is 
> needed; we just maintain one object reference. The above two points should 
> significantly reduce the memory requirements of SPS: per SPS call, 8 bytes 
> for called-inode tracking + 8 bytes for the Xattr ref.
> # Use LightWeightLinkedSet instead of LinkedList for the front Q. This will 
> avoid unnecessary Node creations inside LinkedList (see the sketch below).
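
For illustration, a minimal Java sketch of UMA's improvements above (the placeholder types stand in for the real HDFS {{XAttr}}, {{INode}}, and {{LightWeightLinkedSet}}; the field and method names are hypothetical, not the actual patch):

{code:java}
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: one shared SPS xattr object plus inode-reference
// tracking, instead of a fresh XAttr and a boxed Long per SPS call.
final class SpsTrackingSketch {
  record XAttr(String name) {}
  interface INode { void addXAttrFeature(XAttr x); }

  // Improvement 1: a single constant SPS xattr, created once and shared by
  // every inode, so each inode pays only one object reference for it.
  static final XAttr SPS_XATTR = new XAttr("system.hdfs.sps");

  // Improvements 2 and 3: track the inode objects themselves (no new Long
  // per call). LinkedHashSet stands in here for HDFS's LightWeightLinkedSet;
  // set semantics also deduplicate repeated calls on the same inode.
  private final Set<INode> frontQ = new LinkedHashSet<>();

  void satisfyStoragePolicy(INode inode) {
    inode.addXAttrFeature(SPS_XATTR); // shared reference, no new XAttr object
    frontQ.add(inode);                // existing inode reference, no new Long
  }
}
{code}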






[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-03 Thread Rakesh R (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310137#comment-16310137 ]

Rakesh R commented on HDFS-12911:
-

bq. readLockNS/readUnlockNS are sensitive APIs to expose to an external service.
Thanks [~chris.douglas] for the comment. I have uploaded an initial patch to the 
HDFS-12982 jira. I've used the {{Context}} interface proposed earlier in this 
jira and added simple methods to it that the SPS class can use. Mainly, the 
query part fetches an {{HdfsLocatedFileStatus}} from the NN and performs the 
computation and assignment logic based on that file status object. Please 
review the approach/patch when you get a chance. Thanks!
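
For reference, a rough sketch of the shape being described, with assumed method names (the actual interface is the one defined in the HDFS-12982 patch):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus;

// Assumed shape of the simplified Context: instead of exposing namesystem
// lock/unlock, the SPS asks the NN for a file's located status and runs its
// computation and assignment logic against that snapshot.
public interface Context {
  boolean isRunning();  // combined SPS + namesystem liveness check
  HdfsLocatedFileStatus getFileInfo(long inodeId) throws IOException;
}
{code}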







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-03 Thread Daryn Sharp (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309914#comment-16309914 ]

Daryn Sharp commented on HDFS-12911:


Will review soon.







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-02 Thread Uma Maheswara Rao G (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309222#comment-16309222 ]

Uma Maheswara Rao G commented on HDFS-12911:


Hi [~chris.douglas], thank you so much for your time and the helpful review.

{quote}
readLockNS/readUnlockNS are sensitive APIs to expose to an external service. 
If the SPS dies, stalls, or loses connectivity to the NN without releasing the 
lock: what happens? Instead, would it be possible to break up 
analyseBlocksStorageMovementsAndAssignToDN into operations that, if any fail 
due to concurrent modification, could abort and retry? Alternatively (and as 
in the naive HDFS-12911.00.patch), the server side of the RPC could be 
"heavier", taking the lock and running this code server-side. This would 
counter many of the advantages of running it externally, unfortunately.
{quote}

I should have explained a bit about the current APIs in Context. Actually, not 
all of the APIs in Context may be needed once we refine the lock usage in the 
SPS code. Rakesh has worked on a patch (HDFS-12982) to reduce locking and 
refine the APIs for accessing the namesystem; he should upload it shortly. So, 
these lock APIs will go away from Context.

I will probably wait until HDFS-12982 to refine this patch. I just don't want 
to mix the lock refinement with this refactoring patch.

{quote}
getSPSEngine may leak a heavier abstraction than is necessary. As used, it 
appears to check for liveness of the remote service, which can be a single 
call (i.e., instead of checking that both the SPS and namesystem are running, 
create a call that checks both in the context object), or it's used to get a 
reference to another component. The context could return those components 
directly, rather than returning the SPS. It would be cleaner still if the 
underlying action could be behind the context API, so the caller isn't 
managing multiple abstractions.
{quote}
Good point. I agree to move the SPS engine part completely behind the Context 
and expose only what is required. Actually, until today I was not clear on how 
we wanted to establish client-to-external-SPS communication, so I left the 
startup and queueing logic as-is in BlockManager, which needs an SPS instance 
to operate. I can refine it to push that completely behind the Context 
abstraction. Thanks for the comment.

{quote}
The BlockMoveTaskHandler and FileIDCollector abstract the "write" (scheduling 
moves) and "read" (scan namespace) interfaces?
{quote}
Yes

{quote}
addSPSHint has no callers in this patch? Is this future functionality, or only 
here for symmetry with removeSPSHint?
{quote}
I thought this may be useful if an external SPS wants to implement some 
mechanism other than Xattrs to track SPS calls. The internal SPS sets the 
Xattrs in the block manager itself, so it is not using this now.

{quote}
Are these calls idempotent? Or would an external provider need to resync with 
the NN on RPC timeouts?
{quote}
I think setting Xattrs is non-idempotent. The SPS can use the set-Xattr APIs 
exposed by the HDFS client; there we have a retry-cache implementation that 
handles retries.
Even with alternative implementations, I feel an external provider would need 
to resync with the NN on RPC timeouts, since not all calls may be idempotent. 
But the majority of the APIs would be simple getters, so they should be 
idempotent IMO. Once we finalize the APIs, we need to mark them accordingly.
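
Once the APIs are finalized, marking them could follow the same convention {{ClientProtocol}} uses; a hedged sketch (the interface and method names below are hypothetical):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus;
import org.apache.hadoop.io.retry.AtMostOnce;
import org.apache.hadoop.io.retry.Idempotent;

// Hypothetical external-SPS protocol showing how calls could be marked.
public interface SpsContextProtocol {
  @Idempotent  // pure getter: safe for the client to retry on an RPC timeout
  HdfsLocatedFileStatus getFileInfo(long inodeId) throws IOException;

  @AtMostOnce  // mutation: relies on the NN retry cache to absorb retries
  void removeSpsXAttr(long inodeId) throws IOException;
}
{code}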

{quote}
Since they're stubs in this patch, could we move the external implementation 
to a separate JIRA? Just to keep the patch smaller.
{quote}

Sure. I included those classes in this first version to make the idea clearer 
for reviewers, and I will separate them from this patch once the approaches 
are agreed upon. Since we seem mostly clear on the approaches, I will remove 
them in the next version.




[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2018-01-02 Thread Chris Douglas (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308948#comment-16308948 ]

Chris Douglas commented on HDFS-12911:
--

A few questions about the Context API in [^HDFS-12911-HDFS-10285-02.patch]. In 
general, a few methods might not be cleanly implementable outside the NN:
* {{readLockNS}}/{{readUnlockNS}} are sensitive APIs to expose to an external 
service. If the SPS dies, stalls, or loses connectivity to the NN without 
releasing the lock: what happens? Instead, would it be possible to break up 
{{analyseBlocksStorageMovementsAndAssignToDN}} into operations that, if any 
fail due to concurrent modification, could abort and retry (see the sketch 
after this list)? Alternatively (and as in the naive [^HDFS-12911.00.patch]), 
the server side of the RPC could be "heavier", taking the lock and running 
this code server-side. This would counter many of the advantages of running it 
externally, unfortunately.
* {{getSPSEngine}} may leak a heavier abstraction than is necessary. As used, 
it appears to check for liveness of the remote service, which can be a single 
call (i.e., instead of checking that both the SPS and namesystem are running, 
create a call that checks both in the context object), or it's used to get a 
reference to another component. The context could return those components 
directly, rather than returning the SPS. It would be cleaner still if the 
underlying action could be behind the context API, so the caller isn't managing 
multiple abstractions.
* The {{BlockMoveTaskHandler}} and {{FileIDCollector}} abstract the "write" 
(scheduling moves) and "read" (scan namespace) interfaces? 
* {{addSPSHint}} has no callers in this patch? Is this future functionality, or 
only here for symmetry with {{removeSPSHint}}?
* Are these calls idempotent? Or would an external provider need to resync with 
the NN on RPC timeouts?
* Since they're stubs in this patch, could we move the external implementation 
to a separate JIRA? Just to keep the patch smaller.
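
On the first point, a minimal sketch of the abort-and-retry alternative (all names here are hypothetical stand-ins, not the patch's API):

{code:java}
// Analyse against a lock-free snapshot, then commit only if the file was not
// modified concurrently; otherwise abort the attempt and retry.
final class OptimisticSatisfier {
  record FileSnapshot(long inodeId, long mtime) {}
  record MovePlan(long inodeId) {}
  interface NNContext {
    FileSnapshot getFileInfo(long inodeId); // read-only RPC; no NN lock exposed
    void submitMoves(MovePlan plan);
  }

  private final NNContext nn;
  OptimisticSatisfier(NNContext nn) { this.nn = nn; }

  boolean satisfyWithRetry(long inodeId, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      FileSnapshot before = nn.getFileInfo(inodeId);
      if (before == null) return false;       // file gone; nothing to satisfy
      MovePlan plan = new MovePlan(inodeId);   // analysis done without the lock
      FileSnapshot after = nn.getFileInfo(inodeId);
      if (after != null && after.mtime() == before.mtime()) {
        nn.submitMoves(plan);                  // snapshot still valid: commit
        return true;
      }
      // concurrent modification detected: abort this attempt and retry
    }
    return false;
  }
}
{code}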







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-28 Thread genericqa (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306025#comment-16306025 ]

genericqa commented on HDFS-12911:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} HDFS-10285 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 18s{color} | {color:red} root in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 10s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  0s{color} | {color:green} HDFS-10285 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 19s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 10s{color} | {color:red} hadoop-hdfs in HDFS-10285 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 39s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 61 new + 393 unchanged - 0 fixed = 454 total (was 393) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m  9s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  8s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m  9s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  6m 21s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12911 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903981/HDFS-12911-HDFS-10285-02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ab81a0066861 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-10285 / 7f4c4ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-mvninstall-root.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/22520/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs.txt |
| mvnsite | 

[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-18 Thread Uma Maheswara Rao G (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295918#comment-16295918 ]

Uma Maheswara Rao G commented on HDFS-12911:


I don't like to repeat my answers again and again, as I already gave my 
explanations in HDFS-10285.

However, I would like to give a little clarification on #1, to avoid confusion 
for others.

{quote}
I consider this feature far from complete and I am worried that other feature 
additions will hurt Namenode more and more.
{quote}
That could be your opinion. This feature does not intend to add any new 
policies; please look at the JIRA description/design doc.
The goal of this work is to make the NN state and the storage state match. The 
current policies are the basic core policies that we persist in the NN, and 
SPS works to satisfy those core policy states. I am not sure you have 
additional policies that require persistence in the NN at all; the bottom line 
is that SPS works for the policies persisted in the NN.
Fancier policies can be built outside, as I think they don't need to persist 
in the NN, and I am not discussing that work today since it needs proper time 
to discuss (IMO, it is a non-trivial feature). Whatever fancy policies you 
define, they would likely map onto the same fundamental storage policies: HOT, 
WARM, and COLD. SPS works only for the policies you persist in the NN. *It 
does not intend to add any new policies on its own; it works under the 
NN-exposed HSM options.*
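
To make that concrete, the intended usage flow looks like the following (a sketch against the HDFS-10285 branch API, which was still under review at the time; {{satisfyStoragePolicy}} comes from that branch, and the path is hypothetical):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// SPS only satisfies policies already persisted in the NN; it defines none.
public class SatisfyExample {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    Path dir = new Path("/archive");    // hypothetical directory
    dfs.setStoragePolicy(dir, "COLD");  // persisted core policy (HOT/WARM/COLD)
    dfs.satisfyStoragePolicy(dir);      // ask SPS to move blocks to match
  }
}
{code}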

{quote}
I completely disagree, this is a design choice that was made. Nothing prevents 
this process from working outside the Namenode.
{quote}
I could probably say the same thing the other way around. :-) There is not 
much overhead from running it inside, either.

{quote}
In my humble opinion, if you have not resolved the big picture, then proceeding 
with things like optimizing lock is pointless
{quote}
Thanks for the respect for developers' time. Devs will take up commonly agreed 
feedback and work on it; we see no objection to the lock improvement. 
Currently, it is already the 47th task in HDFS-10285, titled *SPS in NN*. 
Right now, we are unable to work on things that others have not yet agreed to.

Keeping the arguments aside, from an SPS dev POV we would be happy to work on 
any commonly agreed feedback. As a community developer (with my experience), I 
would like to respect everyone's feedback.
Right now there is no common agreement on how we should start SPS. We should 
probably figure out a way to satisfy all the arguments; an online meeting 
would probably help us discuss those steps. Thanks







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-14 Thread Anu Engineer (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291461#comment-16291461 ]

Anu Engineer commented on HDFS-12911:
-

I feel like I am repeating myself, and it must be pretty painful for the 
others :), but I am going to answer each question since you asked.

bq. 1. With SPS disabled, there is absolutely no effect on the NN; I have 
posted benchmarks.
I consider this feature far from complete, and I am worried that further 
feature additions will hurt the Namenode more and more. Here is the bottom 
line: either you are building an HSM, which when fully featured will have 
different policies, or this is a one-off. Either way, I am against having this 
inside the Namenode.

bq. 2. The front Q filling up is not a real issue, as it will not consume too 
much memory in realistic scenarios. To increase processing speed, we have to 
tune DN bandwidths, not NN Qs. So, with throttling, processing memory will be 
constant.

*Realistic scenarios* - you would be surprised by what you find in the field. 
The notion of reasonable use goes out the window as soon as you support cases 
like "I accidentally ran format on my production cluster, can you please 
recover it?".

bq. 3. It will be easy to treat replication tasks as high-priority tasks if we 
are inside Hadoop, by controlling inline tasks via xmits.
We have seen issues with IBR handling, hence things like the lifeline 
protocol. So I am not willing to buy that xmits or any such control will work 
effectively when your Namenode is under severe load. You will end up sitting 
around trying to get data, or maintaining large work queues on which you 
cannot make any forward progress.


bq. 4. None of the tools like Balancer/DiskBalancer need to track or access 
the namespace; they just need utilization and DN block information. So, this 
is not exactly the same as those.
I completely disagree; this is a design choice that was made. Nothing prevents 
this process from working outside the Namenode.

bq. 5. We have not yet reached consensus on adding a new process.
Agreed, but we have no consensus to merge the code into the Namenode either.

bq. 6. We got comments from another reviewer about the lock and other issues; 
respecting their reviews, we are continuing this task. We can continue 
discussions in HDFS-10285 if there are further issues.
In my humble opinion, if you have not resolved the big picture, then 
proceeding with things like lock optimization is pointless. Unless you are 
sure you *will be able to merge* this into the Namenode, optimizing a lock 
path inside the Namenode is a waste of time.

And let me come back to the issue that does not seem to be discussed: we have 
an existing pattern that works well. The only argument I have heard so far is 
that management tools would have to do extra work if this feature is outside; 
I am sure management tools have to do extra work even if it is inside, and I 
have already shared my point of view about that. I am still in the camp that 
this process should run outside of the Namenode.



[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-14 Thread Uma Maheswara Rao G (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291410#comment-16291410 ]

Uma Maheswara Rao G commented on HDFS-12911:


[~anu],

> if we are pulling this out of NN, wouldn't this be a no-op?
If I understand the discussions correctly, not all participants are OK with 
pulling it out, considering the overheads compared to keeping it inside.
Considering the facts that:
1. With SPS disabled, there is absolutely no effect on the NN; I have posted 
benchmarks.
2. The front Q filling up is not a real issue, as it will not consume too much 
memory in realistic scenarios. To increase processing speed, we have to tune 
DN bandwidths, not NN Qs. So, with throttling, processing memory will be 
constant.
3. It will be easy to treat replication tasks as high-priority tasks if we are 
inside Hadoop, by controlling inline tasks via xmits.
4. None of the tools like Balancer/DiskBalancer need to track or access the 
namespace; they just need utilization and DN block information. So, this is 
not exactly the same as those.
5. We have not yet reached consensus on adding a new process.
6. We got comments from another reviewer about the lock and other issues; 
respecting their reviews, we are continuing this task. We can continue 
discussions in HDFS-10285 if there are further issues.







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-14 Thread Uma Maheswara Rao G (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291186#comment-16291186 ]

Uma Maheswara Rao G commented on HDFS-12911:


[~chris.douglas], Thank you for all the valuable suggestions. We are working 
on reducing the lock and on other optimizations. For the lock improvement, 
[~rakeshr] is almost ready with a patch; once it is tested, he will post it.

For the above memory optimizations, I am trying to profile.
Some updates:
For UMA.2: actually, we store the Xattrs in the Feature as a byte array. So, 
just making the created Xattr object static and reusing it will not give a 
space benefit, though it could reduce objects for GC. Another improvement 
could be to create a buffer pool and reuse it, since when a new Xattr is 
added, the existing feature is removed and added again. Shortening the Xattr 
key name, however, would occupy less space.

Overall, my local perf numbers show no significant growth because of SPS. For 
5 million queued-up SPS calls, it takes approximately 240MB (for the front Q 
and Xattrs). With a few of the above improvements, that could come down 
further. Also, IMO, 5 million directories could cover a whole cluster, as 
users will most likely call SPS on directories where a storage policy is set. 
Consumption from the front Q is throttled, so memory for that part of the 
processing will be constant.
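
For reference, a back-of-envelope check of that ~240MB figure (the per-entry split below is an assumption; the comment does not break it down):

{code:java}
public class SpsMemoryEstimate {
  public static void main(String[] args) {
    long calls = 5_000_000L;   // 5 million queued SPS calls
    long queueBytes = 8 + 8;   // obj ref + long value per tracked inode id
    long xattrBytes = 32;      // assumed per-inode xattr overhead
    long total = calls * (queueBytes + xattrBytes);
    // 5M * 48 bytes = 240,000,000 bytes, i.e. roughly 229 MiB
    System.out.println(total / (1024.0 * 1024.0) + " MiB");
  }
}
{code}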







[jira] [Commented] (HDFS-12911) [SPS]: Fix review comments from discussions in HDFS-10285

2017-12-08 Thread Anu Engineer (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284572#comment-16284572 ]

Anu Engineer commented on HDFS-12911:
-

bq. 1. The lock should not be kept while executing the placement policy.

if we are pulling this out of NN, wouldn't this be a no-op?




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org