[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-07 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157942#comment-16157942
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

hdfs-default.xml is missing the correct config parameter; the fix is available 
in HDFS-12404. 

> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-12357.001a.patch, HDFS-12357.001b.patch, 
> HDFS-12357.001.patch, HDFS-12357.002.patch, HDFS-12357.003.patch, 
> HDFS-12357.004.patch, HDFS-12357.005.patch, HDFS-12357.006.patch, 
> HDFS-12357.007.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is that when we run distcp from one cluster to another (or within
> the same cluster), we copy the metadata from source to target in addition to
> the file data. If an external attribute provider is enabled, the metadata may
> be read from the provider, so provider data read from the source may be saved
> to the target HDFS.
> We want to avoid saving metadata from the external provider to HDFS, so we
> want to bypass the external provider when doing the distcp (or hadoop fs -cp)
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202 and the
> other in HDFS-12294. The proposal here is the third one.
> The idea is to introduce a new config that specifies a special user (or a
> list of users), and to let the NN bypass the external provider when the
> current user is a special user.
> If we run applications that need data from the external attribute provider
> as the special user, they won't work. So the constraint of this approach is
> that the special users should not run applications that need data from the
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn],
> [~manojg] for the discussions in the other jiras.
> I'm creating this one to discuss further.
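
The idea in the description above can be sketched in a few lines of Java. This is 
only an illustration under stated assumptions: FileAttrs, ExternalAttrProvider, 
BypassSketch and the user name "distcp-user" are hypothetical names invented 
here, not classes or values from HDFS or from the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the attributes the NameNode reports for a file. */
final class FileAttrs {
  final String owner;
  final String permission;

  FileAttrs(String owner, String permission) {
    this.owner = owner;
    this.permission = permission;
  }
}

/** Hypothetical stand-in for an external attribute provider. */
interface ExternalAttrProvider {
  FileAttrs overlay(String path, FileAttrs stored);
}

/** Illustrative only: chooses between stored and provider-supplied attributes. */
final class BypassSketch {
  private final Set<String> bypassUsers;
  private final ExternalAttrProvider provider;

  BypassSketch(Set<String> bypassUsers, ExternalAttrProvider provider) {
    this.bypassUsers = bypassUsers;
    this.provider = provider;
  }

  /** Returns the attributes that the given caller would see for a path. */
  FileAttrs attrsFor(String user, String path, FileAttrs stored) {
    if (provider == null || bypassUsers.contains(user)) {
      // Special user (or no provider): report what HDFS itself has stored.
      return stored;
    }
    // Everyone else sees the provider's view of the metadata.
    return provider.overlay(path, stored);
  }

  public static void main(String[] args) {
    FileAttrs stored = new FileAttrs("alice", "rwxr-x---");
    ExternalAttrProvider provider =
        (path, s) -> new FileAttrs("provider-owner", "rwx------");
    BypassSketch nn = new BypassSketch(
        new HashSet<>(Arrays.asList("distcp-user")), provider);

    // The special user copies the stored metadata, not the provider's.
    System.out.println(nn.attrsFor("distcp-user", "/data/f", stored).owner);
    // Any other user still gets the provider's view.
    System.out.println(nn.attrsFor("bob", "/data/f", stored).owner);
  }
}
```

The sketch also shows the stated constraint: anything run as the special user 
never sees provider data, so that user should be reserved for copy jobs.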



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157706#comment-16157706
 ] 

Hudson commented on HDFS-12357:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12811 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12811/])
HDFS-12357. Let NameNode to bypass external attribute provider for (yzhang: rev 
d77ed238a911fc85d6f4bbce606cac7ec44f557f)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-07 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157353#comment-16157353
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Many thanks to [~asuresh], [~chris.douglas], [~daryn], [~manojg] and [~atm] for 
the discussion and review. 

I committed to trunk, branch-3.0 and branch-2.



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156654#comment-16156654
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 13s | trunk passed |
| +1 | compile | 0m 50s | trunk passed |
| +1 | checkstyle | 0m 45s | trunk passed |
| +1 | mvnsite | 1m 0s | trunk passed |
| +1 | findbugs | 1m 52s | trunk passed |
| +1 | javadoc | 0m 45s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 49s | the patch passed |
| +1 | javac | 0m 49s | the patch passed |
| -0 | checkstyle | 0m 44s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 466 unchanged - 0 fixed = 469 total (was 466) |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 0s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 118m 51s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 147m 51s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885747/HDFS-12357.007.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 514a3507bf4c 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b6e7d13 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/21033/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-07 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156615#comment-16156615
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

Thanks for the patch revision [~yzhangal]. LGTM, +1. 


[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156498#comment-16156498
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Thanks [~manojg] for the review.

Uploaded rev7 to address all comments except for "u4", which is already covered 
by a pre-existing test case, as I stated earlier. Good point about checking the 
permission in addition to checking the CALLED map in the test; I added that.




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-06 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155925#comment-16155925
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

Thanks for working on the patch revision [~yzhangal]. Overall looks good to me. 
+1, pending following nits.

1. {{FSDirectory.java#initUsersToBypassExtProvider}}
{noformat}
List<String> bpUserList = new ArrayList<String>();
for (int i = 0; i < bypassUsers.length; i++) {
  String tmp = bypassUsers[i].trim();
  if (!tmp.isEmpty()) {
    bpUserList.add(tmp);
  }
}
if (bpUserList.size() > 0) {
  usersToBypassExtAttrProvider = new HashSet<String>();
  for (String user : bpUserList) {
    LOG.info("Add user " + user + " to the list that will bypass external"
        + " attribute provider.");
    usersToBypassExtAttrProvider.add(user.trim());
  }
}
{noformat}

The above 2 for loops can be simplified to 1 loop. Checking for trim and adding 
to _usersToBypassExtAttrProvider_ can be done in the same block.

2. {{TestINodeAttributeProvider}}
{noformat}
String[] bypassUsers = {"u2", "u3"};
{noformat}
Can we please add "u4" to this list as well? Yes, since this user is not in the 
bypass list, the getFileStatus result is going to differ from that of the other 
users. Adding this non-bypass user will make the test complete.
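
For the first nit, a single-loop version might look roughly like the sketch 
below. It is an illustration, not the code from any revision of the patch: the 
class name BypassUserList and the method parse are made up here, logging is 
omitted, and returning null when no usable name is configured is an assumption 
that mirrors the quoted snippet, which only allocates the set when the list is 
non-empty.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only; not the code from the committed patch. */
final class BypassUserList {
  /**
   * Builds the bypass set in one pass: trim each entry, skip blanks, add the
   * rest. Returns null when no usable user name is configured (an assumption
   * made for this sketch).
   */
  static Set<String> parse(String[] configuredUsers) {
    Set<String> result = null;
    for (String raw : configuredUsers) {
      String user = raw.trim();
      if (user.isEmpty()) {
        continue; // blank entry, e.g. from a trailing comma in the config value
      }
      if (result == null) {
        result = new HashSet<>();
      }
      result.add(user);
    }
    return result;
  }

  public static void main(String[] args) {
    // Blank and empty entries are dropped; the names are trimmed.
    System.out.println(parse(new String[] {"u2", " u3 ", " ", ""}));
    // Nothing usable configured: no set is created.
    System.out.println(parse(new String[] {" "}));
  }
}
```

Trimming once and adding in the same block also makes the second trim() call 
in the quoted snippet unnecessary.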


[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155802#comment-16155802
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 50s | trunk passed |
| +1 | compile | 0m 58s | trunk passed |
| +1 | checkstyle | 0m 46s | trunk passed |
| +1 | mvnsite | 1m 3s | trunk passed |
| +1 | findbugs | 1m 56s | trunk passed |
| +1 | javadoc | 0m 44s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| -0 | checkstyle | 0m 44s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 466 unchanged - 0 fixed = 470 total (was 466) |
| +1 | mvnsite | 1m 1s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 5s | the patch passed |
| +1 | javadoc | 0m 43s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 98m 29s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 128m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure060 |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestFileAppendRestart |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885507/HDFS-12357.006.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154736#comment-16154736
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Thanks for the review [~manojg], good catches.

I uploaded rev6 to address all, except the third one, since we already have the 
test for it right before the test I added.

Would you please take another look? Thanks.




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-05 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154605#comment-16154605
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

Thanks for working on this [~yzhangal]. Thanks [~chris.douglas] for the 
valuable comments and the alternative proposals. Much appreciated.

My review comments for HDFS-12357.001b.patch.

1. {{FSDirectory.java}}
-- {{getUserFilteredAttributeProvider}} -- lines 401 to 407 can be simplified 
into a single if block
{noformat}
if (ugi == null || isUserBypassingExtAttrProvider(ugi.getUserName())) {
   return null;
}
{noformat}

-- {{initUsersToBypassExtProvider()}}: a user list like "a, b, " can trip the 
code into adding a null object to {{usersToBypassExtAttrProvider}}. We probably 
want to verify the trimmed user before adding it to the bypass list.

2. {{hdfs-default.xml}}
-- "..for whom the external attributes provider will be bypassed" - This config 
description could give more detail, such as what the bypass means for the 
user's operations: does it apply only to permission checking, or to other 
operations as well?

3. {{TestINodeAttributeProvider}}
-- Can you please add one non-bypassed user "u4" to the test list of users in 
line 239? Basically a true negative case.
-- Check style issues

4. Probably the patch can be renamed to HDFS-12357.006.patch so that any new 
reviewers looking at this jira can go straight to the latest patch instead of 
suffix versions in the older patch.
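
The two FSDirectory.java points above can be illustrated with a small 
standalone Java sketch. Everything named here is hypothetical 
(ProviderGuardSketch, AttrProvider, providerFor); a null user name stands in 
for the "ugi == null" case, and the demo assumes the list is split with plain 
String.split, which may not be how the patch obtains its array. Under that 
assumption, the trailing entry of "a, b, " is a blank string that trims to 
empty.

```java
import java.util.Set;

/** Illustrative only: the names below are hypothetical, not HDFS APIs. */
final class ProviderGuardSketch {
  /** Placeholder for whatever the NameNode would hand to the permission checker. */
  interface AttrProvider {}

  /**
   * Single guard, as suggested in the review: no caller identity, or a
   * bypassed caller, means "use no external provider" (null).
   */
  static AttrProvider providerFor(String userName, Set<String> bypassUsers,
      AttrProvider configured) {
    if (userName == null || bypassUsers.contains(userName)) {
      return null;
    }
    return configured;
  }

  public static void main(String[] args) {
    // A list such as "a, b, " splits into three entries; the last is a lone space.
    String[] parts = "a, b, ".split(",");
    System.out.println(parts.length);
    System.out.println(parts[2].trim().isEmpty());
  }
}
```

A non-bypassed user such as "u4" gets the configured provider back from 
providerFor, which is the true negative case requested for the test.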


[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154520#comment-16154520
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Many thanks [~chris.douglas] for the offline discussions! 

Good summary! I can create a jira for a more general solution, and we can work 
on it when a new use case comes up, such as filtering by path or by a 
combination of user and path. I think the new solution can be stacked on top of 
the fix in this jira.

I checked with folks around and we agreed that for the dedicated user we can 
bypass the external provider when doing permission checking. This means 
permission checking will use HDFS metadata. Comments from other folks are 
welcome.
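
A rough sketch of that decision, assuming the choice between the built-in HDFS permission check and the external provider's check is made per user. The {{PermissionCheck}} interface and the {{EnforcerSelector}} class are hypothetical stand-ins, not the real {{AccessControlEnforcer}} API, and the user names are made up:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a permission checker; it only reports where the
// metadata used for the check comes from.
interface PermissionCheck {
  String metadataSource();
}

final class EnforcerSelector {
  private final Set<String> bypassUsers;
  private final PermissionCheck hdfsCheck;
  private final PermissionCheck externalCheck;

  EnforcerSelector(Set<String> bypassUsers,
                   PermissionCheck hdfsCheck,
                   PermissionCheck externalCheck) {
    this.bypassUsers = bypassUsers;
    this.hdfsCheck = hdfsCheck;
    this.externalCheck = externalCheck;
  }

  // Bypass users are checked against HDFS metadata; everyone else goes
  // through the external provider.
  PermissionCheck forUser(String user) {
    return bypassUsers.contains(user) ? hdfsCheck : externalCheck;
  }
}

final class EnforcerSelectorDemo {
  public static void main(String[] args) {
    Set<String> bypass = new HashSet<>(Arrays.asList("distcp"));
    EnforcerSelector selector =
        new EnforcerSelector(bypass, () -> "hdfs", () -> "external");
    System.out.println("distcp -> " + selector.forUser("distcp").metadataSource());
    System.out.println("alice  -> " + selector.forUser("alice").metadataSource());
  }
}
```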

Thanks again!





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154337#comment-16154337
 ] 

Chris Douglas commented on HDFS-12357:
--

Had offline discussions with [~yzhangal]. We tried a version that would bypass 
not only the path component logic, but also add more generic filtering (by 
{{INodesInPath}} and {{NodeAttributes}}). Unfortunately, the API is not always 
invoked in contexts where this information is freely available.

Internally, the NameNode relies on null values for the 
{{INodeAttributeProvider}} and {{AccessControlEnforcer}}; it constructs some 
intermediate data to satisfy the plugin APIs. To extend v004/v005 to also avoid 
these costs would not be as straightforward as the invocation in 
{{FSDirectory}}. Fixing this across all providers, by pushing these conditions 
ahead of the call, is a more significant refactor with implications for 
existing implementations. [~yzhangal] cited experience in the field, where 
copying jobs cause NN failover. We don't have specific data implicating the 
costs we're avoiding here, but the more general solution has no willing 
implementors, so we can press forward with v001b.

Someone more familiar with external attribute providers should 
[verify|https://issues.apache.org/jira/browse/HDFS-12357?focusedCommentId=16151280=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16151280]
 the bypass of {{AccessControlEnforcer}} for the configured users.




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-02 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151587#comment-16151587
 ] 

Chris Douglas commented on HDFS-12357:
--

bq. if we can avoid duplicate of components, it would be great.
Are there any clusters that benefit from this optimization? The fraction of 
calls for distcp jobs, in clusters with an external attribute provider 
configured, run by a particular user is unlikely to be significant. This cost 
is currently incurred by _every_ call with an external attribute provider. 
Again, if this cost is significant, then it should be optimized across external 
attribute providers (moving this logic into {{INodeAttributeProvider}}).

The approach in v001\* is difficult to extend. For example, if another 
developer were to add a config knob that bypassed particular paths, then it 
would have to work around the logic in {{FSDirectory}}.
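
To illustrate the extensibility concern: if the bypass decision sits behind a small interface, a later path-based knob can be combined with the user check without changing the call site. The sketch below is hypothetical. {{BypassPolicy}} and its factory methods do not exist in HDFS, and the user and path values are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical policy abstraction; not part of HDFS.
@FunctionalInterface
interface BypassPolicy {
  boolean shouldBypass(String user, String path);

  // Combines two policies: bypass if either one says so.
  default BypassPolicy or(BypassPolicy other) {
    return (user, path) ->
        shouldBypass(user, path) || other.shouldBypass(user, path);
  }

  // Policy equivalent to the per-user knob discussed in this jira.
  static BypassPolicy forUsers(String... users) {
    Set<String> set = new HashSet<>(Arrays.asList(users));
    return (user, path) -> set.contains(user);
  }

  // A possible future knob: bypass by path prefix.
  static BypassPolicy forPathPrefix(String prefix) {
    return (user, path) -> path.startsWith(prefix);
  }
}

final class BypassPolicyDemo {
  public static void main(String[] args) {
    BypassPolicy policy =
        BypassPolicy.forUsers("distcp").or(BypassPolicy.forPathPrefix("/staging/"));
    System.out.println(policy.shouldBypass("distcp", "/data/a"));
    System.out.println(policy.shouldBypass("alice", "/staging/a"));
    System.out.println(policy.shouldBypass("alice", "/data/a"));
  }
}
```

With the check hard-coded at the call site, adding the path rule would mean editing that call site again. Here it is one more policy composed with {{or}}.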




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151577#comment-16151577
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 39s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 466 unchanged - 0 fixed = 471 total (was 466) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m  0s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}113m 48s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestPipelines |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
|   | hadoop.hdfs.TestListFilesInFileContext |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885094/HDFS-12357.001b.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux 3b184b55560f 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 275980b |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/20979/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151547#comment-16151547
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 39s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 466 unchanged - 0 fixed = 471 total (was 466) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 43s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m  7s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestAclsEndToEnd |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestPipelines |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885085/HDFS-12357.001a.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux dd60d533f99e 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 275980b |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151448#comment-16151448
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~manojg],

{quote}
Having UserFilterINodeAttributeProvider seems like a cleaner approach. Is it 
possible to examine the bypassUser config and skip the wrapper 
UserFilterINodeAttributeProvider if the user list is empty. Most of the times, 
the bypass user list is going to empty and we can totally skip the wrapper if 
so.
{quote}
Thanks for the good point here. Sorry, with so many updates today I missed the 
above one again.

If we move the code that loads the conf and checks isBypassUser into the 
{{FSDirectory}} class (as done in v001), we could skip the wrapper when the 
bypassUser list is empty. However, even when the list is not empty, it holds 
only one or two users, so the wrapper is still created for the many other users 
who are not in the list. Any further thoughts?
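
For reference, skipping the wrapper when the list is empty could look like the sketch below. The types are simplified, hypothetical stand-ins: the real {{INodeAttributeProvider}} takes path components and inode attributes and obtains the user from the caller context. Here the user is passed in directly to keep the example self-contained.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical, simplified provider interface; not the HDFS API.
interface AttrSource {
  String attributesFor(String user, String hdfsAttrs);
}

// Wrapper that returns the HDFS attributes untouched for bypass users.
final class UserFilterAttrSource implements AttrSource {
  private final AttrSource external;
  private final Set<String> bypassUsers;

  UserFilterAttrSource(AttrSource external, Set<String> bypassUsers) {
    this.external = external;
    this.bypassUsers = bypassUsers;
  }

  @Override
  public String attributesFor(String user, String hdfsAttrs) {
    return bypassUsers.contains(user)
        ? hdfsAttrs
        : external.attributesFor(user, hdfsAttrs);
  }
}

final class AttrSourceFactory {
  // Only pay for the wrapper when there is at least one bypass user.
  static AttrSource create(AttrSource external, Set<String> bypassUsers) {
    return bypassUsers.isEmpty()
        ? external
        : new UserFilterAttrSource(external, bypassUsers);
  }

  public static void main(String[] args) {
    AttrSource external = (user, hdfsAttrs) -> "external";
    AttrSource plain = create(external, Collections.<String>emptySet());
    AttrSource wrapped = create(external, Collections.singleton("distcp"));
    System.out.println("wrapper skipped: " + (plain == external));
    System.out.println("distcp sees: " + wrapped.attributesFor("distcp", "hdfs"));
  }
}
```

This only removes the cost for clusters with no bypass users configured. As noted above, once the list is non-empty every user goes through the wrapper.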

Hi [~chris.douglas],

Looking at the change I made in rev5 again, it saved the extra cost of 
{{components = Arrays.copyOfRange(components, 1, components.length);}}, but it 
introduced another extra cost: {{isBypassUser()}} is called twice. Once at
{code}
if (attributeProvider != null &&
    !attributeProvider.isBypassUser()) {
{code}
and again inside the wrapper implementation, which is reached through
{code}
nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
{code} 

After the first check finds a non-bypass user, the second one checks again. 
Unfortunately, this extra call happens for most users. It does not seem easy to 
avoid both extra costs with the wrapper approach.

The v001 implementation doesn't have either of these extra costs, but the 
wrapper class is certainly a better abstraction. I can go with either approach 
if agreed, and we can certainly keep improving the solution.
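
The double check can be made concrete with a small counting sketch. The class names are hypothetical and the user is passed explicitly instead of being looked up from the caller context. It only illustrates why a regular user pays for two lookups when both the call site and the wrapper test for bypass.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical wrapper that counts how often the bypass check runs.
final class CountingUserFilter {
  private final Set<String> bypassUsers;
  private int checkCount;

  CountingUserFilter(Set<String> bypassUsers) {
    this.bypassUsers = bypassUsers;
  }

  boolean isBypassUser(String user) {
    checkCount++;
    return bypassUsers.contains(user);
  }

  // The wrapper cannot assume its caller already checked, so it checks again.
  String getAttributes(String user, String hdfsAttrs) {
    return isBypassUser(user) ? hdfsAttrs : "external:" + hdfsAttrs;
  }

  int checkCount() {
    return checkCount;
  }
}

final class DoubleCheckCallSite {
  // Mirrors the shape of the rev5 call site: check first, then delegate.
  static String resolve(CountingUserFilter provider, String user, String hdfsAttrs) {
    if (provider != null && !provider.isBypassUser(user)) {
      return provider.getAttributes(user, hdfsAttrs);
    }
    return hdfsAttrs;
  }

  public static void main(String[] args) {
    CountingUserFilter provider =
        new CountingUserFilter(Collections.singleton("distcp"));
    resolve(provider, "alice", "hdfs");
    System.out.println("checks for a regular user: " + provider.checkCount());
  }
}
```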

Thanks a lot.









[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151396#comment-16151396
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 45s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 4 new + 411 unchanged - 0 fixed = 415 total (was 411) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 41s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 647 unchanged - 0 fixed = 651 total (was 647) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 57s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885045/HDFS-12357.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux b864616b7569 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7996eca |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/20976/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt |
| checkstyle | 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151362#comment-16151362
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

I uploaded rev005 to avoid the {{components = Arrays.copyOfRange(components, 1, 
components.length);}} overhead.

Basically I added a new package-scope API, {{boolean isBypassUser()}}, to the 
{{INodeAttributeProvider}} class, with a default implementation that returns 
false, and let {{UserFilterINodeAttributeProvider}} override it. Then do the 
following:
{code}
if (attributeProvider != null &&
    !attributeProvider.isBypassUser()) {
  // permission checking sends the full components array including the
  // first empty component for the root.  however file status
  // related calls are expected to strip out the root component according
  // to TestINodeAttributeProvider.
  byte[][] components = iip.getPathComponents();
  components = Arrays.copyOfRange(components, 1, components.length);
  nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
}
return nodeAttrs;
..
{code}
similar to the logic in v001. 

So there is a trade-off between not exposing the isBypassUser API and paying 
the overhead, versus exposing it and saving the cost.
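
A self-contained sketch of the default-plus-override shape described above. All class names here are hypothetical simplifications: strings stand in for the byte[][] components and the inode attributes, and the current user comes from a {{Supplier}} rather than from the caller context.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical, simplified base class; not the HDFS INodeAttributeProvider.
abstract class SketchAttributeProvider {
  // Default implementation: nobody bypasses the provider.
  boolean isBypassUser() {
    return false;
  }

  abstract String getAttributes(String[] components, String hdfsAttrs);
}

// Wrapper that overrides the default for the configured users.
final class SketchUserFilterProvider extends SketchAttributeProvider {
  private final SketchAttributeProvider delegate;
  private final Set<String> bypassUsers;
  private final Supplier<String> currentUser;

  SketchUserFilterProvider(SketchAttributeProvider delegate,
                           Set<String> bypassUsers,
                           Supplier<String> currentUser) {
    this.delegate = delegate;
    this.bypassUsers = bypassUsers;
    this.currentUser = currentUser;
  }

  @Override
  boolean isBypassUser() {
    return bypassUsers.contains(currentUser.get());
  }

  @Override
  String getAttributes(String[] components, String hdfsAttrs) {
    return delegate.getAttributes(components, hdfsAttrs);
  }
}

final class SketchCallSite {
  static String resolve(SketchAttributeProvider provider,
                        String[] fullComponents, String hdfsAttrs) {
    if (provider != null && !provider.isBypassUser()) {
      // The array copy that strips the root component is only paid for
      // when the provider is actually consulted.
      String[] components =
          Arrays.copyOfRange(fullComponents, 1, fullComponents.length);
      return provider.getAttributes(components, hdfsAttrs);
    }
    return hdfsAttrs;
  }

  public static void main(String[] args) {
    SketchAttributeProvider external = new SketchAttributeProvider() {
      @Override
      String getAttributes(String[] components, String hdfsAttrs) {
        return "external:" + components.length;
      }
    };
    SketchAttributeProvider filtered = new SketchUserFilterProvider(
        external, Collections.singleton("distcp"), () -> "distcp");
    System.out.println(resolve(filtered, new String[] {"", "a", "b"}, "hdfs"));
  }
}
```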

What do you think?

Thanks.






[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151361#comment-16151361
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 56s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 4 new + 411 unchanged - 0 fixed = 415 total (was 411) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 643 unchanged - 0 fixed = 647 total (was 643) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 22s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190 |
|   | hadoop.hdfs.TestEncryptedTransfer |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.hdfs.TestReadStripedFileWithDecoding |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | HDFS-12357 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12885032/HDFS-12357.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  xml  |
| uname | Linux 671cd6c01249 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151345#comment-16151345
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

With v004, the only remaining concern is the one raised in 

https://issues.apache.org/jira/browse/HDFS-12357?focusedCommentId=16151306=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16151306

If we can avoid {{components = Arrays.copyOfRange(components, 1, 
components.length);}}, that would be great, because the copy happens on every 
{{getAttributes}} call and is unnecessary when the caller is a bypass user.

Thanks.





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151344#comment-16151344
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

Thanks for the patch [~chris.douglas]. Having 
{{UserFilterINodeAttributeProvider}} seems like a cleaner approach. Is it 
possible to examine the {{bypassUser}} config and skip the wrapper 
{{UserFilterINodeAttributeProvider}} if the user list is empty? Most of the 
time, the bypass user list is going to be empty, and in that case we can skip 
the wrapper entirely. 

{noformat}
  void setINodeAttributeProvider(
      INodeAttributeProvider provider, Configuration conf) {
    attributeProvider = null == provider
        ? null
        : new UserFilterINodeAttributeProvider(provider, conf);
  }
{noformat}

[~yzhangal], I don't see the problem with {{getAccessControlEnforcer}}. But as 
you pointed out, if we can avoid duplicating {{components}}, that would be great. 
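
A minimal sketch of the suggestion above, wrapping the provider only when the bypass list is non-empty. The types here ({{AttrProvider}}, {{UserFilterSketch}}, {{ProviderSetupSketch}}) are hypothetical stand-ins, not the Hadoop classes from the patch, and the current user is supplied through a {{Supplier}} purely for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Stand-in for an attribute provider; not the real Hadoop INodeAttributeProvider.
interface AttrProvider {
  String getAttributes(String[] pathElements, String hdfsAttrs);
}

// Hypothetical filter: bypass users get the HDFS attributes back unchanged.
final class UserFilterSketch implements AttrProvider {
  private final AttrProvider delegate;
  private final Set<String> bypassUsers;
  private final Supplier<String> currentUser;

  UserFilterSketch(AttrProvider delegate, Set<String> bypassUsers,
      Supplier<String> currentUser) {
    this.delegate = delegate;
    this.bypassUsers = bypassUsers;
    this.currentUser = currentUser;
  }

  @Override
  public String getAttributes(String[] pathElements, String hdfsAttrs) {
    return bypassUsers.contains(currentUser.get())
        ? hdfsAttrs
        : delegate.getAttributes(pathElements, hdfsAttrs);
  }
}

final class ProviderSetupSketch {
  // Wrap only when at least one bypass user is configured.
  static AttrProvider wrapIfNeeded(AttrProvider provider,
      Set<String> bypassUsers, Supplier<String> currentUser) {
    if (provider == null) {
      return null;
    }
    if (bypassUsers.isEmpty()) {
      return provider; // common case: no wrapper, no per-call user check
    }
    return new UserFilterSketch(provider, bypassUsers, currentUser);
  }

  public static void main(String[] args) {
    AttrProvider external = (path, attrs) -> "external:" + attrs;
    Set<String> none = Collections.emptySet();
    Set<String> some = new HashSet<>(Arrays.asList("distcp"));

    // With an empty list the original provider is returned as-is.
    System.out.println(wrapIfNeeded(external, none, () -> "alice") == external);
    // With a configured bypass user the HDFS attributes pass through.
    System.out.println(wrapIfNeeded(external, some, () -> "distcp")
        .getAttributes(new String[] {"a"}, "hdfs"));
  }
}
```

With this shape, clusters that never configure a bypass user would pay no per-call cost for the feature, which seems to be the point of the suggestion.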




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151325#comment-16151325
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Ah, I overlooked the code you added in the new class:
{code}
  @Override
  public AccessControlEnforcer getExternalAccessControlEnforcer(
      AccessControlEnforcer defaultEnforcer) {
    return isBypassUser()
        ? defaultEnforcer
        : provider.getExternalAccessControlEnforcer(defaultEnforcer);
  }
{code}
So that actually addresses comment 2.b.

Thanks.
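
To illustrate why the caller does not need a new API in this design, here is a small self-contained sketch. All types ({{EnforcerSketch}}, {{EnforcerSource}}, {{BypassAwareSource}}, {{CheckerSketch}}) are simplified stand-ins invented for this example, and the current user is passed in as a {{Supplier}}, which is an assumption rather than how the patch obtains it.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

// Stand-ins for AccessControlEnforcer and the provider; not the real Hadoop types.
interface EnforcerSketch {
  String name();
}

interface EnforcerSource {
  EnforcerSketch getExternalAccessControlEnforcer(EnforcerSketch defaultEnforcer);
}

// Hypothetical wrapper: bypass users keep the default (HDFS) enforcer.
final class BypassAwareSource implements EnforcerSource {
  private final EnforcerSource delegate;
  private final Set<String> bypassUsers;
  private final Supplier<String> currentUser;

  BypassAwareSource(EnforcerSource delegate, Set<String> bypassUsers,
      Supplier<String> currentUser) {
    this.delegate = delegate;
    this.bypassUsers = bypassUsers;
    this.currentUser = currentUser;
  }

  private boolean isBypassUser() {
    return bypassUsers.contains(currentUser.get());
  }

  @Override
  public EnforcerSketch getExternalAccessControlEnforcer(
      EnforcerSketch defaultEnforcer) {
    return isBypassUser()
        ? defaultEnforcer
        : delegate.getExternalAccessControlEnforcer(defaultEnforcer);
  }
}

// Caller shaped like the getAccessControlEnforcer() method quoted in the thread.
final class CheckerSketch implements EnforcerSketch {
  private final EnforcerSource attributeProvider;

  CheckerSketch(EnforcerSource attributeProvider) {
    this.attributeProvider = attributeProvider;
  }

  @Override
  public String name() {
    return "default";
  }

  EnforcerSketch getAccessControlEnforcer() {
    return (attributeProvider != null)
        ? attributeProvider.getExternalAccessControlEnforcer(this) : this;
  }
}

final class EnforcerDemo {
  public static void main(String[] args) {
    EnforcerSketch external = () -> "external";
    String[] user = {"distcp"};
    CheckerSketch checker = new CheckerSketch(new BypassAwareSource(
        d -> external, Collections.singleton("distcp"), () -> user[0]));
    System.out.println(checker.getAccessControlEnforcer().name());
    user[0] = "alice";
    System.out.println(checker.getAccessControlEnforcer().name());
  }
}
```

The caller only ever sees an object with the provider's interface, so the bypass decision stays private to the wrapper.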





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151315#comment-16151315
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

Would you please revisit my comment 2.b? 

To implement the wrapper, we need either to add a new API to the provider base 
class that returns the real provider depending on whether the caller is a 
bypass user, or to add a new API that says whether the caller is a bypass user, 
and have the following method call this API:
{code}
  private AccessControlEnforcer getAccessControlEnforcer() {
    return (attributeProvider != null)
        ? attributeProvider.getExternalAccessControlEnforcer(this) : this;
  }
{code}
Adding this new API looks like an integration issue to me.

Thanks.





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151306#comment-16151306
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hm, I see that you do this:
{code}
  @Override
  public INodeAttributes getAttributes(
      String[] pathElements, INodeAttributes inode) {
    return isBypassUser()
        ? inode
        : provider.getAttributes(pathElements, inode);
  }
{code}
That is, you do not fetch the HDFS attributes again; instead, you return the 
attributes passed in by the caller.

However, 
{code}
  byte[][] components = iip.getPathComponents();
  components = Arrays.copyOfRange(components, 1, components.length);
{code}
the above code is avoided in v001 but is unavoidable in the wrapper 
implementation, even though {{components}} is not used for a bypass 
user. So this is wasted work.
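
For comparison, a rough sketch of the caller-side check described above, where the copy is skipped for a bypass user. The class and interface names are hypothetical, path components are modeled as plain byte arrays, and the {{bypassUser}} flag is assumed to come from some exposed isBypassUser-style check; none of this is taken from the actual patch.

```java
import java.util.Arrays;

final class CallerSideBypassSketch {
  // Stand-in for the external provider; not the real Hadoop interface.
  interface Provider {
    String getAttributes(byte[][] components, String hdfsAttrs);
  }

  static int copies = 0; // demo counter for copyOfRange calls

  // bypassUser is assumed to come from an exposed isBypassUser-style check.
  static String getAttributes(byte[][] pathComponents, String hdfsAttrs,
      Provider provider, boolean bypassUser) {
    if (provider == null || bypassUser) {
      return hdfsAttrs; // no copy, no provider call
    }
    // Drop the first component, as the quoted copyOfRange call does.
    byte[][] components =
        Arrays.copyOfRange(pathComponents, 1, pathComponents.length);
    copies++;
    return provider.getAttributes(components, hdfsAttrs);
  }

  public static void main(String[] args) {
    byte[][] path = {new byte[0], "dir".getBytes(), "file".getBytes()};
    Provider external = (c, a) -> "external(" + c.length + ")";
    System.out.println(getAttributes(path, "hdfs", external, true));
    System.out.println(getAttributes(path, "hdfs", external, false));
    System.out.println("copies=" + copies);
  }
}
```

The cost of this variant is that the caller has to know about bypass users, which is the API exposure the wrapper approach tries to avoid.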







[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151304#comment-16151304
 ] 

Chris Douglas commented on HDFS-12357:
--

The inner provider is not invoked. 
{{attributeProvider.getAttributes(components, nodeAttrs)}} checks for the user, 
and returns {{nodeAttrs}}.

If {{iip.getPathComponents()}} and the copy are a significant cost, which would 
be bad news for external attribute providers generally, then this could still 
be pushed down a level, out of {{FSDirectory}}.

I don't see why this is a significant difference.
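
One possible reading of "pushed down a level" is that the wrapper receives the full components array and strips the first component only when it actually delegates. The sketch below illustrates that idea under this assumption; {{PushDownSketch}} and its {{Inner}} interface are invented for the example and do not reflect the real provider contract.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

final class PushDownSketch {
  // Stand-in for the wrapped external provider; not the real Hadoop interface.
  interface Inner {
    String getAttributes(byte[][] components, String hdfsAttrs);
  }

  private final Inner inner;
  private final Set<String> bypassUsers;
  private final Supplier<String> currentUser;

  PushDownSketch(Inner inner, Set<String> bypassUsers,
      Supplier<String> currentUser) {
    this.inner = inner;
    this.bypassUsers = bypassUsers;
    this.currentUser = currentUser;
  }

  // Assumption: the caller passes the full array, first component included.
  String getAttributes(byte[][] fullComponents, String hdfsAttrs) {
    if (bypassUsers.contains(currentUser.get())) {
      return hdfsAttrs; // inner provider not invoked, nothing copied
    }
    byte[][] stripped =
        Arrays.copyOfRange(fullComponents, 1, fullComponents.length);
    return inner.getAttributes(stripped, hdfsAttrs);
  }

  public static void main(String[] args) {
    byte[][] full = {new byte[0], "dir".getBytes(), "file".getBytes()};
    String[] user = {"distcp"};
    PushDownSketch wrapper = new PushDownSketch(
        (c, a) -> "external(" + c.length + ")",
        Collections.singleton("distcp"), () -> user[0]);
    System.out.println(wrapper.getAttributes(full, "hdfs"));
    user[0] = "alice";
    System.out.println(wrapper.getAttributes(full, "hdfs"));
  }
}
```

This keeps the user check inside the wrapper while still avoiding the copy for bypass users, at the price of changing what the caller hands to the wrapper.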




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151290#comment-16151290
 ] 

Yongjun Zhang commented on HDFS-12357:
--

{quote}
As in the v001 version, this is avoided.
{quote}
Not really. In the code below, we first get the HDFS attributes with 
{{INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);}}. Then we get 
the external provider's attributes if needed.

In v001, for a special user there is no need to get the external provider's 
attributes, so we don't call {{nodeAttrs = 
attributeProvider.getAttributes(components, nodeAttrs);}}.

However, in the wrapper solution, we go into the {{if (attributeProvider 
!= null) {}} block and call it. If {{attributeProvider.getAttributes}} 
decides to bypass the external provider, it is going to do the same thing as 
{{INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);}} to get the 
HDFS version of the attributes. So we get the HDFS attributes twice. In v001, 
we only get them once.

{code}
  INodeAttributes getAttributes(INodesInPath iip)
      throws FileNotFoundException {
    INode node = FSDirectory.resolveLastINode(iip);
    int snapshot = iip.getPathSnapshotId();
    INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
    if (attributeProvider != null) {
      // permission checking sends the full components array including the
      // first empty component for the root.  however file status
      // related calls are expected to strip out the root component according
      // to TestINodeAttributeProvider.
      byte[][] components = iip.getPathComponents();
      components = Arrays.copyOfRange(components, 1, components.length);
      nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
    }
    return nodeAttrs;
  }
{code}





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151280#comment-16151280
 ] 

Chris Douglas commented on HDFS-12357:
--

v002 assumed that enforcement should always delegate to the provider, unlike 
v001 and v004 (which should be equivalent). I'm not familiar with how 
{{INodeAttributeProvider}} is used in practice, so I'll defer to you on the 
correct semantics.




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151272#comment-16151272
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

In patch rev1, I passed null as the attributeProvider in getPermissionChecker 
when the caller is a bypass user (2.b of my earlier comments), so the external 
provider is bypassed and we don't need the {{isBypassUser}} check in {{private 
AccessControlEnforcer getAccessControlEnforcer()}} that you asked about.

I think my rev1 covered all cases, but it's possible I missed something.

Thanks.
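
A small sketch of the rev1 idea as described here: the permission checker is constructed with a null provider for bypass users, so its existing null check already falls back to the default enforcer. {{NullProviderSketch}}, its nested types and the string-valued enforcer names are simplifications made up for this example, not code from the patch.

```java
import java.util.Collections;
import java.util.Set;

final class NullProviderSketch {
  // Stand-in for the external provider; not the real Hadoop interface.
  interface Provider {
    String externalEnforcer();
  }

  // Simplified checker: falls back to "default" when no provider is set.
  static final class Checker {
    private final Provider attributeProvider;

    Checker(Provider attributeProvider) {
      this.attributeProvider = attributeProvider;
    }

    String enforcer() {
      return (attributeProvider != null)
          ? attributeProvider.externalEnforcer() : "default";
    }
  }

  // Bypass users get a checker that never sees the external provider.
  static Checker getPermissionChecker(String user, Set<String> bypassUsers,
      Provider provider) {
    return new Checker(bypassUsers.contains(user) ? null : provider);
  }

  public static void main(String[] args) {
    Set<String> bypass = Collections.singleton("distcp");
    Provider external = () -> "external";
    System.out.println(getPermissionChecker("distcp", bypass, external).enforcer());
    System.out.println(getPermissionChecker("alice", bypass, external).enforcer());
  }
}
```

Because the decision is made once when the checker is created, no per-call user check is needed inside the checker itself.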
 





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151246#comment-16151246
 ] 

Chris Douglas commented on HDFS-12357:
--

I wasn't sure if {{INodeAttributeProvider::getExternalAccessControlEnforcer}} 
should have respected the user list, but from 
{{FSPermissionChecker::getAccessControlEnforcer}}:
{code:java}
  private AccessControlEnforcer getAccessControlEnforcer() {
    return (attributeProvider != null)
        ? attributeProvider.getExternalAccessControlEnforcer(this) : this;
  }
{code}
It looks like v001 should check {{isBypassUser}} and return the default if it 
matches, exactly like the other methods. Are there other cases that this needs 
to cover?




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151239#comment-16151239
 ] 

Chris Douglas commented on HDFS-12357:
--

bq. 1. The wrapper needs to create two provider objects, the default (HDFS) one 
and the external one, and switch between them. However, in the existing code, 
I don't see that the default provider object is always created
Sure, but if no external attribute provider is created, then the wrapper 
doesn't need to be created. What is the problem?

bq. 2a. \[...] The easiest way is to check whether the user is a special user 
and, if so, not ask for the provider's data at all. If we do this in a wrapper 
class, we always have to get some attributes, which may or may not be from 
HDFS. \[...]
As in the v001 version, this is avoided.

bq. 2b. Here we need to pass either null or the configured external 
attributeProvider to the permission checker. If we move this logic into the 
external provider, we need an API in this wrapper class that returns the 
external provider or null
Unless this is invoked in a separate thread, doesn't the same logic apply? If 
the provider is configured then it's invoked by {{FSPermissionChecker}}, if 
it's a filtered user then it doesn't consult the external attribute provider.

bq. My comments are largely about the integration, which is the key part that 
you did not address in the example patch. If you'd like, would you please take 
a look?
I'll take a second pass, but I don't intend to take over the patch...




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151189#comment-16151189
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

Sorry, when I made my earlier comments I had not seen your latest comment, or 
that you had even uploaded a revised patch. Thanks much for doing that. 

It seems my last comments still apply. My comments are largely about the 
integration, which is the key part that you did not address in the example 
patch. If you'd like, would you please take a look?

Thanks.





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151121#comment-16151121
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 1s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 4 new + 411 unchanged - 0 fixed = 415 total (was 411) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 466 unchanged - 0 fixed = 471 total (was 466) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 5s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 20s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Write to static field org.apache.hadoop.hdfs.server.namenode.FSDirectory.usersToBypassExtAttrProvider from instance method new org.apache.hadoop.hdfs.server.namenode.FSDirectory(FSNamesystem, Configuration)  At FSDirectory.java:from instance method new org.apache.hadoop.hdfs.server.namenode.FSDirectory(FSNamesystem, Configuration)  At FSDirectory.java:[line 371] |
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure130 |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030 |
| Timed out junit tests | 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151047#comment-16151047
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Thanks [~chris.douglas] and [~manojg].

Sorry for a lengthy reply here:

{quote}
Would a filter implementation wrapping the configured, external attribute 
provider suffice?
{quote}
The current patch implements this logic inline (like an inlined version of the 
wrapper class, in C++ terms). If we move this logic into a wrapper class, I can 
see some issues:

1. The wrapper needs to create two provider objects, the default (HDFS) one and 
the external one, and switch between them. However, in the existing code, I 
don't see that the default provider object is always created. See 2.a below.

2. Currently there are two places that decide whether to consult the external 
attribute provider:
2.a.
{code}
  INodeAttributes getAttributes(INodesInPath iip)
      throws FileNotFoundException {
    INode node = FSDirectory.resolveLastINode(iip);
    int snapshot = iip.getPathSnapshotId();
    INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
    if (attributeProvider != null) {
      // permission checking sends the full components array including the
      // first empty component for the root.  however file status
      // related calls are expected to strip out the root component according
      // to TestINodeAttributeProvider.
      byte[][] components = iip.getPathComponents();
      components = Arrays.copyOfRange(components, 1, components.length);
      nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
    }
    return nodeAttrs;
  }
{code}
Here we have already got the attributes from HDFS, and then decide whether to 
overwrite them with the provider's data. The easiest way is to check whether 
the user is a special user and, if so, not ask for the provider's data at all. 
If we do this in a wrapper class, we always have to get some attributes, which 
may or may not be from HDFS. It's not a clear implementation and may incur 
runtime cost.

2.b
{code}
  @VisibleForTesting
  FSPermissionChecker getPermissionChecker(String fsOwner, String superGroup,
      UserGroupInformation ugi) throws AccessControlException {
    return new FSPermissionChecker(
        fsOwner, superGroup, ugi, attributeProvider);
  }
{code}
Here we need to pass either null or the configured external attributeProvider 
to the permission checker. If we move this logic into the external provider, 
we need an API in this wrapper class that returns the external provider or 
null, and we pass its result as the "attributeProvider" parameter in the above 
code, like:
{code}
    return new FSPermissionChecker(
        fsOwner, superGroup, ugi, attributeProvider.getRealAttributeProvider());
{code}
We need to add this getRealAttributeProvider() API to the base provider class, 
which is a bit weird because this API is only meaningful in the wrapper layer.
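
To make the inline alternative concrete, here is a rough self-contained sketch 
of how the two call sites (2.a and 2.b) could share one user check on the 
NameNode side. The types are simplified stand-ins, not the real FSDirectory or 
provider classes; attributes are modelled as plain strings, the user is passed 
in explicitly instead of being looked up from the RPC context, and the names 
{{getProviderFor}} and {{usersToBypass}} are hypothetical:

```java
import java.util.Collections;
import java.util.Set;

/** Simplified stand-ins for illustration; these are not the real Hadoop types. */
final class InlineBypassSketch {

  /** Stand-in for an external attribute provider; attributes are plain strings. */
  interface AttributeProvider {
    String getAttributes(String path, String hdfsAttrs);
  }

  /** Stand-in for the relevant slice of FSDirectory. */
  static final class Directory {
    private final AttributeProvider attributeProvider; // null if none configured
    private final Set<String> usersToBypass;           // hypothetical user list

    Directory(AttributeProvider attributeProvider, Set<String> usersToBypass) {
      this.attributeProvider = attributeProvider;
      this.usersToBypass = usersToBypass;
    }

    boolean isBypassUser(String user) {
      return usersToBypass.contains(user);
    }

    /** 2.b: the provider (or null) that would be handed to the permission checker. */
    AttributeProvider getProviderFor(String user) {
      return isBypassUser(user) ? null : attributeProvider;
    }

    /** 2.a: only consult the provider when one applies to this user. */
    String getAttributes(String user, String path, String hdfsAttrs) {
      AttributeProvider provider = getProviderFor(user);
      if (provider == null) {
        return hdfsAttrs; // plain HDFS attributes, provider never called
      }
      return provider.getAttributes(path, hdfsAttrs);
    }
  }

  public static void main(String[] args) {
    AttributeProvider external = (path, hdfsAttrs) -> "ext:" + path;
    Directory dir = new Directory(external, Collections.singleton("distcp"));
    System.out.println(dir.getAttributes("distcp", "/a", "hdfs:/a"));
    System.out.println(dir.getAttributes("alice", "/a", "hdfs:/a"));
  }
}
```

With this shape no extra API is needed on the provider base class, since the 
decision is made before the provider is touched.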

Thoughts?

Thanks.



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150921#comment-16150921
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

Thanks for working on this [~yzhangal]. Thanks [~chris.douglas] for your review 
and comments.

I believe the motive here is to strictly not return any external provider 
attributes for certain users. Tools like distcp can call listFileStatus() as 
this special user to get plain/standalone HDFS attributes, which can then be 
_safely_ copied to a remote HDFS. We might not want tools like DistCp to copy 
external attributes to HDFS. 

Now, this knob/control for returning external attributes can be given either to 
HDFS or to the external provider. While having all the logic for returning the 
right set of attributes in a single place, like the provider, does sound like a 
very good idea, there is still a gap in the design. If I understand the problem 
correctly, the choice of whether to contact the external attribute provider or 
return the local default provider needs to be given to HDFS, so as to be 
totally sure that the right set of attributes is returned. This guarantee may 
not be established if the control is placed in the external provider. 


> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If we run applications as the special user that need data from external 
> attribute provider, then it won't work. So the constraint on this approach 
> is, the special users here should not run applications that need data from 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150890#comment-16150890
 ] 

Chris Douglas commented on HDFS-12357:
--

bq. we can implement the same logic in the provider. However, that means all 
different providers (sentry, ranger etc) need to be fixed accordingly
Would a filter implementation wrapping the configured, external attribute 
provider suffice?
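
As a rough illustration of what such a filter could look like, here is a 
self-contained sketch of a wrapper that delegates to a configured provider 
unless the current user is on a bypass list. The types are simplified 
stand-ins, not the real INodeAttributeProvider API; the class name 
{{BypassFilterProvider}} and the injected current-user supplier are 
assumptions made for this example:

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

/** Simplified stand-ins for illustration; these are not the real Hadoop types. */
final class FilteringProviderSketch {

  /** Stand-in for an attribute provider; attributes are plain strings. */
  interface AttributeProvider {
    String getAttributes(String path, String defaults);
  }

  /** Hypothetical filter that wraps any configured provider. */
  static final class BypassFilterProvider implements AttributeProvider {
    private final AttributeProvider delegate;
    private final Set<String> bypassUsers;
    private final Supplier<String> currentUser; // how the caller is resolved is assumed

    BypassFilterProvider(AttributeProvider delegate, Set<String> bypassUsers,
        Supplier<String> currentUser) {
      this.delegate = delegate;
      this.bypassUsers = bypassUsers;
      this.currentUser = currentUser;
    }

    @Override
    public String getAttributes(String path, String defaults) {
      // Return the attributes passed in, unmodified, for filtered users;
      // the wrapped provider is not consulted at all in that case.
      if (bypassUsers.contains(currentUser.get())) {
        return defaults;
      }
      return delegate.getAttributes(path, defaults);
    }
  }

  public static void main(String[] args) {
    String[] user = {"alice"};
    AttributeProvider external = (path, defaults) -> "ext:" + path;
    AttributeProvider filtered = new BypassFilterProvider(
        external, Collections.singleton("distcp"), () -> user[0]);
    System.out.println(filtered.getAttributes("/a", "hdfs:/a"));
    user[0] = "distcp";
    System.out.println(filtered.getAttributes("/a", "hdfs:/a"));
  }
}
```

Because the filter implements the same interface as the provider it wraps, 
existing providers would not need to change in this arrangement.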




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-09-01 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150819#comment-16150819
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~chris.douglas],

Thanks a lot for your comment.

Some thoughts:

- I assumed that the external attribute provider is not expected to have 
knowledge of the NameNode. Is this not the case? 
- I agree that if we call NameNode.getRemoteUser in the external provider, we 
can implement the same logic in the provider. However, that means all different 
providers (sentry, ranger etc) need to be fixed accordingly, otherwise we will 
get unexpected results. Is this what we want to do?
- The problem here is to decide whether to consult the external provider based 
on the user, not on a user/path combination. So it seems clearer to let the NN 
decide whether to consult the external provider. If we let the provider decide 
and there is a bug in the provider, we will get unexpected results.
- Operation-wise, changing every provider's implementation and updating 
clusters is more expensive. 

What do you think about these points?

Thanks.






[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-31 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149947#comment-16149947
 ] 

Chris Douglas commented on HDFS-12357:
--

Sorry to be dense, but why can't this live in the external attribute provider? 
{{NameNode::getRemoteUser}} is not only public, it's a 2-line method calling 
stable APIs. Given:
{code:java}
  INodeAttributes getAttributes(INodesInPath iip)
      throws FileNotFoundException {
    INode node = FSDirectory.resolveLastINode(iip);
    int snapshot = iip.getPathSnapshotId();
    INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
    if (attributeProvider != null) {
      // permission checking sends the full components array including the
      // first empty component for the root.  however file status
      // related calls are expected to strip out the root component according
      // to TestINodeAttributeProvider.
      byte[][] components = iip.getPathComponents();
      components = Arrays.copyOfRange(components, 1, components.length);
      nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
    }
    return nodeAttrs;
  }
{code}
can't {{attributeProvider}} return the formal {{nodeAttrs}} unmodified after 
performing the same logic as {{NameNode::getRemoteUser}}?




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-31 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149912#comment-16149912
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~atm], [~daryn] [~manojg], any comments/thoughts on my previous reply?

Hi [~asuresh] and [~chris.douglas], would appreciate if you guys could take a 
look at the patch too.

Thanks a lot.





[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-30 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148139#comment-16148139
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Thank you all for the review and comments!

[~atm]: good point, will remove the static. Thanks.
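
As a sketch of what removing the static could look like: the bypass list 
becomes a final instance field populated in the constructor, so each instance 
keeps the list parsed from its own configuration. The class below is a 
simplified stand-in (not FSDirectory), and taking a comma-separated string in 
the constructor is an assumption made for the example:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Simplified stand-in for illustration; this is not the real FSDirectory. */
final class BypassUsersHolder {

  // Instance field rather than static: one instance's configuration
  // cannot overwrite the list seen by another instance.
  private final Set<String> usersToBypassExtAttrProvider;

  /** The comma-separated format is assumed for this example. */
  BypassUsersHolder(String commaSeparatedUsers) {
    Set<String> users = new HashSet<>();
    for (String user : commaSeparatedUsers.split(",")) {
      String trimmed = user.trim();
      if (!trimmed.isEmpty()) {
        users.add(trimmed);
      }
    }
    this.usersToBypassExtAttrProvider = Collections.unmodifiableSet(users);
  }

  boolean isBypassUser(String user) {
    return usersToBypassExtAttrProvider.contains(user);
  }

  public static void main(String[] args) {
    BypassUsersHolder first = new BypassUsersHolder("distcp, backup");
    BypassUsersHolder second = new BypassUsersHolder("");
    // Creating the second holder does not change what the first one reports.
    System.out.println(first.isBypassUser("distcp"));
    System.out.println(second.isBypassUser("distcp"));
  }
}
```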

[~daryn], thanks for your comments, some thoughts:
1. Deciding which attributes to reveal based on user/path is indeed more 
refined. However, it adds complexity, and every provider has to provide an 
implementation. I wonder if you can provide an example where we want to decide 
things based on user/path?
2. Currently I use NameNode.getRemoteUser() to tell which user it is. If we put 
this bypass logic into the provider, the provider needs to know what the 
current user is. We either have to change the provider API or add some new 
methods in parallel.
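
One way to picture the "add some new methods in parallel" option from point 2
is a default method on the provider interface. This is only a sketch under
assumptions: AttrProviderSketch, LegacyProvider and UserAwareProvider are
hypothetical, heavily simplified types, not the real INodeAttributeProvider
API.

```java
import java.util.Set;

// Hypothetical, simplified provider interface for illustration only.
interface AttrProviderSketch {
    // Existing-style call: the provider sees the path and the HDFS-stored value.
    String getOwner(String path, String hdfsOwner);

    // User-aware variant added in parallel. The default delegates to the old
    // method, so existing implementations keep compiling and behave as before.
    default String getOwner(String path, String hdfsOwner, String remoteUser) {
        return getOwner(path, hdfsOwner);
    }
}

// An existing provider that knows nothing about the caller.
final class LegacyProvider implements AttrProviderSketch {
    @Override
    public String getOwner(String path, String hdfsOwner) {
        return "provider-owner";
    }
}

// A provider that opts in to the user-aware method and bypasses itself.
final class UserAwareProvider implements AttrProviderSketch {
    private final Set<String> bypassUsers;

    UserAwareProvider(Set<String> bypassUsers) {
        this.bypassUsers = bypassUsers;
    }

    @Override
    public String getOwner(String path, String hdfsOwner) {
        return "provider-owner";
    }

    @Override
    public String getOwner(String path, String hdfsOwner, String remoteUser) {
        return bypassUsers.contains(remoteUser) ? hdfsOwner : getOwner(path, hdfsOwner);
    }
}
```

The trade-off raised above is visible here: the caller must now pass the
remote user into the provider, and each provider that wants the bypass has to
implement it itself.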

[~manojg], regarding having SnapshotDiff bypass the provider: the caller needs 
to tell the provider to do that, so a new API is needed, right? Thanks.

Looking forward to your further thoughts and comments!

Thanks a lot.



> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-30 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148062#comment-16148062
 ] 

Manoj Govindassamy commented on HDFS-12357:
---

[~yzhangal],
  Here is another jira along similar lines: HDFS-12203, 
INodeAttributesProvider#getAttributes() support for default/passthrough mode.




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147258#comment-16147258
 ] 

Daryn Sharp commented on HDFS-12357:


Should this perhaps be implemented in the external attribute provider itself?  
Instead of an all-or-nothing approach, that would give the provider 
fine-grained authorization control over which combinations of users and paths 
expose "real" attrs.
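
As a rough illustration of what fine-grained control over users and paths
could look like inside a provider, here is a small sketch. PathUserRules and
its methods are hypothetical names, and matching on a path prefix is an
assumption made for the example; nothing here comes from an existing provider.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rule table; not taken from any real attribute provider.
final class PathUserRules {
    private static final class Rule {
        final String user;
        final String pathPrefix;

        Rule(String user, String pathPrefix) {
            this.user = user;
            this.pathPrefix = pathPrefix;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Registers that the given user may see "real" HDFS attrs under pathPrefix.
    PathUserRules allow(String user, String pathPrefix) {
        rules.add(new Rule(user, pathPrefix));
        return this;
    }

    // True when some rule matches both the user and the path.
    boolean exposeRealAttrs(String user, String path) {
        for (Rule rule : rules) {
            if (rule.user.equals(user) && isUnder(path, rule.pathPrefix)) {
                return true;
            }
        }
        return false;
    }

    // Matches whole path components, so "/data/warehouse2" is not treated as
    // being under "/data/warehouse".
    private static boolean isUnder(String path, String prefix) {
        if (prefix.equals("/")) {
            return path.startsWith("/");
        }
        return path.equals(prefix) || path.startsWith(prefix + "/");
    }
}
```

Compared with a single list of special users, a table like this lets the same
user see real attrs under one subtree and provider attrs elsewhere, at the
cost of extra logic in every provider.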

> Let NameNode to bypass external attribute provider for special user
> ---
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If we run applications as the special user that need data from external 
> attribute provider, then it won't work. So the constraint on this approach 
> is, the special users here should not run applications that need data from 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-29 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146393#comment-16146393
 ] 

Aaron T. Myers commented on HDFS-12357:
---

Took a quick look at the patch, not thorough. One thing jumped out at me on a 
cursory inspection - why make {{usersToBypassExtAttrProvider}} static? Probably 
won't cause any problems, but also doesn't seem necessary, and could 
potentially confuse the situation, e.g. in a unit test with multiple NNs.
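
The multi-NN concern can be shown with a small, self-contained sketch.
StaticConfigDir and InstanceConfigDir are hypothetical classes written for
this illustration; they only mimic the shape of a constructor that stores the
bypass-user list and are not FSDirectory.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical: the list lives in a static field written by the constructor.
final class StaticConfigDir {
    static Set<String> bypassUsers; // shared by every instance in the JVM

    StaticConfigDir(String... users) {
        bypassUsers = new HashSet<>(Arrays.asList(users));
    }

    boolean bypass(String user) {
        return bypassUsers.contains(user);
    }
}

// Hypothetical: the same list kept per instance instead.
final class InstanceConfigDir {
    private final Set<String> bypassUsers;

    InstanceConfigDir(String... users) {
        this.bypassUsers = new HashSet<>(Arrays.asList(users));
    }

    boolean bypass(String user) {
        return bypassUsers.contains(user);
    }
}
```

If two StaticConfigDir objects are created with different user lists in one
JVM, as a unit test with multiple NNs might do, the second constructor
replaces the list seen by the first. With InstanceConfigDir each object keeps
its own list.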




[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143514#comment-16143514
 ] 

Hadoop QA commented on HDFS-12357:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 13m 59s | trunk passed |
| +1 | compile | 0m 47s | trunk passed |
| +1 | checkstyle | 0m 43s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | findbugs | 1m 41s | trunk passed |
| +1 | javadoc | 0m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 44s | the patch passed |
| +1 | javac | 0m 44s | the patch passed |
| -0 | checkstyle | 0m 39s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 465 unchanged - 0 fixed = 470 total (was 465) |
| +1 | mvnsite | 0m 50s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| -1 | findbugs | 1m 48s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 37s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 90m 41s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 116m 40s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Write to static field org.apache.hadoop.hdfs.server.namenode.FSDirectory.usersToBypassExtAttrProvider from instance method new org.apache.hadoop.hdfs.server.namenode.FSDirectory(FSNamesystem, Configuration)  At FSDirectory.java:from instance method new org.apache.hadoop.hdfs.server.namenode.FSDirectory(FSNamesystem, Configuration)  At FSDirectory.java:[line 371] |
| Failed junit tests | hadoop.hdfs.TestLeaseRecoveryStriped |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure020 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure100 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure050 |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
\\
\\
|| 

[jira] [Commented] (HDFS-12357) Let NameNode to bypass external attribute provider for special user

2017-08-28 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143421#comment-16143421
 ] 

Yongjun Zhang commented on HDFS-12357:
--

Hi [~asuresh], [~chris.douglas], [~daryn], [~manojg] and other folks who are 
interested,

Posted a draft patch for further discussion.  If you could review and comment, 
it would be very much appreciated.

Thanks.


