[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-25 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0, 2.8.6
>
> Attachments: HADOOP-16350-branch-2.8.01.patch, 
> HADOOP-16350-branch-2.8.02.patch, HADOOP-16350.00.patch, 
> HADOOP-16350.01.patch, HADOOP-16350.02.patch, HADOOP-16350.03.patch, 
> HADOOP-16350.04.patch, HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-25 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Fix Version/s: 2.8.6

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0, 2.8.6
>
> Attachments: HADOOP-16350-branch-2.8.01.patch, 
> HADOOP-16350-branch-2.8.02.patch, HADOOP-16350.00.patch, 
> HADOOP-16350.01.patch, HADOOP-16350.02.patch, HADOOP-16350.03.patch, 
> HADOOP-16350.04.patch, HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 ERROR tools.DistCp: Exception encountered 
> 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-25 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Attachment: HADOOP-16350-branch-2.8.02.patch

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16350-branch-2.8.01.patch, 
> HADOOP-16350-branch-2.8.02.patch, HADOOP-16350.00.patch, 
> HADOOP-16350.01.patch, HADOOP-16350.02.patch, HADOOP-16350.03.patch, 
> HADOOP-16350.04.patch, HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 ERROR tools.DistCp: 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-25 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Attachment: (was: HADOOP-16350-branch-2.8.patch)

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16350-branch-2.8.01.patch, 
> HADOOP-16350-branch-2.8.02.patch, HADOOP-16350.00.patch, 
> HADOOP-16350.01.patch, HADOOP-16350.02.patch, HADOOP-16350.03.patch, 
> HADOOP-16350.04.patch, HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 ERROR tools.DistCp: 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-24 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Attachment: HADOOP-16350-branch-2.8.01.patch

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16350-branch-2.8.01.patch, 
> HADOOP-16350-branch-2.8.patch, HADOOP-16350.00.patch, HADOOP-16350.01.patch, 
> HADOOP-16350.02.patch, HADOOP-16350.03.patch, HADOOP-16350.04.patch, 
> HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 ERROR tools.DistCp: Exception 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-24 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-16350:

Attachment: HADOOP-16350-branch-2.8.patch

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16350-branch-2.8.patch, HADOOP-16350.00.patch, 
> HADOOP-16350.01.patch, HADOOP-16350.02.patch, HADOOP-16350.03.patch, 
> HADOOP-16350.04.patch, HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on ha-hdfs:tech
> 19/05/29 14:06:10 ERROR tools.DistCp: Exception encountered 
> java.io.IOException: 

[jira] [Updated] (HADOOP-16350) Ability to tell HDFS client not to request KMS Information from NameNode

2019-06-21 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HADOOP-16350:
---
Summary: Ability to tell HDFS client not to request KMS Information from 
NameNode  (was: Ability to tell Hadoop not to request KMS Information from 
NameNode)

> Ability to tell HDFS client not to request KMS Information from NameNode
> 
>
> Key: HADOOP-16350
> URL: https://issues.apache.org/jira/browse/HADOOP-16350
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, kms
>Affects Versions: 2.8.3, 3.0.0, 2.7.6, 3.1.2
>Reporter: Greg Senia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16350.00.patch, HADOOP-16350.01.patch, 
> HADOOP-16350.02.patch, HADOOP-16350.03.patch, HADOOP-16350.04.patch, 
> HADOOP-16350.05.patch
>
>
> Before HADOOP-14104 Remote KMSServer URIs were not requested from the remote 
> NameNode and their associated remote KMSServer delegation token. Many 
> customers were using this as a security feature to prevent TDE/Encryption 
> Zone data from being distcped to remote clusters. But there was still a use 
> case to allow distcp of data residing in folders that are not being encrypted 
> with a KMSProvider/Encrypted Zone.
> So after upgrading to a version of Hadoop that contained HADOOP-14104 distcp 
> now fails as we along with other customers (HDFS-13696) DO NOT allow 
> KMSServer endpoints to be exposed out of our cluster network as data residing 
> in these TDE/Zones contain very critical data that cannot be distcped between 
> clusters.
> I propose adding a new code block with the following custom property 
> "hadoop.security.kms.client.allow.remote.kms" it will default to "true" so 
> keeping current feature of HADOOP-14104 but if specified to "false" will 
> allow this area of code to operate as it did before HADOOP-14104. I can see 
> the value in HADOOP-14104 but the way Hadoop worked before this JIRA/Issue 
> should of at least had an option specified to allow Hadoop/KMS code to 
> operate similar to how it did before by not requesting remote KMSServer URIs 
> which would than attempt to get a delegation token even if not operating on 
> encrypted zones.
> Error when KMS Server traffic is not allowed between cluster networks per 
> enterprise security standard which cannot be changed they denied the request 
> for exception so the only solution is to allow a feature to not attempt to 
> request tokens. 
> {code:java}
> $ hadoop distcp -Ddfs.namenode.kerberos.principal.pattern=* 
> -Dmapreduce.job.hdfs-servers.token-renewal.exclude=tech 
> hdfs:///processed/public/opendata/samples/distcp_test/distcp_file.txt 
> hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt
> 19/05/29 14:06:09 INFO tools.DistCp: Input Options: DistCpOptions
> {atomicCommit=false, syncFolder=false, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
> fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=100, 
> sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], 
> preserveRawXattrs=false, atomicWorkPath=null, logPath=null, 
> sourceFileListing=null, 
> sourcePaths=[hdfs:/processed/public/opendata/samples/distcp_test/distcp_file.txt],
>  
> targetPath=hdfs://tech/processed/public/opendata/samples/distcp_test/distcp_file2.txt,
>  targetPathExists=true, filtersFile='null', verboseLog=false}
> 19/05/29 14:06:09 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 5093920 for gss2002 on ha-hdfs:unit
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> HDFS_DELEGATION_TOKEN, Service: ha-hdfs:unit, Ident: (HDFS_DELEGATION_TOKEN 
> token 5093920 for gss2002)
> 19/05/29 14:06:10 INFO security.TokenCache: Got dt for hdfs://unit; Kind: 
> kms-dt, Service: ha21d53en.unit.hdp.example.com:9292, Ident: (owner=gss2002, 
> renewer=yarn, realUser=, issueDate=1559153170120, maxDate=1559757970120, 
> sequenceNumber=237, masterKeyId=2)
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; 
> dirCnt = 0
> 19/05/29 14:06:10 INFO tools.SimpleCopyListing: Build file listing completed.
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO tools.DistCp: Number of paths in the copy list: 1
> 19/05/29 14:06:10 INFO client.AHSProxy: Connecting to Application History 
> server at ha21d53mn.unit.hdp.example.com/10.70.49.2:10200
> 19/05/29 14:06:10 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 556079 for gss2002 on