[jira] [Commented] (HADOOP-18610) ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842553#comment-17842553
 ] 

ASF GitHub Bot commented on HADOOP-18610:
-

anujmodi2021 opened a new pull request, #6787:
URL: https://github.com/apache/hadoop/pull/6787

   ### Description of PR
   Jira: https://issues.apache.org/jira/browse/HADOOP-18610 
   Code Ported from PR: https://github.com/apache/hadoop/pull/5953
   
   Add support for [Azure Active Directory (Azure AD) workload 
identities](https://learn.microsoft.com/en-us/azure/active-directory/workload-identities/workload-identities-overview),
 which integrate with Kubernetes's native capabilities to federate with any 
external identity provider.
   
   This PR is based on Haifeng Chen's patch attached to 
[HADOOP-18610](https://issues.apache.org/jira/browse/HADOOP-18610). I fixed a 
few typos and linter errors but did not modify the core functionality.
   
   ### How was this patch tested?
   A new ABFS OAuth test configuration was added for WorkloadIdentityTokenProvider. 
The complete test suite was run against Azure Blob Storage in the Central US region.
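   
   For reference, a minimal sketch of the kind of ABFS OAuth configuration such 
a test setup involves. The `fs.azure.account.auth.type` and 
`fs.azure.account.oauth.provider.type` keys are standard ABFS OAuth settings; 
the `WorkloadIdentityTokenProvider` class path is assumed from this patch and 
may differ in the merged code:
   
   ```java
   // Hypothetical sketch: wire the workload identity provider into ABFS OAuth.
   // The provider class name below is an assumption based on this patch.
   import org.apache.hadoop.conf.Configuration;

   public class AbfsWorkloadIdentityConfigSketch {
     public static Configuration configure() {
       Configuration conf = new Configuration();
       conf.set("fs.azure.account.auth.type", "OAuth");
       conf.set("fs.azure.account.oauth.provider.type",
           "org.apache.hadoop.fs.azurebfs.oauth2.WorkloadIdentityTokenProvider");
       // Tenant and client IDs normally arrive via the AZURE_* environment
       // variables injected by the workload identity webhook.
       return conf;
     }
   }
   ```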




> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> -
>
> Key: HADOOP-18610
> URL: https://issues.apache.org/jira/browse/HADOOP-18610
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.4
>Reporter: Haifeng Chen
>Assignee: Anuj Modi
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HADOOP-18610-preview.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity 
> with [Azure Active Directory (Azure AD) workload 
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
> (preview), which integrate with Kubernetes's native capabilities to 
> federate with any external identity provider. This approach is simpler to 
> use and deploy.
> Refer to 
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview] 
> and [https://azure.github.io/azure-workload-identity/docs/introduction.html] 
> for more details.
> The basic use scenario is to access Azure cloud resources (such as cloud 
> storage) from a Kubernetes (such as AKS) workload using an Azure managed 
> identity federated with a Kubernetes service account. The credential 
> environment variables projected into the pod by Azure AD workload identity 
> look like the following:
> AZURE_AUTHORITY_HOST: (Injected by the webhook, 
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, 
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT (JSON 
> Web Token) client assertion which we can use in a request to 
> AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId + 
> "/oauth2/v2.0/token") for an AD token that can be used to directly access 
> Azure cloud resources.
> This approach is common and similar across cloud providers such as AWS 
> and GCP; the Hadoop AWS integration has WebIdentityTokenCredentialProvider to 
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated 
> with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider 
> which handles the Azure Workload Identity case. For this, we need to add one 
> method (getTokenUsingJWTAssertion) to AzureADAuthenticator, to be used by 
> WorkloadIdentityTokenProvider.
>  
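
A minimal sketch of the token exchange described above, using only the JDK 
HTTP client. The scope value, class name, and flow details here are 
illustrative assumptions, not the actual Hadoop implementation:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WorkloadIdentityTokenSketch {
  // Exchange the webhook-projected JWT client assertion for an AD token.
  public static String fetchAadTokenJson() throws Exception {
    String authority = System.getenv("AZURE_AUTHORITY_HOST"); // e.g. https://login.microsoftonline.com/
    String tenantId = System.getenv("AZURE_TENANT_ID");
    String clientId = System.getenv("AZURE_CLIENT_ID");
    String assertion = Files.readString(
        Paths.get(System.getenv("AZURE_FEDERATED_TOKEN_FILE"))).trim();

    // client_credentials grant with a JWT client assertion, per the flow above.
    // The storage scope is an assumption for illustration.
    String body = "grant_type=client_credentials"
        + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
        + "&scope=" + URLEncoder.encode(
            "https://storage.azure.com/.default", StandardCharsets.UTF_8)
        + "&client_assertion_type=" + URLEncoder.encode(
            "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
            StandardCharsets.UTF_8)
        + "&client_assertion=" + URLEncoder.encode(assertion, StandardCharsets.UTF_8);

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(authority + tenantId + "/oauth2/v2.0/token"))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    return response.body(); // JSON containing "access_token".
  }
}
```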



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18610) ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842543#comment-17842543
 ] 

ASF GitHub Bot commented on HADOOP-18610:
-

snvijaya commented on PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#issuecomment-2087970204

   Thanks @creste. @anujmodi2021 will pick this up.
   As the PR is raised from a forked repo, we will not be able to make changes 
to the dev branch from which it was raised. We will re-raise the PR and 
address the comments.




> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> -
>
> Key: HADOOP-18610
> URL: https://issues.apache.org/jira/browse/HADOOP-18610
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.4
>Reporter: Haifeng Chen
>Assignee: Anuj Modi
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HADOOP-18610-preview.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity 
> with [Azure Active Directory (Azure AD) workload 
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
> (preview), which integrate with Kubernetes's native capabilities to 
> federate with any external identity provider. This approach is simpler to 
> use and deploy.
> Refer to 
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview] 
> and [https://azure.github.io/azure-workload-identity/docs/introduction.html] 
> for more details.
> The basic use scenario is to access Azure cloud resources (such as cloud 
> storage) from a Kubernetes (such as AKS) workload using an Azure managed 
> identity federated with a Kubernetes service account. The credential 
> environment variables projected into the pod by Azure AD workload identity 
> look like the following:
> AZURE_AUTHORITY_HOST: (Injected by the webhook, 
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, 
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT (JSON 
> Web Token) client assertion which we can use in a request to 
> AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId + 
> "/oauth2/v2.0/token") for an AD token that can be used to directly access 
> Azure cloud resources.
> This approach is common and similar across cloud providers such as AWS 
> and GCP; the Hadoop AWS integration has WebIdentityTokenCredentialProvider to 
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated 
> with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider 
> which handles the Azure Workload Identity case. For this, we need to add one 
> method (getTokenUsingJWTAssertion) to AzureADAuthenticator, to be used by 
> WorkloadIdentityTokenProvider.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18610. ABFS OAuth2 Token Provider support for Azure Workload Identity [hadoop]

2024-04-30 Thread via GitHub


snvijaya commented on PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#issuecomment-2087970204

   Thanks @creste. @anujmodi2021 will pick this up.
   As the PR is raised from a forked repo, we will not be able to make changes 
to the dev branch from which it was raised. We will re-raise the PR and 
address the comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18610) ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS

2024-04-30 Thread Sneha Vijayarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sneha Vijayarajan reassigned HADOOP-18610:
--

Assignee: Anuj Modi

> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> -
>
> Key: HADOOP-18610
> URL: https://issues.apache.org/jira/browse/HADOOP-18610
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.4
>Reporter: Haifeng Chen
>Assignee: Anuj Modi
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HADOOP-18610-preview.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity 
> with [Azure Active Directory (Azure AD) workload 
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
> (preview), which integrate with Kubernetes's native capabilities to 
> federate with any external identity provider. This approach is simpler to 
> use and deploy.
> Refer to 
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview] 
> and [https://azure.github.io/azure-workload-identity/docs/introduction.html] 
> for more details.
> The basic use scenario is to access Azure cloud resources (such as cloud 
> storage) from a Kubernetes (such as AKS) workload using an Azure managed 
> identity federated with a Kubernetes service account. The credential 
> environment variables projected into the pod by Azure AD workload identity 
> look like the following:
> AZURE_AUTHORITY_HOST: (Injected by the webhook, 
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, 
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT (JSON 
> Web Token) client assertion which we can use in a request to 
> AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId + 
> "/oauth2/v2.0/token") for an AD token that can be used to directly access 
> Azure cloud resources.
> This approach is common and similar across cloud providers such as AWS 
> and GCP; the Hadoop AWS integration has WebIdentityTokenCredentialProvider to 
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated 
> with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider 
> which handles the Azure Workload Identity case. For this, we need to add one 
> method (getTokenUsingJWTAssertion) to AzureADAuthenticator, to be used by 
> WorkloadIdentityTokenProvider.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] MAPREDUCE-7474. Improve Manifest committer resilience [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#issuecomment-2086367792

   I've now moved to commitFile() to rename the task manifest, after doing a 
getFileStatus() call first... which means its IO cost is the same as a rename 
with recovery enabled. It does let us see what happened, which we log at WARN. 
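   
   A rough sketch of that probe-then-rename pattern with plain FileSystem 
calls; the committer's actual commitFile() path differs, this only illustrates 
the IO cost and the WARN diagnostics:
   
   ```java
   import java.io.IOException;
   
   import org.apache.hadoop.fs.FileStatus;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;
   
   public class RenameWithDiagnosticsSketch {
     private static final Logger LOG =
         LoggerFactory.getLogger(RenameWithDiagnosticsSketch.class);
   
     static void renameManifest(FileSystem fs, Path temp, Path dest)
         throws IOException {
       // One extra HEAD up front: same IO cost as rename-with-recovery,
       // but it records what was there if the rename then fails.
       FileStatus status = fs.getFileStatus(temp);
       if (!fs.rename(temp, dest)) {
         LOG.warn("Failed to rename {} ({} bytes) to {}",
             temp, status.getLen(), dest);
       }
     }
   }
   ```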


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19156) ZooKeeper based state stores use different ZK address configs

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842452#comment-17842452
 ] 

ASF GitHub Bot commented on HADOOP-19156:
-

hadoop-yetus commented on PR #6767:
URL: https://github.com/apache/hadoop/pull/6767#issuecomment-2086063685

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 12 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  37m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  17m 52s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 48s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   5m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |  11m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  40m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 27s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 35s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 242 unchanged - 0 fixed = 243 total (was 
242)  |
   | +1 :green_heart: |  mvnsite  |   6m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   5m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |  12m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 40s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  21m  5s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   1m  9s |  |  hadoop-yarn-api in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   5m 23s |  |  hadoop-yarn-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   4m 12s |  |  hadoop-yarn-server-common in 
the patch passed.  |
   | -1 :x: |  unit  | 109m 16s | 
[/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt)
 |  hadoop-yarn-server-resourcemanager in the patch failed.  |
   | +1 :green_heart: |  unit  |  34m 14s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 463m  0s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6767 |
   | Optional Tests | dupname asflicense compile javac javadoc 

Re: [PR] HADOOP-19156. ZooKeeper based state stores use different ZK address c… [hadoop]

2024-04-30 Thread via GitHub


hadoop-yetus commented on PR #6767:
URL: https://github.com/apache/hadoop/pull/6767#issuecomment-2086063685

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 12 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  37m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  17m 52s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 48s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   5m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |  11m 39s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  40m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 27s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 35s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 242 unchanged - 0 fixed = 243 total (was 
242)  |
   | +1 :green_heart: |  mvnsite  |   6m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   5m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |  12m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 40s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  21m  5s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   1m  9s |  |  hadoop-yarn-api in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   5m 23s |  |  hadoop-yarn-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   4m 12s |  |  hadoop-yarn-server-common in 
the patch passed.  |
   | -1 :x: |  unit  | 109m 16s | 
[/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt)
 |  hadoop-yarn-server-resourcemanager in the patch failed.  |
   | +1 :green_heart: |  unit  |  34m 14s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 463m  0s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6767/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6767 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 465519d1596a 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | 

Re: [PR] MAPREDUCE-7474. Improve Manifest committer resilience [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on code in PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#discussion_r1584877464


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md:
##
@@ -523,7 +509,7 @@ And optional settings for debugging/performance analysis
 
 ```
 spark.hadoop.mapreduce.outputcommitter.factory.scheme.abfs 
org.apache.hadoop.fs.azurebfs.commit.AzureManifestCommitterFactory
-spark.hadoop.fs.azure.io.rate.limit 1
+spark.hadoop.fs.azure.io.rate.limit 1000

Review Comment:
   Good question. The default allocation for the entire cluster is 10K; this 
ensures that a single commit only uses at most 10% of it (1,000 of 10,000 
permits) during renames. In contrast, S3 throttles by shard, so only 
operations down the same directory tree were impacted, and if they were only 
reading data, they were not throttled at all.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] MAPREDUCE-7474. Improve Manifest committer resilience [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on code in PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#discussion_r1584841337


##
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/stages/AbstractJobOrTaskStage.java:
##
@@ -582,19 +611,111 @@ protected final Path directoryMustExist(
* Save a task manifest or summary. This will be done by
* writing to a temp path and then renaming.
* If the destination path exists: Delete it.
+   * This will retry so that a rename failure from abfs load or IO errors
+   * will not fail the task.
* @param manifestData the manifest/success file
* @param tempPath temp path for the initial save
* @param finalPath final path for rename.
-   * @throws IOException failure to load/parse
+   * @return the manifest saved.
+   * @throws IOException failure to rename after retries.
*/
   @SuppressWarnings("unchecked")
-  protected final <T extends AbstractManifestData> void save(T manifestData,
+  protected final <T extends AbstractManifestData> T save(
+  final T manifestData,
   final Path tempPath,
   final Path finalPath) throws IOException {
-LOG.trace("{}: save('{}, {}, {}')", getName(), manifestData, tempPath, 
finalPath);
-trackDurationOfInvocation(getIOStatistics(), OP_SAVE_TASK_MANIFEST, () ->
-operations.save(manifestData, tempPath, true));
-renameFile(tempPath, finalPath);
+return saveManifest(() -> manifestData, tempPath, finalPath, 
OP_SAVE_TASK_MANIFEST);
+  }
+
+  /**
+   * Generate and save a task manifest or summary file.
+   * This is done by writing to a temp path and then renaming.
+   * 
+   * If the destination path exists: Delete it before the rename.
+   * 
+   * This will retry so that a rename failure from abfs load or IO errors
+   * such as delete or save failure will not fail the task.
+   * 
+   * The {@code manifestSource} supplier is invoked to get the manifest data
+   * on every attempt.
+   * This permits statistics to be updated, including those of failures.
+   * @param manifestSource supplier of the manifest/success file
+   * @param tempPath temp path for the initial save
+   * @param finalPath final path for rename.
+   * @param statistic statistic to use for timing
+   * @return the manifest saved.
+   * @throws IOException failure to save/delete/rename after retries.
+   */
+  @SuppressWarnings("unchecked")
+  protected final <T extends AbstractManifestData> T saveManifest(
+  final Supplier<T> manifestSource,
+  final Path tempPath,
+  final Path finalPath,
+  String statistic) throws IOException {
+
+AtomicInteger retryCount = new AtomicInteger(0);
+RetryPolicy retryPolicy = retryUpToMaximumCountWithProportionalSleep(
+getStageConfig().getManifestSaveAttempts(),
+SAVE_SLEEP_INTERVAL,
+TimeUnit.MILLISECONDS);
+
+// loop until returning a value or raising an exception
+while (true) {
+  try {
+T manifestData = requireNonNull(manifestSource.get());
+trackDurationOfInvocation(getIOStatistics(), statistic, () -> {
+LOG.info("{}: save manifest to {} then rename as {}; retry count={}",
+  getName(), tempPath, finalPath, retryCount);
+
+  // delete temp path.
+  // even though this is written with overwrite=true, this extra 
recursive
+  // delete also handles a directory being there.
+  deleteRecursive(tempPath, OP_DELETE);
+
+  // save the temp file, overwriting any which remains from an earlier 
attempt
+  operations.save(manifestData, tempPath, true);
+
+  // delete the destination in case it exists either from a failed 
previous
+  // attempt or from a concurrent task commit.
+  delete(finalPath, true, OP_DELETE);
+
+  // rename temp to final
+  renameFile(tempPath, finalPath);

Review Comment:
   You mean use commitFile() after creating a file entry, so pushing more of 
the recovery down? We could do that. We won't have the etag of the created 
file, though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18679) Add API for bulk/paged object deletion

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842385#comment-17842385
 ] 

ASF GitHub Bot commented on HADOOP-18679:
-

steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1584787287


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,136 @@
+
+
+#  interface `BulkDelete`
+
+ Add API for bulk/paged object deletion
> --
>
> Key: HADOOP-18679
> URL: https://issues.apache.org/jira/browse/HADOOP-18679
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> Iceberg and HBase could benefit from being able to give a list of individual 
> files to delete: files which may be scattered around the bucket, for better 
> read performance. 
> Add a new optional interface for an object store which allows a caller to 
> submit a list of paths to files to delete, where the expectation is:
> * if a path is a file: delete
> * if a path is a dir: outcome undefined
> For S3, that'd let us build these into DeleteRequest objects and submit them 
> without any probes first.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1584787287


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,136 @@
+
+
+#  interface `BulkDelete`
+
+
+
+The `BulkDelete` interface provides an API to perform bulk delete of 
files/objects
+in an object store or filesystem.
+
+## Key Features
+
+* An API for submitting a list of paths to delete.
+* This list must be no larger than the "page size" supported by the client; 
this size is also exposed as a method.
+* Triggers a request to delete files at the specific paths.
+* Returns a list of which paths were reported as delete failures by the store.
+* Does not consider a nonexistent file to be a failure.
+* Does not offer any atomicity guarantees.
+* Idempotency guarantees are weak: retries may delete files newly created by 
other clients.
+* Provides no guarantees as to the outcome if a path references a directory.
+* Provides no guarantees that parent directories will exist after the call.
+
+
+The API is designed to match the semantics of the AWS S3 [Bulk 
Delete](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) 
REST API call, but it is not
+exclusively restricted to this store. This is why the "provides no guarantees"
+restrictions do not state what the outcome will be when executed on other 
stores.
+
+### Interface `org.apache.hadoop.fs.BulkDeleteSource`
+
+The interface `BulkDeleteSource` is offered by a FileSystem/FileContext class 
if
+it supports the API.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDeleteSource {
+  BulkDelete createBulkDelete(Path path)
+      throws UnsupportedOperationException, IllegalArgumentException, IOException;
+
+}
+
+```
+
+### Interface `org.apache.hadoop.fs.BulkDelete`
+
+This is the bulk delete implementation returned by the `createBulkDelete()` 
call.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+  int pageSize();
+  Path basePath();
+  List<Map.Entry<Path, String>> bulkDelete(List<Path> paths)
+      throws IOException, IllegalArgumentException;
+
+}
+
+```
+
+### `bulkDelete(paths)`
+
+ Preconditions
+
+```python
+if length(paths) > pageSize: throw IllegalArgumentException
+```
+
+ Postconditions
+
+All paths which refer to files are removed from the set of files.
+```python
+FS'.Files = FS.Files - [paths]
+```
+
+No other restrictions are placed upon the outcome.
+
+
+### Availability
+
+The `BulkDeleteSource` interface is exported by `FileSystem` and `FileContext` 
storage clients
+and is available for all FS via 
`org.apache.hadoop.fs.DefaultBulkDeleteSource`. For the
+ICEBERG integration to work seamlessly, all FS which support delete() MUST 
leave the

Review Comment:
   say "for integration in applications like Apache Iceberg", all 
implementations of this interface MUST NOT reject the request but instead 
return a BulkDelete instance of size >= 1"
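   
   To make the page-size and failure-reporting contract concrete, a 
hypothetical usage sketch against the interfaces specified above (the cast 
assumes the FileSystem implements `BulkDeleteSource`):
   
   ```java
   import java.util.List;
   import java.util.Map;
   
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.BulkDelete;
   import org.apache.hadoop.fs.BulkDeleteSource;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   
   public class BulkDeleteUsageSketch {
     // Delete one page of files; returns (path, error) entries for failures.
     static List<Map.Entry<Path, String>> deletePage(
         Configuration conf, Path base, List<Path> paths) throws Exception {
       FileSystem fs = FileSystem.get(base.toUri(), conf);
       try (BulkDelete bulkDelete =
           ((BulkDeleteSource) fs).createBulkDelete(base)) {
         // Precondition from the spec: a page must not exceed pageSize().
         if (paths.size() > bulkDelete.pageSize()) {
           throw new IllegalArgumentException("too many paths: " + paths.size());
         }
         return bulkDelete.bulkDelete(paths);
       }
     }
   }
   ```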



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -0,0 +1,109 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Objects.requireNonNull;
+import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
+
+/**
+ * Default implementation of the {@link BulkDelete} interface.
+ */
+public class DefaultBulkDeleteOperation implements BulkDelete {
+
+private static Logger LOG = 
LoggerFactory.getLogger(DefaultBulkDeleteOperation.class);
+
+/** Default page size for bulk delete. */
+private static final int DEFAULT_PAGE_SIZE = 1;
+
+/** Base path for the bulk delete operation. */
+private final Path basePath;
+
+/** Delegate file system that makes the actual delete calls. */
+private final 

[jira] [Commented] (HADOOP-18516) [ABFS]: Support fixed SAS token config in addition to Custom SASTokenProvider Implementation

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842376#comment-17842376
 ] 

ASF GitHub Bot commented on HADOOP-18516:
-

hadoop-yetus commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085299632

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 7 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  44m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 33s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  33m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  33m 52s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 20s | 
[/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6552/14/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt)
 |  hadoop-tools/hadoop-azure: The patch generated 1 new + 9 unchanged - 0 
fixed = 10 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  33m 24s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 10s |  |  hadoop-azure in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 129m  7s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6552/14/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6552 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
|
   | uname | Linux 0058a3fedccc 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5db5372240aeb83bd48ab3e774273e507820cdf2 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 

Re: [PR] HADOOP-18516: [ABFS][Authentication] Support Fixed SAS Token for ABFS Authentication [hadoop]

2024-04-30 Thread via GitHub


hadoop-yetus commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085299632

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 7 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  44m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 33s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  33m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  33m 52s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 20s | 
[/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6552/14/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt)
 |  hadoop-tools/hadoop-azure: The patch generated 1 new + 9 unchanged - 0 
fixed = 10 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  33m 24s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 10s |  |  hadoop-azure in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 129m  7s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6552/14/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6552 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
|
   | uname | Linux 0058a3fedccc 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5db5372240aeb83bd48ab3e774273e507820cdf2 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6552/14/testReport/ |
   | Max. process+thread count | 562 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
   | Console output | 

[jira] [Commented] (HADOOP-19152) Do not hard code security providers.

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842371#comment-17842371
 ] 

ASF GitHub Bot commented on HADOOP-19152:
-

steveloughran commented on code in PR #6739:
URL: https://github.com/apache/hadoop/pull/6739#discussion_r1584770438


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoUtils.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.crypto;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.bouncycastle.jce.provider.BouncyCastleProvider;
+import org.junit.Assert;
+import org.junit.Test;
+import org.slf4j.event.Level;
+
+import java.security.Provider;
+import java.security.Security;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_DEFAULT;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY;
+
+/** Test {@link CryptoUtils}. */
+public class TestCryptoUtils {
+  static {
+GenericTestUtils.setLogLevel(CryptoUtils.LOG, Level.TRACE);
+  }
+
+  @Test(timeout = 1_000)
+  public void testProviderName() {
+Assert.assertEquals(CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME, 
BouncyCastleProvider.PROVIDER_NAME);
+  }
+
+  static void assertRemoveProvider() {
+Security.removeProvider(BouncyCastleProvider.PROVIDER_NAME);
+
Assert.assertNull(Security.getProvider(BouncyCastleProvider.PROVIDER_NAME));
+  }
+
+  static void assertSetProvider(Configuration conf) {
+conf.set(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, 
CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME);
+final String providerFromConf = CryptoUtils.getJceProvider(conf);
+Assert.assertEquals(CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME, 
providerFromConf);
+  }
+
+  @Test(timeout = 5_000)
+  public void testAutoAddDisabled() {
+assertRemoveProvider();
+
+final Configuration conf = new Configuration();
+conf.setBoolean(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY, false);
+
+assertSetProvider(conf);
+
+
Assert.assertNull(Security.getProvider(BouncyCastleProvider.PROVIDER_NAME));
+  }
+
+  @Test(timeout = 5_000)
+  public void testAutoAddEnabled() {
+assertRemoveProvider();
+
+final Configuration conf = new Configuration();
+Assert.assertTrue("true".equalsIgnoreCase(
+conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY)));

Review Comment:
   For all the asserts other than assertNull and assertEquals, which generate 
error messages automatically, we do need useful error messages so that when a 
test run fails we can debug it.
   
   This is where AssertJ assertions beat JUnit.
   
   ```
   Assertions.assertThat(conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY))
       .describedAs("security provider from configuration")
       .isEqualToIgnoringCase("true");
   ```
   



##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoUtils.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.crypto;
+
+import org.apache.hadoop.conf.Configuration;
+import 

Re: [PR] HADOOP-19152. Do not hard code security providers. [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on code in PR #6739:
URL: https://github.com/apache/hadoop/pull/6739#discussion_r1584770438


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoUtils.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.crypto;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.bouncycastle.jce.provider.BouncyCastleProvider;
+import org.junit.Assert;
+import org.junit.Test;
+import org.slf4j.event.Level;
+
+import java.security.Provider;
+import java.security.Security;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_DEFAULT;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY;
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY;
+
+/** Test {@link CryptoUtils}. */
+public class TestCryptoUtils {
+  static {
+GenericTestUtils.setLogLevel(CryptoUtils.LOG, Level.TRACE);
+  }
+
+  @Test(timeout = 1_000)
+  public void testProviderName() {
+Assert.assertEquals(CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME, 
BouncyCastleProvider.PROVIDER_NAME);
+  }
+
+  static void assertRemoveProvider() {
+Security.removeProvider(BouncyCastleProvider.PROVIDER_NAME);
+
Assert.assertNull(Security.getProvider(BouncyCastleProvider.PROVIDER_NAME));
+  }
+
+  static void assertSetProvider(Configuration conf) {
+conf.set(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_KEY, 
CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME);
+final String providerFromConf = CryptoUtils.getJceProvider(conf);
+Assert.assertEquals(CryptoUtils.BOUNCY_CASTLE_PROVIDER_NAME, 
providerFromConf);
+  }
+
+  @Test(timeout = 5_000)
+  public void testAutoAddDisabled() {
+assertRemoveProvider();
+
+final Configuration conf = new Configuration();
+conf.setBoolean(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY, false);
+
+assertSetProvider(conf);
+
+
Assert.assertNull(Security.getProvider(BouncyCastleProvider.PROVIDER_NAME));
+  }
+
+  @Test(timeout = 5_000)
+  public void testAutoAddEnabled() {
+assertRemoveProvider();
+
+final Configuration conf = new Configuration();
+Assert.assertTrue("true".equalsIgnoreCase(
+conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY)));

Review Comment:
   For all the asserts other than assertNull and assertEquals, which generate 
error messages automatically, we do need useful error messages so that when a 
test run fails we can debug it.
   
   This is where AssertJ assertions beat JUnit.
   
   ```
   Assertions.assertThat(conf.get(HADOOP_SECURITY_CRYPTO_JCE_PROVIDER_AUTO_ADD_KEY))
       .describedAs("security provider from configuration")
       .isEqualToIgnoringCase("true");
   ```
   



##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoUtils.java:
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.crypto;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.test.GenericTestUtils;
+import org.bouncycastle.jce.provider.BouncyCastleProvider;
+import org.junit.Assert;
+import org.junit.Test;
+import org.slf4j.event.Level;
+
+import java.security.Provider;
+import java.security.Security;
+
+import static 

[jira] [Resolved] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19146.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
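
A hedged illustration of the mismatch reported above: S3A supports per-bucket 
overrides of the form fs.s3a.bucket.<name>.<option>, so the region of this one 
public bucket can be pinned while a different global fs.s3a.endpoint.region 
stays in place. Sketch only; it names the bucket from the report, not the 
committed fix.

{code:java}
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Global default region (the failing setup described in the report).
conf.set("fs.s3a.endpoint.region", "us-west-2");
// Per-bucket override: applies fs.s3a.endpoint.region to noaa-cors-pds only.
conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1");
{code}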

[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19146:

Priority: Minor  (was: Major)

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error: identical AWSRedirectException / S3 301 stack trace as quoted 
> in full above.

[jira] [Commented] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842367#comment-17842367
 ] 

ASF GitHub Bot commented on HADOOP-19146:
-

steveloughran commented on PR #6723:
URL: https://github.com/apache/hadoop/pull/6723#issuecomment-2085237232

   merged to trunk and 3.4; doesn't go into 3.3.x, and I'm not sure whether 
it's worth the effort




> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error: identical AWSRedirectException / S3 301 stack trace as quoted 
> in full above.

Re: [PR] HADOOP-19146 noaa-cors-pds bucket access with global endpoint fails [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on PR #6723:
URL: https://github.com/apache/hadoop/pull/6723#issuecomment-2085237232

   merged to trunk and 3.4; doesn't go into 3.3.x, and I'm not sure whether 
it's worth the effort


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18516) [ABFS]: Support fixed SAS token config in addition to Custom SASTokenProvider Implementation

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842352#comment-17842352
 ] 

ASF GitHub Bot commented on HADOOP-18516:
-

anujmodi2021 commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085145286

   Thanks @steveloughran for the review.
   This is good to be merged now.




> [ABFS]: Support fixed SAS token config in addition to Custom SASTokenProvider 
> Implementation
> 
>
> Key: HADOOP-18516
> URL: https://issues.apache.org/jira/browse/HADOOP-18516
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Sree Bhattacharyya
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
>
> This PR introduces a new configuration for Fixed SAS Tokens: 
> *"fs.azure.sas.fixed.token"*
> Using this new configuration, users can configure a fixed SAS Token in the 
> account settings file itself. Ideally, this should be used with SAS Tokens 
> that are scoped at a container or account level (Service or Account SAS), 
> which can be considered constant for one account or container across 
> multiple operations.
> The other method of using a SAS Token remains valid as well, where a user 
> provides a custom implementation of the SASTokenProvider interface, through 
> which a SAS Token is obtained.
> When an Account SAS Token is configured as the fixed SAS Token and it is 
> used, operations are ensured to stay within the scope of the SAS Token.
> The code checks whether the fixed token and the token provider class 
> implementation are configured. If both are set, preference is given to the 
> custom SASTokenProvider implementation. It must be noted that if such an 
> implementation provides a SAS Token with a lower scope than Account SAS, 
> some filesystem and service level operations might be out of scope and may 
> not succeed.
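
A hedged sketch of the precedence the description implies; the provider-class 
key name (fs.azure.sas.token.provider.type) is an assumption here, not quoted 
from the PR.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch only: the custom SASTokenProvider wins when both are configured.
public class SasSourceSketch {
  static String chooseSasSource(Configuration conf) {
    String providerClass = conf.getTrimmed("fs.azure.sas.token.provider.type", "");
    String fixedToken = conf.getTrimmed("fs.azure.sas.fixed.token", "");
    if (!providerClass.isEmpty()) {
      return "custom provider: " + providerClass;    // preferred
    }
    if (!fixedToken.isEmpty()) {
      return "fixed account/container-scoped token"; // fallback
    }
    return "no SAS configured";
  }
}
{code}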



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18516) [ABFS]: Support fixed SAS token config in addition to Custom SASTokenProvider Implementation

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842351#comment-17842351
 ] 

ASF GitHub Bot commented on HADOOP-18516:
-

anujmodi2021 commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085142098

   --
    AGGREGATED TEST RESULT 
   
   
   HNS-OAuth
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 2
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 76
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 54
   
   
   HNS-SharedKey
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 3
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 28
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 41
   
   
   NonHNS-SharedKey
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 9
   [WARNING] Tests run: 604, Failures: 0, Errors: 0, Skipped: 268
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 44
   
   
   AppendBlob-HNS-OAuth
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 2
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 78
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 78
   
   Time taken: 56 mins 59 secs.
   




> [ABFS]: Support fixed SAS token config in addition to Custom SASTokenProvider 
> Implementation
> 
>
> Key: HADOOP-18516
> URL: https://issues.apache.org/jira/browse/HADOOP-18516
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Sree Bhattacharyya
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
>
> This PR introduces a new configuration for Fixed SAS Tokens: 
> *"fs.azure.sas.fixed.token"*
> Using this new configuration, users can configure a fixed SAS Token in the 
> account settings file itself. Ideally, this should be used with SAS Tokens 
> that are scoped at a container or account level (Service or Account SAS), 
> which can be considered constant for one account or container across 
> multiple operations.
> The other method of using a SAS Token remains valid as well, where a user 
> provides a custom implementation of the SASTokenProvider interface, through 
> which a SAS Token is obtained.
> When an Account SAS Token is configured as the fixed SAS Token and it is 
> used, operations are ensured to stay within the scope of the SAS Token.
> The code checks whether the fixed token and the token provider class 
> implementation are configured. If both are set, preference is given to the 
> custom SASTokenProvider implementation. It must be noted that if such an 
> implementation provides a SAS Token with a lower scope than Account SAS, 
> some filesystem and service level operations might be out of scope and may 
> not succeed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18516: [ABFS][Authentication] Support Fixed SAS Token for ABFS Authentication [hadoop]

2024-04-30 Thread via GitHub


anujmodi2021 commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085145286

   Thanks @steveloughran for the review.
   This is good to be merged now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18516: [ABFS][Authentication] Support Fixed SAS Token for ABFS Authentication [hadoop]

2024-04-30 Thread via GitHub


anujmodi2021 commented on PR #6552:
URL: https://github.com/apache/hadoop/pull/6552#issuecomment-2085142098

   --
    AGGREGATED TEST RESULT 
   
   
   HNS-OAuth
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 2
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 76
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 54
   
   
   HNS-SharedKey
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 3
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 28
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 41
   
   
   NonHNS-SharedKey
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 9
   [WARNING] Tests run: 604, Failures: 0, Errors: 0, Skipped: 268
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 44
   
   
   AppendBlob-HNS-OAuth
   
   
   [WARNING] Tests run: 137, Failures: 0, Errors: 0, Skipped: 2
   [WARNING] Tests run: 620, Failures: 0, Errors: 0, Skipped: 78
   [WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 78
   
   Time taken: 56 mins 59 secs.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] YARN-11693. Refactor Container Scheduler [hadoop]

2024-04-30 Thread via GitHub


hadoop-yetus commented on PR #6786:
URL: https://github.com/apache/hadoop/pull/6786#issuecomment-2085033047

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 54s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |  24m  2s |  |  branch has errors when building 
and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 14s | 
[/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt)
 |  hadoop-yarn-server-nodemanager in the patch failed.  |
   | -1 :x: |  compile  |   0m 16s | 
[/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-yarn-server-nodemanager in the patch failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javac  |   0m 16s | 
[/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-yarn-server-nodemanager in the patch failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  compile  |   0m 15s | 
[/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-yarn-server-nodemanager in the patch failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -1 :x: |  javac  |   0m 15s | 
[/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-yarn-server-nodemanager in the patch failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 14s | 
[/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6786/1/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt)
 |  
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch 

[jira] [Commented] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842336#comment-17842336
 ] 

ASF GitHub Bot commented on HADOOP-19146:
-

steveloughran merged PR #6723:
URL: https://github.com/apache/hadoop/pull/6723




> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error: identical AWSRedirectException / S3 301 stack trace as quoted 
> in full above.

Re: [PR] HADOOP-19146 noaa-cors-pds bucket access with global endpoint fails [hadoop]

2024-04-30 Thread via GitHub


steveloughran merged PR #6723:
URL: https://github.com/apache/hadoop/pull/6723


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18610) ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS

2024-04-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842326#comment-17842326
 ] 

ASF GitHub Bot commented on HADOOP-18610:
-

creste commented on PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#issuecomment-2084911705

   @snvijaya - My team is no longer using Hadoop and has moved on to another 
project so I am unable to commit to completing this PR. Feel free to address 
the feedback and make any changes needed to get this merged.




> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> -
>
> Key: HADOOP-18610
> URL: https://issues.apache.org/jira/browse/HADOOP-18610
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.3.4
>Reporter: Haifeng Chen
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HADOOP-18610-preview.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity 
> with [Azure Active Directory (Azure AD) workload 
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
>  (preview), which integrate with the Kubernetes-native capabilities to 
> federate with any external identity provider. This approach is simpler to 
> use and deploy.
> Refer to 
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview] 
> and [https://azure.github.io/azure-workload-identity/docs/introduction.html] 
> for more details.
> The basic use scenario is to access Azure cloud resources (such as cloud 
> storage) from a Kubernetes (such as AKS) workload using an Azure managed 
> identity federated with a Kubernetes service account. The credential 
> environment variables projected into the pod by Azure AD workload identity 
> are like the following:
> AZURE_AUTHORITY_HOST: (Injected by the webhook, 
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, 
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT (JSON 
> Web Token) client assertion token which we can use in a request to 
> AZURE_AUTHORITY_HOST (url is AZURE_AUTHORITY_HOST + tenantId + 
> "/oauth2/v2.0/token") for an AD token which can be used to directly access 
> the Azure cloud resources.
> This approach is very common and similar among cloud providers such as AWS 
> and GCP. Hadoop's AWS integration has WebIdentityTokenCredentialProvider to 
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated 
> with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider 
> which handles the Azure Workload Identity case. For this, we need to add one 
> method (getTokenUsingJWTAssertion) to AzureADAuthenticator which will be used 
> by WorkloadIdentityTokenProvider.
>  
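
A hedged sketch of the exchange outlined above. The form fields follow the 
standard OAuth2 client-assertion grant, and the scope value is an assumption; 
none of this is taken from the actual AzureADAuthenticator code.

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WorkloadIdentityTokenSketch {
  public static void main(String[] args) throws Exception {
    String authority = System.getenv("AZURE_AUTHORITY_HOST");
    String tenantId = System.getenv("AZURE_TENANT_ID");
    String clientId = System.getenv("AZURE_CLIENT_ID");
    // The JWT client assertion projected into the pod by the webhook.
    String jwt = Files.readString(
        Paths.get(System.getenv("AZURE_FEDERATED_TOKEN_FILE"))).trim();

    // Standard client-assertion form body; the scope is assumed here.
    String form = "grant_type=client_credentials"
        + "&client_id=" + clientId
        + "&scope=https%3A%2F%2Fstorage.azure.com%2F.default"
        + "&client_assertion_type="
        + "urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer"
        + "&client_assertion=" + jwt;

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(authority + tenantId + "/oauth2/v2.0/token"))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(form))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body()); // JSON containing "access_token"
  }
}
{code}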



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18610. ABFS OAuth2 Token Provider support for Azure Workload Identity [hadoop]

2024-04-30 Thread via GitHub


creste commented on PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#issuecomment-2084911705

   @snvijaya - My team is no longer using Hadoop and has moved on to another 
project so I am unable to commit to completing this PR. Feel free to address 
the feedback and make any changes needed to get this merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[PR] YARN-11693. Refactor Container Scheduler [hadoop]

2024-04-30 Thread via GitHub


mohitgaggar opened a new pull request, #6786:
URL: https://github.com/apache/hadoop/pull/6786

   …, running container manager and container queue manager classes
   
   
   
   ### Description of PR
   
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [x ] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [x ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] YARN-11685. Create a config to enable/disable cgroup v2 functionality [hadoop]

2024-04-30 Thread via GitHub


brumi1024 commented on PR #6770:
URL: https://github.com/apache/hadoop/pull/6770#issuecomment-2084814001

   Thanks @p-szucs for the fixes, the latest state LGTM. Merging to trunk.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] YARN-11685. Create a config to enable/disable cgroup v2 functionality [hadoop]

2024-04-30 Thread via GitHub


brumi1024 merged PR #6770:
URL: https://github.com/apache/hadoop/pull/6770


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[PR] MAPREDUCE-7475. Fixed non-idempotent unit tests [hadoop]

2024-04-30 Thread via GitHub


kaiyaok2 opened a new pull request, #6785:
URL: https://github.com/apache/hadoop/pull/6785

   ### Description of PR
   As described in https://issues.apache.org/jira/browse/MAPREDUCE-7475: two 
tests are not idempotent and fail upon repeated execution within the same JVM 
instance due to self-induced state pollution. Specifically, these tests try to 
make the directory `TEST_ROOT_DIR` and write to it. The tests do not clean up 
(remove) the directory after execution. Therefore, in the second execution, 
`TEST_ROOT_DIR` would already exist and the exception `Could not create test 
dir` would be thrown. The tests should be fixed, since unit tests should be 
idempotent.
   
   Below are the 2 non-idempotent tests:
   
   * `org.apache.hadoop.mapred.TestOldCombinerGrouping.testCombiner`
   * `org.apache.hadoop.mapreduce.TestNewCombinerGrouping.testCombiner`
   
   ## Sample Failure Message for 
`org.apache.hadoop.mapreduce.TestNewCombinerGrouping.testCombiner` (in the 2nd 
run of the test):
   ```
   java.lang.RuntimeException: Could not create test dir: 
/home/kaiyaok2/NIOExperiments/github.com/apache/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/build/test/data/e4ee268d-2946-43a2-93b5-f2b4ac647279
   at 
org.apache.hadoop.mapreduce.TestNewCombinerGrouping.testCombiner(TestNewCombinerGrouping.java:109)
   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
   ```
   
   ### How was this patch tested?
   After the patch, rerunning the tests in the same JVM does not produce any 
exceptions.
   
   ### Code changes:
   Check if the test directory already exists (and delete it if so) before 
calling `mkdirs()`.
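   
   A minimal sketch of that pattern, with an assumed directory name; 
   `FileUtil.fullyDelete` is the usual Hadoop helper for recursive deletion:
   
   ```
   import java.io.File;
   import org.apache.hadoop.fs.FileUtil;
   
   // Remove residue from any previous run so mkdirs() cannot fail on rerun.
   File testRootDir = new File(System.getProperty("test.build.data", "/tmp"),
       "combiner-grouping-test");
   if (testRootDir.exists()) {
     FileUtil.fullyDelete(testRootDir);
   }
   if (!testRootDir.mkdirs()) {
     throw new RuntimeException("Could not create test dir: " + testRootDir);
   }
   ```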
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org