[ 
https://issues.apache.org/jira/browse/HADOOP-17799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695515#comment-17695515
 ] 

ASF GitHub Bot commented on HADOOP-17799:
-----------------------------------------

trakos opened a new pull request, #5447:
URL: https://github.com/apache/hadoop/pull/5447

   ### Description of PR
   
   `WebHdfsFileSystem` didn't provide any support for HTTP BASIC authentication 
(username/password). This patch adds that feature. When a filesystem URI is 
specified, its credentials part (`user:pass@`) is now parsed and used to build 
the `Authorization` header.
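
   As a rough illustration of the idea (not the code from this patch), the 
sketch below derives a BASIC `Authorization` header value from the user-info 
part of a filesystem URI. The class and method names (`BasicAuthSketch`, 
`basicAuthHeader`) are hypothetical, and the choice to return `null` when no 
`user:pass` pair is present is an assumption.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of WebHdfsFileSystem.
public class BasicAuthSketch {

    /**
     * Builds the value of an HTTP BASIC Authorization header from the
     * "user:pass" user-info of the given URI.
     * Returns null when the URI carries no user:pass pair (assumption).
     */
    static String basicAuthHeader(URI fsUri) {
        String userInfo = fsUri.getUserInfo(); // decoded "user:pass", or null
        if (userInfo == null || !userInfo.contains(":")) {
            return null;
        }
        String encoded = Base64.getEncoder()
            .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        URI uri = URI.create(
            "swebhdfs://admin:admin-password@localhost:8443/gateway/docker/");
        // Prints "Basic " followed by the Base64 form of "admin:admin-password".
        System.out.println(basicAuthHeader(uri));
    }
}
```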
   
   Additionally, the base path specified in the filesystem URL used to be 
ignored. This patch adds the configuration option 
`dfs.client.webhdfs.use-base-path` that, when enabled, indicates that this path 
should be used as the API prefix. This allows specifying 
`/gateway/gatewayname` when using WebHdfs through Apache Knox. When the base 
path contains `/webhdfs/v1`, it is ignored, since we always append that.
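
   The prefix handling described above could look roughly like the sketch 
below. It is an illustration under stated assumptions, not the patch itself: 
the names `BasePathSketch` and `apiPrefix` are hypothetical, and "ignored" is 
read here as dropping a trailing `/webhdfs/v1` from the base path before the 
suffix is appended again. The actual patch may treat that case differently.

```java
import java.net.URI;

// Hypothetical helper, not part of WebHdfsFileSystem.
public class BasePathSketch {

    static final String API_SUFFIX = "/webhdfs/v1";

    /**
     * Returns the path prefix for REST calls.
     * useBasePath stands in for dfs.client.webhdfs.use-base-path.
     */
    static String apiPrefix(URI fsUri, boolean useBasePath) {
        if (!useBasePath) {
            return API_SUFFIX; // behaviour before the patch: URI path ignored
        }
        String base = fsUri.getPath() == null ? "" : fsUri.getPath();
        // Drop trailing slashes so we never emit "//webhdfs/v1".
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        // Assumption: a base path already ending in /webhdfs/v1 has that
        // suffix removed, since it is always appended below.
        if (base.endsWith(API_SUFFIX)) {
            base = base.substring(0, base.length() - API_SUFFIX.length());
        }
        return base + API_SUFFIX;
    }

    public static void main(String[] args) {
        URI uri = URI.create("swebhdfs://localhost:8443/gateway/docker/");
        System.out.println(apiPrefix(uri, true));
        System.out.println(apiPrefix(uri, false));
    }
}
```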
   
   The option `dfs.client.webhdfs.use-base-path` defaults to false because 
enabling it by default could break backward compatibility. Some WebHdfs users 
may have typos or arbitrary text in the path, which was silently ignored before 
this patch. Defaulting to false ensures that existing setups keep working.
   
   Note that issue HADOOP-17799 is also about Kerberos auth. This patch only 
addresses the base path and basic authentication portions. I didn't investigate 
the Kerberos auth since we don't use it in our setup.
   
   ### How was this patch tested?
   
   I tested it with WebHdfs secured by Apache Knox with basic authentication, 
but without Kerberos. I had a test script that would perform a file upload. I 
set `dfs.client.webhdfs.use-base-path` to true, and used 
`swebhdfs://admin:admin-password@localhost:8443/gateway/docker/` as the 
filesystem URI. Without my patch, both the `admin:admin-password` credentials 
and the `/gateway/docker` API base path would be ignored. With it, the file 
upload worked.
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> Improve the GitHub pull request template
> ----------------------------------------
>
>                 Key: HADOOP-17799
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17799
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: build, documentation
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The current Hadoop pull request template can be improved.
> - Require some information (e.g. 
> https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE)
> - Checklists (e.g. 
> https://github.com/apache/nifi/blob/main/.github/PULL_REQUEST_TEMPLATE.md)
> - Move current notice to comment (i.e. surround with <!-- and  -->)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)