[
https://issues.apache.org/jira/browse/HADOOP-17799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695515#comment-17695515
]
ASF GitHub Bot commented on HADOOP-17799:
-----------------------------------------
trakos opened a new pull request, #5447:
URL: https://github.com/apache/hadoop/pull/5447
### Description of PR
`WebHdfsFileSystem` did not support HTTP BASIC authentication
(username/password). This patch adds that feature. When a filesystem URI is
specified, its credentials part (`user:pass@`) is now parsed and used for the
`Authorization` header.
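As a rough illustration of the credentials handling described above, the sketch below derives a BASIC `Authorization` value from the user-info part of a URI using only the JDK. The class and method names (`BasicAuthSketch`, `authorizationHeader`) are hypothetical and are not taken from the patch; the actual change inside `WebHdfsFileSystem` may be structured differently.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for illustration only; not code from the patch. */
final class BasicAuthSketch {
  private BasicAuthSketch() {}

  /**
   * Returns "Basic <base64(user:pass)>" built from the URI's user-info part,
   * or null when the URI carries no credentials.
   */
  static String authorizationHeader(URI fsUri) {
    // getUserInfo() yields the decoded "user:pass" portion, or null if absent.
    String userInfo = fsUri.getUserInfo();
    if (userInfo == null || userInfo.isEmpty()) {
      return null;
    }
    byte[] raw = userInfo.getBytes(StandardCharsets.UTF_8);
    return "Basic " + Base64.getEncoder().encodeToString(raw);
  }
}
```

For a URI such as `webhdfs://user:pass@host:9870/`, this sketch would produce `Basic dXNlcjpwYXNz`; a URI without `user:pass@` yields no header at all.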
Additionally, the base path specified in the filesystem URL used to be ignored.
This patch adds the configuration option `dfs.client.webhdfs.use-base-path`
that, when enabled, indicates that this path should be used as the API prefix.
This allows specifying `/gateway/gatewayname` when using WebHdfs through Apache
Knox. When the base path contains `/webhdfs/v1`, it is ignored, since we always
append that.
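The base-path rule above can be sketched as follows. This is only one reading of the description: it assumes that a trailing `/webhdfs/v1` in the base path is dropped before the same suffix is appended again, and that the flag simply switches the behavior on or off. The names `BasePathSketch`, `API_SUFFIX`, and `apiPrefix` are hypothetical and do not come from the patch.

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not code from the patch. */
final class BasePathSketch {
  /** The WebHDFS REST suffix that is always appended. */
  static final String API_SUFFIX = "/webhdfs/v1";

  private BasePathSketch() {}

  /**
   * Computes the API prefix for a filesystem URI.
   * With useBasePath == false the URI path is ignored (the old behavior).
   */
  static String apiPrefix(URI fsUri, boolean useBasePath) {
    if (!useBasePath) {
      return API_SUFFIX;
    }
    String base = fsUri.getPath() == null ? "" : fsUri.getPath();
    // Drop trailing slashes so "/gateway/docker/" and "/gateway/docker" match.
    while (base.endsWith("/")) {
      base = base.substring(0, base.length() - 1);
    }
    // Assumption: a base path that already ends with the suffix is not doubled.
    if (base.endsWith(API_SUFFIX)) {
      base = base.substring(0, base.length() - API_SUFFIX.length());
    }
    return base + API_SUFFIX;
  }
}
```

Under these assumptions, `swebhdfs://localhost:8443/gateway/docker/` maps to `/gateway/docker/webhdfs/v1` when the flag is enabled and to `/webhdfs/v1` when it is not.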
The option `dfs.client.webhdfs.use-base-path` defaults to false because
enabling it could break backward compatibility. Some WebHdfs users could have
typos or something random as the path, and before this patch it would simply be
ignored. Defaulting to false makes sure that no existing setup breaks.
Note that issue HADOOP-17799 is also about Kerberos auth. This patch only
addresses the base path and basic authentication portions. I didn't investigate
Kerberos auth since we don't use it in our setup.
### How was this patch tested?
I tested it with WebHdfs secured by Apache Knox with basic authentication,
but without Kerberos. I had a test script that would perform a file upload. I
set `dfs.client.webhdfs.use-base-path` to true, and used
`swebhdfs://admin:admin-password@localhost:8443/gateway/docker/` as filesystem
URI. Without my patch, both the `admin:admin-password` credentials and
`/gateway/docker` API base path would be ignored. With it, file upload worked.
### For code changes:
- [x] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> Improve the GitHub pull request template
> ----------------------------------------
>
> Key: HADOOP-17799
> URL: https://issues.apache.org/jira/browse/HADOOP-17799
> Project: Hadoop Common
> Issue Type: Task
> Components: build, documentation
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> The current Hadoop pull request template can be improved.
> - Require some information (e.g.
> https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE)
> - Checklists (e.g.
> https://github.com/apache/nifi/blob/main/.github/PULL_REQUEST_TEMPLATE.md)
> - Move current notice to comment (i.e. surround with <!-- and -->)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)