[
https://issues.apache.org/jira/browse/HADOOP-19696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040587#comment-18040587
]
ASF GitHub Bot commented on HADOOP-19696:
-----------------------------------------
pan3793 commented on PR #8094:
URL: https://github.com/apache/hadoop/pull/8094#issuecomment-3575282738
>> How does it affect `hadoop classpath`? As far as I know, many downstream
>> projects require adding the output of `hadoop classpath` to their classpath.
>
> - does nothing for maven/gradle imports of hadoop-cloud-storage
> - if you add hadoop/common/* and hadoop/common/lib/* to your CP you may
>   now get more cloud connectors
I specifically mean the output of the command `hadoop classpath`, which is
used in many projects, e.g., Flink, to guide users for Hadoop integration.
> - Important: Make sure that the `HADOOP_CLASSPATH` environment variable is
>   set up (it can be checked by running `echo $HADOOP_CLASSPATH`). If not, set
>   it up using
>
> ```bash
> export HADOOP_CLASSPATH=`hadoop classpath`
> ```
https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/deployment/resource-providers/yarn/
But according to the answer to the next question, it seems fine.
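For anyone who wants to check this locally, here is a minimal sketch of how to see which connector jars end up on the classpath that downstream projects consume. `list_connector_jars` is a hypothetical helper, not part of Hadoop, and the jar-name pattern is an assumption based on the module names mentioned in this issue (hadoop-aws, hadoop-azure, hadoop-gcs, hadoop-tos).

```bash
#!/usr/bin/env bash
# list_connector_jars: hypothetical helper, not shipped with Hadoop.
# Takes a colon-separated classpath string and prints the entries whose
# file name looks like a cloud connector jar. The name pattern is an
# assumption; adjust it to the modules in your build.
list_connector_jars() {
  local classpath="$1"
  printf '%s\n' "$classpath" | tr ':' '\n' \
    | grep -E '/hadoop-(aws|azure|gcs|tos)[^/]*\.jar$' || true
}

# Usage, assuming `hadoop` is on PATH. The --glob option expands wildcard
# entries such as .../common/lib/* into individual jar paths:
#   list_connector_jars "$(hadoop classpath --glob)"
```

Running it before and after the change should show whether `export HADOOP_CLASSPATH=$(hadoop classpath)` starts pulling in additional connector jars.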
>> Which maven profiles will be enabled by the Apache binary release? In
>> other words, how to reproduce it from the source release locally?
>
> It doesn't set any of the new profiles, meaning the hadoop-* jars will be
> there, abfs will work (limited dependencies) but anything else needs extra
> jars added.
@steveloughran, thanks for your detailed explanation. I don't have more
questions.
> hadoop binary distribution to move cloud connectors to hadoop common/lib
> ------------------------------------------------------------------------
>
> Key: HADOOP-19696
> URL: https://issues.apache.org/jira/browse/HADOOP-19696
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure, fs/gcs, fs/huawei, fs/s3
> Affects Versions: 3.4.2
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
>
> Place all the cloud connector hadoop-* artifacts and dependencies into
> hadoop/common/lib so that the stores can be directly accessed.
> * filesystem operations against abfs, s3a, gcs, etc don't need any effort
> setting things up.
> * Releases without the aws bundle.jar can be trivially updated by adding any
> version of the sdk libraries to the common/lib dir.
> This adds a lot more stuff into the distribution, so I'm doing the following
> design
> * all hadoop-* modules in common/lib
> * minimal dependencies for hadoop-azure and hadoop-gcs (once we get those
> right!)
> * hadoop-aws: everything except bundle.jar
> * other connectors: only included with explicit profiles.
> ASF releases will support azure out of the box, the others once you add the
> dependencies. And anyone can build their own release with everything.
> One concern here: we make the hadoop-cloud-storage artifact incomplete at
> pulling in things when depended on. We may need a separate module for the
> distro setup.
> Noticed during this that the hadoop-tos component is shaded and includes
> stuff (httpclient5) that we need under control. Filed HADOOP-19708 and
> incorporating here.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)