azagrebin commented on a change in pull request #10932: [FLINK-15614][docs] Consolidate Hadoop documentation
URL: https://github.com/apache/flink/pull/10932#discussion_r372393162
##########
File path: docs/ops/deployment/hadoop.md
##########
@@ -38,13 +38,18 @@ Referencing the HDFS configuration in the [Flink configuration]({{ site.baseurl
Another way to provide the Hadoop configuration is to have it on the class path of the Flink process, see more details below.
-## Adding Hadoop Classpaths
+## Providing Hadoop classes
-The required classes to use Hadoop should be available in the `lib/` folder of the Flink installation
-(on all machines running Flink) unless Flink is built with [Hadoop shaded dependencies]({{ site.baseurl }}/flinkDev/building.html#pre-bundled-versions).
+In order to use Hadoop features (e.g., YARN, HDFS) it is necessary to provide Flink with the required Hadoop classes,
+as these are not bundled by default.
-If putting the files into the directory is not possible, Flink also respects
-the `HADOOP_CLASSPATH` environment variable to add Hadoop jar files to the classpath.
+This can be done in 2 ways:
+* Adding the Hadoop classpath to Flink
Review comment:
Are there no expected dependency clashes in case of just exporting `HADOOP_CLASSPATH`?
In other words, why is the relocation needed for `/lib` but not for `HADOOP_CLASSPATH`?
I also somewhat liked the previous version's suggestion that the first way should be the recommended one, with option 2 only as a fallback in case of problems (with examples of such problems). Is that still the case?
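
For context, the `HADOOP_CLASSPATH` route discussed here is usually a one-line export on each machine that runs Flink. A minimal sketch, assuming the `hadoop` command-line tool is installed and on the `PATH` of those machines:

```sh
# Put the Hadoop jars on Flink's classpath without copying anything into lib/.
# Assumes the `hadoop` CLI is available on every machine running Flink.
export HADOOP_CLASSPATH=$(hadoop classpath)

# The Flink scripts pick up HADOOP_CLASSPATH from the environment,
# e.g. when starting a session cluster on YARN:
./bin/yarn-session.sh
```

Which of the two ways the documentation should recommend by default is one of the questions raised above.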