StephanEwen commented on issue #6663:  [FLINK-10209][build] Exclude jdk.tools 
dependency from hadoop 
URL: https://github.com/apache/flink/pull/6663#issuecomment-424054104
 
 
   This context may make my line of thinking easier to understand:
   
   The hadoop-shaded module is a convenience artifact, and we have previously 
discussed phasing it out eventually. It is used (1) to compile against (for 
HDFS / YARN / Kerberos code) and (2) to add as a jar to the lib folder.
   
     - Concerning a general exclusion for the jdk.tools dependency: Since we 
don't compile Hadoop itself and we don't redistribute that dependency (it is a 
system dependency), I cannot see how a general exclusion would be a problem. It 
simplifies the build files, which is a real benefit.
   
     - We should encourage use of HADOOP_CLASSPATH rather than use of our 
Hadoop fat jar anyway. That reduces the value of the second use of the 
hadoop-shaded project, the packaging into the dist lib folder. If we go purely 
for the HADOOP_CLASSPATH variant, we could remove that project altogether and 
simply have a provided or optional Hadoop dependency.
   
     - The fat hadoop jar is used for client-side functionality only, and since 
version 2, Hadoop claims to have a stable setup (HDFS protocol, Kerberos 
config, etc.), so we don't need a jar for each major/minor version; one per 
major version should work. We should not need the vendor-specific versions 
either. And there is still the HADOOP_CLASSPATH workaround in case any of the 
vendor-specific versions has a compatibility problem after all.
   
     - Concerning moving Hadoop to flink-shaded: We don't have to find a setup 
that converges across Hadoop versions; that is exactly the point. We pick some 
Hadoop versions for which we want to build convenience jars and converge these 
manually or by shading.
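   
   To make the first point concrete, a general exclusion could look roughly 
like the sketch below. The choice of hadoop-common as the artifact and the 
${hadoop.version} property are assumptions for illustration only, not copied 
from the PR diff; the same exclusion block would be repeated on whichever 
Hadoop artifacts pull in jdk.tools.
   
   ```xml
   <!-- Sketch only: artifact and version property are assumed, not taken from the PR -->
   <dependency>
     <groupId>org.apache.hadoop</groupId>
     <artifactId>hadoop-common</artifactId>
     <version>${hadoop.version}</version>
     <!-- With the HADOOP_CLASSPATH variant, the dependency could also be
          marked as provided instead of being packaged:
          <scope>provided</scope> -->
     <exclusions>
       <!-- jdk.tools is a system-scoped dependency; we neither compile
            Hadoop itself nor redistribute this jar -->
       <exclusion>
         <groupId>jdk.tools</groupId>
         <artifactId>jdk.tools</artifactId>
       </exclusion>
     </exclusions>
   </dependency>
   ```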
