[ 
https://issues.apache.org/jira/browse/FLINK-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004780#comment-16004780
 ] 

ASF GitHub Bot commented on FLINK-5998:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/3856
  
    I think the fix works in almost all cases.
    There's only one problem: your change causes the `flink-shaded-hadoop2` 
artifact on Maven Central to no longer expose any dependencies. So a module 
referencing it (like `flink-java`) will not see what `flink-shaded-hadoop2` 
contains. This can cause problems such as having the same classes on the 
classpath multiple times. Maven cannot "manage" the dependencies in that case, 
because it does not know what's inside `flink-shaded-hadoop2`.
    
    I don't have a good answer for how to solve this.
    Some ideas:
    - Relocate all Hadoop dependencies in the `flink-shaded-hadoop2` artifact. 
Then we won't run into the original problem anymore. I tried this once, but 
ran into problems getting the YARN tests to run afterwards.
    - Introduce a special "flink-shaded-hadoop2-dist" module that prepares a 
fat Hadoop jar for the binary distribution. This way, we can differentiate 
between `flink-shaded-hadoop2` as a dependency and as part of the binary 
distribution. But I think this will lead to problems when building 
`flink-dist`...
    - Merge this PR as is and hope that the problems don't occur (I think this 
is mostly relevant for people using Hadoop code in their user jar, for example 
when doing some Hadoop compatibility stuff).
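
    For illustration, the first idea (relocation) could look roughly like the 
`maven-shade-plugin` fragment below. This is only a sketch and not the 
configuration from the PR: the relocated package (`com.google.common`) and 
the shaded prefix are assumptions picked for the example.

    ```xml
    <!-- Hypothetical fragment for the flink-shaded-hadoop2 pom.xml -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <!-- Assumed example: move one of Hadoop's transitive
                   dependencies (Guava) under a separate package prefix.
                   The prefix below is illustrative only. -->
              <relocation>
                <pattern>com.google.common</pattern>
                <shadedPattern>org.example.shaded.hadoop2.com.google.common</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
    ```

    With relocation, the bundled classes live under a different package name, 
so a second copy of the same library on the classpath would no longer clash 
with them.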


> Un-fat Hadoop from Flink fat jar
> --------------------------------
>
>                 Key: FLINK-5998
>                 URL: https://issues.apache.org/jira/browse/FLINK-5998
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System
>            Reporter: Robert Metzger
>            Assignee: Haohui Mai
>             Fix For: 1.3.0
>
>
> As a first step towards FLINK-2268, I suggest putting all Hadoop 
> dependencies into a jar separate from Flink's fat jar.
> This would allow users to put a custom Hadoop jar in there, or even deploy 
> Flink without a Hadoop fat jar at all in environments where Hadoop is 
> provided (EMR).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
