Hi Max,

Optional:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar
Absent:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar

Steps to reproduce:
1) a CDH Hadoop cluster
2) download and unpack flink-1.1.3-bin-hadoop26-scala_2.10.tgz
3) create a YARN container: HADOOP_USER_NAME=flink YARN_CONF_DIR=yarn-conf/ ./bin/yarn-session.sh -n 1 -jm 1024 -tm 1024 -z /pr/test1
4) set the ApplicationMaster address and build one shaded jar
5) execute: java -cp target/example.jar ClassloaderTest

p.s. source code for validation:
https://gist.github.com/xhumanoid/83e45ff8a9209b1d7115bda33d18c966
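For reference, a minimal sketch of the kind of check such a ClassloaderTest can perform (the actual gist may differ; the class name and default argument here are illustrative):

```java
// Hedged sketch (the real gist may differ): print the code source a class
// was loaded from, to see whether Guava comes from the job jar or from the
// CDH parcel's guava-11.0.2.jar.
import java.security.CodeSource;

public class ClassloaderTest {

    // Returns the jar/location a class was loaded from, or "bootstrap"
    // for classes without a code source (JDK bootstrap classes).
    static String origin(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return src == null ? "bootstrap" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Default to the Guava class at the centre of this thread.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Optional";
        System.out.println(name + ": " + origin(name));
    }
}
```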

Thanks,
Alexey Diomin

2016-12-05 18:55 GMT+04:00 Maximilian Michels <[email protected]>:

> Hi Alexey,
>
> I suspected that Hadoop's classpath which is added after the main
> Flink classpath might sneak in an older version of Guava. However,
> when your jar is executed, the Flink class loader will make sure that
> your classes (including Beam's) will be loaded first. Normally,
> Hadoop's Guava would not make it in that way because it comes later in
> the classpath. However, there's probably some Hadoop code executed
> before, that loads the Guava classes.
>
> It is problematic that the Kafka module of Beam does not shade Guava
> when you use it in combination with Hadoop which does not shade Guava
> either. I couldn't manage to reproduce it on my cluster setup. The fix
> Alexey suggested should make it work, because it relocates your
> Guava classes and thereby avoids a clash with Hadoop's version.
>
> I would still be interested in the output of the code I posted earlier
> on your cluster. Just to confirm the suspicion.
>
> Thanks,
> Max
>
> On Mon, Dec 5, 2016 at 2:22 PM, Alexey Demin <[email protected]> wrote:
> > Hi Max
> >
> > No, the main problem is in the Flink runner.
> > I investigated the problem and found that the AppClassLoader loads these classes:
> >
> > com.google.common.base.Absent
> > com.google.common.base.Function
> > com.google.common.base.Optional
> > com.google.common.base.Optional$Absent
> > com.google.common.base.Optional$Present
> >
> > But Optional$Absent existed only in older versions of Guava. After the
> > commit "Make Optional GWT serializable"
> > (3e82ff72b19d4230eba795b2760036eb18dfd4ff), Guava split Optional into 3
> > different classes and made the Optional constructor package-private.
> > Optional$Absent and Absent can be loaded together from different
> > versions of Guava, but they cannot work together.
> >
> > But because FlinkUserCodeClassLoader has the AppClassLoader as its parent,
> > we have a problem with the Guava versions from the Hadoop classpath in
> > Cloudera distributions.
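The parent-first delegation described above can be seen in a few lines: a child loader whose parent already has a class will always hand back the parent's copy (a self-contained sketch of the default JVM behaviour, not Flink's actual FlinkUserCodeClassLoader):

```java
// Sketch of parent-first delegation, the default JVM behaviour that makes
// a user-code classloader pick up the parent's (here: Hadoop's) Guava.
import java.net.URL;
import java.net.URLClassLoader;

public class ParentFirstDemo {
    public static void main(String[] args) throws Exception {
        // A child loader with no URLs of its own, delegating to the loader
        // that already defined ParentFirstDemo.
        URLClassLoader child = new URLClassLoader(
                new URL[0], ParentFirstDemo.class.getClassLoader());

        Class<?> c = child.loadClass("ParentFirstDemo");

        // Parent-first: the child returns the parent's class object, so a
        // class resolved through the child still comes from the parent.
        System.out.println(c == ParentFirstDemo.class); // prints "true"
    }
}
```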
> >
> > My config:
> > FlinkPipelineOptions options = PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> > options.setRunner(FlinkRunner.class);
> > options.setFlinkMaster("<host:port>");
> >
> > and execution command: java -jar runner-shaded.jar
> >
> > Flink running on cloudera yarn.
> >
> > My small hotfix is an additional shade-plugin configuration for my
> > application:
> >
> > <relocations>
> >     <relocation>
> >         <pattern>com.google.common</pattern>
> >         <shadedPattern>my.shaded.com.google.common</shadedPattern>
> >     </relocation>
> > </relocations>
> >
> > It removes the conflict with the AppClassLoader's version of Guava and
> > always uses the correct version from my application.
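To confirm that such a relocation actually took effect, the shaded jar can be inspected for relocated vs. un-relocated Guava entries. A sketch (the ShadeCheck class is illustrative; the package prefixes match the relocation above, and the jar path is whatever your build produces):

```java
// Sketch: count relocated vs. un-relocated Guava entries in a shaded jar.
// A correctly shaded jar should show relocated > 0 and un-relocated == 0.
import java.util.jar.JarFile;

public class ShadeCheck {

    // Returns {relocatedCount, unrelocatedCount} for the given jar.
    static long[] count(String jarPath) throws Exception {
        try (JarFile jar = new JarFile(jarPath)) {
            long relocated = jar.stream()
                    .filter(e -> e.getName().startsWith("my/shaded/com/google/common/"))
                    .count();
            long unrelocated = jar.stream()
                    .filter(e -> e.getName().startsWith("com/google/common/"))
                    .count();
            return new long[] {relocated, unrelocated};
        }
    }

    public static void main(String[] args) throws Exception {
        // "runner-shaded.jar" is the jar name used earlier in this thread.
        String path = args.length > 0 ? args[0] : "target/runner-shaded.jar";
        long[] c = count(path);
        System.out.println("relocated: " + c[0] + ", un-relocated: " + c[1]);
    }
}
```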
> >
> > Thanks,
> > Alexey Diomin
> >
> >
> > 2016-12-05 14:50 GMT+04:00 Maximilian Michels <[email protected]>:
> >>
> >> Hi Wayne,
> >>
> >> It seems like you ran that from the IDE. I'm interested in how the output
> >> would look on the cluster where you experienced your problems.
> >>
> >> Thanks,
> >> Max
> >>
> >> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins <[email protected]> wrote:
> >>>
> >>> Hi Max,
> >>>
> >>> Here's the output:
> >>> ---------------------
> >>> Optional:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
> >>> Absent:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
> >>> ---------------------
> >>>
> >>> Thanks for your help!
> >>> Wayne
> >>>
> >>>
> >>>
> >>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
> >>>>
> >>>> Hi Wayne,
> >>>>
> >>>> Thanks for getting back to me. Could you compile a new version of your
> >>>> job with the following in your main method?
> >>>>
> >>>> URL location1 = com.google.common.base.Optional.class
> >>>>     .getProtectionDomain().getCodeSource().getLocation();
> >>>> System.out.println("Optional: " + location1);
> >>>> URL location2 = Class.forName("com.google.common.base.Optional")
> >>>>     .getProtectionDomain().getCodeSource().getLocation();
> >>>> System.out.println("Absent: " + location2);
> >>>>
> >>>>
> >>>> Could you run this on your cluster node with the flink command? This
> >>>> should give us a hint about where the Guava library is loaded from.
> >>>>
> >>>> Thanks,
> >>>> Max
> >>>>
> >>>>
> >>>> On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins <[email protected]> wrote:
> >>>>>
> >>>>> Hi Max,
> >>>>>
> >>>>> Here is the result from the "flink run" launcher node (devbox):
> >>>>> -----------------------
> >>>>> root@devbox:~# echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> >>>>> :/etc/hadoop-conf:/etc/yarn-conf:
> >>>>> -----------------------
> >>>>>
> >>>>> Here is the result from one of the Cloudera YARN nodes as root:
> >>>>> -----------------------
> >>>>> [root@hadoop0 ~]# echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> >>>>> :::
> >>>>> -----------------------
> >>>>>
> >>>>> Here is the result from one of the Cloudera YARN nodes as yarn:
> >>>>> -----------------------
> >>>>> [yarn@hadoop0 ~]$ echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> >>>>> :::
> >>>>> -----------------------
> >>>>>
> >>>>>
> >>>>> Note that both the yarn-session.sh and the flink run commands are run
> >>>>> as root on devbox.
> >>>>>
> >>>>> Software version details:
> >>>>> devbox has these versions of the client software:
> >>>>> flink-1.1.2
> >>>>> hadoop-2.6.0
> >>>>> kafka_2.11-0.9.0.1
> >>>>> (also reproduced the problem with kafka_2.10-0.9.0.1)
> >>>>>
> >>>>> The cluster (providing YARN) is:
> >>>>> CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
> >>>>> Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)
> >>>>>
> >>>>> Thanks for your help!
> >>>>> Wayne
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2016-12-01 12:54 PM, Maximilian Michels wrote:
> >>>>>
> >>>>> What is the output of the following on the nodes? I have a suspicion
> >>>>> that something sneaks in from one of the classpath variables that
> >>>>> Flink picks up:
> >>>>>
> >>>>> echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> >>>>>
> >>>>> On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins <[email protected]> wrote:
> >>>>>
> >>>>> Hi Max,
> >>>>>
> >>>>> I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
> >>>>> still seeing the following error message with the StreamWordCount demo
> >>>>> code:
> >>>>>
> >>>>> Caused by: java.lang.IllegalAccessError: tried to access method
> >>>>> com.google.common.base.Optional.<init>()V from class
> >>>>> com.google.common.base.Absent
> >>>>>          at com.google.common.base.Absent.<init>(Absent.java:35)
> >>>>>          at com.google.common.base.Absent.<clinit>(Absent.java:33)
> >>>>>          at sun.misc.Unsafe.ensureClassInitialized(Native Method)
> >>>>> ...
> >>>>>
> >>>>>
> >>>>> (snip)
> >>>
> >>>
> >>
> >
>
