Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Alexey Demin
Hi Max

No, it's main problem in flink runner
I investigate problem and found than AppClassLoader load this classes:

com.google.common.base.Absent
com.google.common.base.Function
com.google.common.base.Optional
com.google.common.base.Optional$Absent
com.google.common.base.Optional$Present

But Optional$Absent existed only in old version guava, after commit "Make
Optional GWT serializable" 3e82ff72b19d4230eba795b2760036eb18dfd4ff guava
split Optional on 3 different classes and made constructor for Optional
package private. Optional$Absent and Absent can be loaded together from
different versions of guava, but can't work.

but because FlinkUserCodeClassLoader have parent on AppClassLoader we have
problem with guava versions from hadoop classpath in cloudera distributions.

My config:
FlinkPipelineOptions options =
PipelineOptionsFactory.as(FlinkPipelineOptions.class);
options.setRunner(FlinkRunner.class);
options.setFlinkMaster("");

and execution command: java -jar runner-shaded.jar

Flink running on cloudera yarn.

My small hotfix it's addition configuration shaded plugin for my
application:



com.google.common
my.shaded.com.google.common



It remove conflict with AppClassLoader version of guava and alway use
correct version from my application.

Thanks,
Alexey Diomin


2016-12-05 14:50 GMT+04:00 Maximilian Michels :

> Hi Wayne,
>
> That seems like you ran that from the IDE. I'm interested how the output
> would look like on the cluster where you experienced your problems.
>
> Thanks,
> Max
>
> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:
>
>> Hi Max,
>>
>> Here's the output:
>> -
>> Optional: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
>> get/beam-starter-0.2.jar
>> Absent: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
>> get/beam-starter-0.2.jar
>> -
>>
>> Thanks for your help!
>> Wayne
>>
>>
>>
>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
>>
>>> Hi Wayne,
>>>
>>> Thanks for getting back to me. Could you compile a new version of your
>>> job with the following in your main method?
>>>
>>> URL location1 =
>>> com.google.common.base.Optional.class.getProtectionDomain().
>>> getCodeSource().getLocation();
>>> System.out.println("Optional: " + location1);
>>> URL location2 =
>>> Class.forName("com.google.common.base.Optional").getProtecti
>>> onDomain().getCodeSource().getLocation();
>>> System.out.println("Absent: " + location2);
>>>
>>>
>>> Could you run this on your cluster node with the flink command? This
>>> should give us a hint from where the Guava library is bootstrapped.
>>>
>>> Thanks,
>>> Max
>>>
>>>
>>> On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:
>>>
 Hi Max,

 Here is the result from the "flink run" launcher node (devbox):
 ---
 root@devbox:~# echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :/etc/hadoop-conf:/etc/yarn-conf:
 ---

 Here is the result from one of the Cloudera YARN nodes as root:
 ---
 [root@hadoop0 ~]# echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :::
 ---

 Here is the result from one of the Cloudera YARN nodes as yarn:
 ---
 [yarn@hadoop0 ~]$ echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :::
 ---


 Note that both the yarn-session.sh and the flink run commands are run as
 root on devbox.

 Software version details:
 devbox has these versions of the client software:
 flink-1.1.2
 hadoop-2.6.0
 kafka_2.11-0.9.0.1
 (also reproduced the problem with kafka_2.10-0.9.0.1)

 The cluster (providing YARN) is:
 CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
 Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)

 Thanks for your help!
 Wayne



 On 2016-12-01 12:54 PM, Maximilian Michels wrote:

 What is the output of the following on the nodes? I have a suspision
 that something sneaks in from one of the classpath variables that
 Flink picks up:

 echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}

 On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins 
 wrote:

 Hi Max,

 I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
 still seeing the following error message with the StreamWordCount demo
 code:

 Caused by: java.lang.IllegalAccessError: tried to access method
 com.google.common.base.Optional.()V from class
 com.google.common.base.Absent
  at com.google.common.base.Absent.(Absent.java:35)
  at com.google.common.base.Absent.(Absent.java:33)
 

Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Alexey Demin
Hi Max,

Optional:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar
Absent:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar

Step for reproducing:
1) CDH hadoop cluster
2) download and unpack flink-1.1.3-bin-hadoop26-scala_2.10.tgz
3) create YARN container:  HADOOP_USER_NAME=flink YARN_CONF_DIR=yarn-conf/
./bin/yarn-session.sh -n 1 -jm 1024 -tm 1024 -z /pr/test1
4) set ApplicationMaster and make one shaded jar
5) execute: java -cp target/example.jar  ClassloaderTest

p.s. source code for validation:
https://gist.github.com/xhumanoid/83e45ff8a9209b1d7115bda33d18c966

Thanks,
Alexey Diomin

2016-12-05 18:55 GMT+04:00 Maximilian Michels <m...@apache.org>:

> Hi Alexey,
>
> I suspected that Hadoop's classpath which is added after the main
> Flink classpath might sneak in an older version of Guava. However,
> when your jar is executed, the Flink class loader will make sure that
> your classes (including Beam's) will be loaded first. Normally,
> Hadoop's Guava would not make it in that way because it comes later in
> the classpath. However, there's probably some Hadoop code executed
> before, that loads the Guava classes.
>
> It is problematic that the Kafka module of Beam does not shade Guava
> when you use it in combination with Hadoop which does not shade Guava
> either. I couldn't manage to reproduce it on my cluster setup. The fix
> which Alexey suggested, should make it work because it relocates your
> Guava classes and thereby avoids a clash with Hadoop's version.
>
> I would still be interested in the output of the code I posted earlier
> on your cluster. Just to confirm the suspicion.
>
> Thanks,
> Max
>
> On Mon, Dec 5, 2016 at 2:22 PM, Alexey Demin <diomi...@gmail.com> wrote:
> > Hi Max
> >
> > No, it's main problem in flink runner
> > I investigate problem and found than AppClassLoader load this classes:
> >
> > com.google.common.base.Absent
> > com.google.common.base.Function
> > com.google.common.base.Optional
> > com.google.common.base.Optional$Absent
> > com.google.common.base.Optional$Present
> >
> > But Optional$Absent existed only in old version guava, after commit "Make
> > Optional GWT serializable" 3e82ff72b19d4230eba795b2760036eb18dfd4ff
> guava
> > split Optional on 3 different classes and made constructor for Optional
> > package private. Optional$Absent and Absent can be loaded together from
> > different versions of guava, but can't work.
> >
> > but because FlinkUserCodeClassLoader have parent on AppClassLoader we
> have
> > problem with guava versions from hadoop classpath in cloudera
> distributions.
> >
> > My config:
> > FlinkPipelineOptions options =
> > PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> > options.setRunner(FlinkRunner.class);
> > options.setFlinkMaster("");
> >
> > and execution command: java -jar runner-shaded.jar
> >
> > Flink running on cloudera yarn.
> >
> > My small hotfix it's addition configuration shaded plugin for my
> > application:
> >
> > 
> > 
> > com.google.common
> > my.shaded.com.google.common
> > 
> > 
> >
> > It remove conflict with AppClassLoader version of guava and alway use
> > correct version from my application.
> >
> > Thanks,
> > Alexey Diomin
> >
> >
> > 2016-12-05 14:50 GMT+04:00 Maximilian Michels <m...@apache.org>:
> >>
> >> Hi Wayne,
> >>
> >> That seems like you ran that from the IDE. I'm interested how the output
> >> would look like on the cluster where you experienced your problems.
> >>
> >> Thanks,
> >> Max
> >>
> >> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins <wayn...@dades.ca> wrote:
> >>>
> >>> Hi Max,
> >>>
> >>> Here's the output:
> >>> -
> >>> Optional:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./
> target/beam-starter-0.2.jar
> >>> Absent:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./
> target/beam-starter-0.2.jar
> >>> -
> >>>
> >>> Thanks for your help!
> >>> Wayne
> >>>
> >>>
> >>>
> >>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
> >>>>
> >>>> Hi Wayne,
> >>>>
> >>>> Thanks for getting back to me. Could you compile a new version of your
> >>>> job with the following in you