Re: java.lang.IllegalAccessError with KafkaIO

2016-12-06 Thread Maximilian Michels
Glad we were able to solve your problem. This is a very common issue
in the Java world: different versions of the same library end up being
loaded when project dependencies are combined.

Alexey and I were both on the right track. As it turns out, it was
Guava v11 loaded from CDH that clashed with Beam's Guava v18. Shading
away Beam's Guava resolves the issue.
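[For readers hitting the same clash: the relocation Alexey described goes into the Maven Shade Plugin configuration of the job's pom.xml. A minimal sketch of that configuration follows; the `my.shaded` prefix is only a placeholder, pick any package that will not collide with anything else on your classpath.]

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite Guava's packages inside the job jar so they can
               never collide with the Guava shipped by CDH/Hadoop. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>my.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```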

I've created an issue to address the root cause of the class conflict:
https://issues.apache.org/jira/browse/BEAM-1092

Thanks,
Max

On Mon, Dec 5, 2016 at 10:26 PM, Wayne Collins  wrote:
> That's the culprit and Alexey's suggested shading resolved the problem for
> me.
>
> Max and Alexey, thanks for all your help on this one!
> Wayne
>
>
>
> On 2016-12-05 11:02 AM, Alexey Demin wrote:
>
> Hi Max,
>
> Optional:
> file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar
> Absent:
> file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar
>
> Steps to reproduce:
> 1) CDH hadoop cluster
> 2) download and unpack flink-1.1.3-bin-hadoop26-scala_2.10.tgz
> 3) create YARN container:  HADOOP_USER_NAME=flink YARN_CONF_DIR=yarn-conf/
> ./bin/yarn-session.sh -n 1 -jm 1024 -tm 1024 -z /pr/test1
> 4) set ApplicationMaster and make one shaded jar
> 5) execute: java -cp target/example.jar  ClassloaderTest
>
> p.s. source code for validation:
> https://gist.github.com/xhumanoid/83e45ff8a9209b1d7115bda33d18c966
>
> Thanks,
> Alexey Diomin
>
> 2016-12-05 18:55 GMT+04:00 Maximilian Michels :
>>
>> Hi Alexey,
>>
>> I suspected that Hadoop's classpath which is added after the main
>> Flink classpath might sneak in an older version of Guava. However,
>> when your jar is executed, the Flink class loader will make sure that
>> your classes (including Beam's) will be loaded first. Normally,
>> Hadoop's Guava would not make it in that way because it comes later in
>> the classpath. However, there's probably some Hadoop code executed
>> before, that loads the Guava classes.
>>
>> It is problematic that the Kafka module of Beam does not shade Guava
>> when you use it in combination with Hadoop which does not shade Guava
>> either. I couldn't manage to reproduce it on my cluster setup. The fix
>> which Alexey suggested should make it work because it relocates your
>> Guava classes and thereby avoids a clash with Hadoop's version.
>>
>> I would still be interested in the output of the code I posted earlier
>> on your cluster. Just to confirm the suspicion.
>>
>> Thanks,
>> Max
>>
>> On Mon, Dec 5, 2016 at 2:22 PM, Alexey Demin  wrote:
>> > Hi Max
>> >
>> > No, the main problem is in the Flink runner.
>> > I investigated and found that the AppClassLoader loads these classes:
>> >
>> > com.google.common.base.Absent
>> > com.google.common.base.Function
>> > com.google.common.base.Optional
>> > com.google.common.base.Optional$Absent
>> > com.google.common.base.Optional$Present
>> >
>> > Optional$Absent existed only in older versions of Guava. After the
>> > commit "Make Optional GWT serializable"
>> > (3e82ff72b19d4230eba795b2760036eb18dfd4ff), Guava split Optional into
>> > three separate classes and made the Optional constructor package
>> > private. Optional$Absent and Absent can be loaded together from
>> > different versions of Guava, but they cannot work together.
>> >
>> > And because FlinkUserCodeClassLoader has the AppClassLoader as its
>> > parent, we get a conflict with the Guava version from the Hadoop
>> > classpath in Cloudera distributions.
>> >
>> > My config:
>> > FlinkPipelineOptions options =
>> > PipelineOptionsFactory.as(FlinkPipelineOptions.class);
>> > options.setRunner(FlinkRunner.class);
>> > options.setFlinkMaster("");
>> >
>> > and execution command: java -jar runner-shaded.jar
>> >
>> > Flink running on cloudera yarn.
>> >
>> > My small hotfix is adding this relocation to the Maven Shade Plugin
>> > configuration of my application:
>> >
>> > <relocations>
>> >   <relocation>
>> >     <pattern>com.google.common</pattern>
>> >     <shadedPattern>my.shaded.com.google.common</shadedPattern>
>> >   </relocation>
>> > </relocations>
>> >
>> > It removes the conflict with the AppClassLoader's version of Guava and
>> > always uses the correct version from my application.
>> >
>> > Thanks,
>> > Alexey Diomin
>> >
>> >
>> > 2016-12-05 14:50 GMT+04:00 Maximilian Michels :
>> >>
>> >> Hi Wayne,
>> >>
>> >> That seems like you ran it from the IDE. I'm interested in how the
>> >> output looks on the cluster where you experienced your problems.
>> >>
>> >> Thanks,
>> >> Max
>> >>
>> >> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:
>> >>>
>> >>> Hi Max,
>> >>>
>> >>> Here's the output:
>> >>> -
>> >>> Optional:
>> >>>
>> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
>> >>> Absent:
>> >>>
>> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
>> >>> -
>> >>>
>> >>> Thanks for your help!
>> >>> Wayne
>> >>>
>> >>>
>> >>>
>> >>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
>> 
>>  Hi Wayne,
>> 
>>  Thanks for getting back to me. Could you compile a new version of your
>>  job with the following in your main method?
>> 
>>  URL location1 =
>>      com.google.common.base.Optional.class.getProtectionDomain().getCodeSource().getLocation();
>>  System.out.println("Optional: " + location1);
>>  URL location2 =
>>      Class.forName("com.google.common.base.Optional").getProtectionDomain().getCodeSource().getLocation();
>>  System.out.println("Absent: " + location2);
>> 
>>  Could you run this on your cluster node with the flink command? This
>>  should give us a hint from where the Guava library is bootstrapped.
>> 
>>  Thanks,
>>  Max


Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Wayne Collins

Hi Max,

That was the output from "flink run -c 
com.dataradiant.beam.examples.StreamWordCount ./target/beam-starter-0.2.jar"


From the Eclipse IDE, I get:
Optional: 
file:/home/wayneco/.m2/repository/com/google/guava/guava/19.0/guava-19.0.jar
Absent: 
file:/home/wayneco/.m2/repository/com/google/guava/guava/19.0/guava-19.0.jar


The demo code and pom.xml need some tweaking to launch the Flink job
itself (i.e. java -cp ...), and I wanted to keep the code as close to
the original as possible.


Trying Alexey's suggested shading now...

Thanks,
Wayne
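[A side note for the archive: Max's snippet prints only the winning copy of Optional. A small variant enumerates every copy visible to the classloader, which makes a shadowed duplicate obvious. This is a hedged sketch, not from the thread; the class name is mine, and with no arguments it demonstrates the mechanism on java/lang/Object.class rather than on Guava.]

```java
import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;

// Lists every classpath entry containing a given class file, not just
// the one the classloader resolves first. Pass
// "com/google/common/base/Optional.class" on a Guava-bearing classpath
// to see all competing Guava jars at once.
public class ClasspathCopies {
    public static void main(String[] args) throws IOException {
        String resource = args.length > 0
                ? args[0]
                : "java/lang/Object.class";
        Enumeration<URL> copies = ClasspathCopies.class.getClassLoader()
                .getResources(resource);
        int n = 0;
        while (copies.hasMoreElements()) {
            n++;
            System.out.println("copy " + n + ": " + copies.nextElement());
        }
        System.out.println(n + " copies of " + resource);
    }
}
```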


On 2016-12-05 05:50 AM, Maximilian Michels wrote:

Hi Wayne,

That seems like you ran it from the IDE. I'm interested in how the
output looks on the cluster where you experienced your problems.


Thanks,
Max

On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins wrote:


Hi Max,

Here's the output:
-
Optional:

file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
Absent:

file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
-

Thanks for your help!
Wayne



On 2016-12-02 08:42 AM, Maximilian Michels wrote:

Hi Wayne,

Thanks for getting back to me. Could you compile a new version of your
job with the following in your main method?

URL location1 =
    com.google.common.base.Optional.class.getProtectionDomain().getCodeSource().getLocation();
System.out.println("Optional: " + location1);
URL location2 =
    Class.forName("com.google.common.base.Optional").getProtectionDomain().getCodeSource().getLocation();
System.out.println("Absent: " + location2);

Could you run this on your cluster node with the flink command? This
should give us a hint from where the Guava library is bootstrapped.

Thanks,
Max


On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins wrote:

Hi Max,

Here is the result from the "flink run" launcher node
(devbox):
---
root@devbox:~# echo

${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:/etc/hadoop-conf:/etc/yarn-conf:
---

Here is the result from one of the Cloudera YARN nodes as
root:
---
[root@hadoop0 ~]# echo

${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---

Here is the result from one of the Cloudera YARN nodes as
yarn:
---
[yarn@hadoop0 ~]$ echo

${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---


Note that both the yarn-session.sh and the flink run
commands are run as
root on devbox.

Software version details:
devbox has these versions of the client software:
flink-1.1.2
hadoop-2.6.0
kafka_2.11-0.9.0.1
(also reproduced the problem with kafka_2.10-0.9.0.1)

The cluster (providing YARN) is:
CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)

Thanks for your help!
Wayne



On 2016-12-01 12:54 PM, Maximilian Michels wrote:

What is the output of the following on the nodes? I have a suspicion
that something sneaks in from one of the classpath variables that
Flink picks up:

echo

${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}

On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins wrote:

Hi Max,

I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
still seeing the following error message with the StreamWordCount demo
code:

Caused by: java.lang.IllegalAccessError: tried to access method
com.google.common.base.Optional.<init>()V from class
com.google.common.base.Absent
 at com.google.common.base.Absent.<init>(Absent.java:35)
 at com.google.common.base.Absent.<clinit>(Absent.java:33)
 at sun.misc.Unsafe.ensureClassInitialized(Native Method)
...


(snip)
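[The class mix-up Alexey later pinpointed can also be probed directly. This is a hedged sketch, not from the thread: the class name is mine, and the two hardcoded names reflect the pre-split (inner Optional$Absent) and post-split (top-level Absent) Guava layouts discussed above. Seeing both resolve, from different locations, is the mixed-Guava situation behind the IllegalAccessError.]

```java
// Reports whether the pre-split inner class Optional$Absent and the
// post-split top-level Absent are reachable, and from which jar each
// one is loaded.
public class GuavaGenerationProbe {
    public static void main(String[] args) {
        report("com.google.common.base.Optional$Absent"); // old layout
        report("com.google.common.base.Absent");          // new layout
    }

    static void report(String name) {
        try {
            Class<?> c = Class.forName(name);
            java.security.CodeSource src =
                    c.getProtectionDomain().getCodeSource();
            System.out.println("probe: " + name + " -> "
                    + (src == null ? "bootstrap" : src.getLocation()));
        } catch (ClassNotFoundException e) {
            System.out.println("probe: " + name + " -> not on classpath");
        }
    }
}
```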







Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Alexey Demin
Hi Max,

Optional:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar
Absent:
file:/opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/jars/guava-11.0.2.jar

Step for reproducing:
1) CDH hadoop cluster
2) download and unpack flink-1.1.3-bin-hadoop26-scala_2.10.tgz
3) create YARN container:  HADOOP_USER_NAME=flink YARN_CONF_DIR=yarn-conf/
./bin/yarn-session.sh -n 1 -jm 1024 -tm 1024 -z /pr/test1
4) set ApplicationMaster and make one shaded jar
5) execute: java -cp target/example.jar  ClassloaderTest

p.s. source code for validation:
https://gist.github.com/xhumanoid/83e45ff8a9209b1d7115bda33d18c966

Thanks,
Alexey Diomin

2016-12-05 18:55 GMT+04:00 Maximilian Michels :

> Hi Alexey,
>
> I suspected that Hadoop's classpath which is added after the main
> Flink classpath might sneak in an older version of Guava. However,
> when your jar is executed, the Flink class loader will make sure that
> your classes (including Beam's) will be loaded first. Normally,
> Hadoop's Guava would not make it in that way because it comes later in
> the classpath. However, there's probably some Hadoop code executed
> before, that loads the Guava classes.
>
> It is problematic that the Kafka module of Beam does not shade Guava
> when you use it in combination with Hadoop which does not shade Guava
> either. I couldn't manage to reproduce it on my cluster setup. The fix
> which Alexey suggested, should make it work because it relocates your
> Guava classes and thereby avoids a clash with Hadoop's version.
>
> I would still be interested in the output of the code I posted earlier
> on your cluster. Just to confirm the suspicion.
>
> Thanks,
> Max
>
> On Mon, Dec 5, 2016 at 2:22 PM, Alexey Demin  wrote:
> > Hi Max
> >
> > No, it's main problem in flink runner
> > I investigate problem and found than AppClassLoader load this classes:
> >
> > com.google.common.base.Absent
> > com.google.common.base.Function
> > com.google.common.base.Optional
> > com.google.common.base.Optional$Absent
> > com.google.common.base.Optional$Present
> >
> > But Optional$Absent existed only in old version guava, after commit "Make
> > Optional GWT serializable" 3e82ff72b19d4230eba795b2760036eb18dfd4ff
> guava
> > split Optional on 3 different classes and made constructor for Optional
> > package private. Optional$Absent and Absent can be loaded together from
> > different versions of guava, but can't work.
> >
> > but because FlinkUserCodeClassLoader have parent on AppClassLoader we
> have
> > problem with guava versions from hadoop classpath in cloudera
> distributions.
> >
> > My config:
> > FlinkPipelineOptions options =
> > PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> > options.setRunner(FlinkRunner.class);
> > options.setFlinkMaster("");
> >
> > and execution command: java -jar runner-shaded.jar
> >
> > Flink running on cloudera yarn.
> >
> > My small hotfix it's addition configuration shaded plugin for my
> > application:
> >
> > 
> > 
> > com.google.common
> > my.shaded.com.google.common
> > 
> > 
> >
> > It remove conflict with AppClassLoader version of guava and alway use
> > correct version from my application.
> >
> > Thanks,
> > Alexey Diomin
> >
> >
> > 2016-12-05 14:50 GMT+04:00 Maximilian Michels :
> >>
> >> Hi Wayne,
> >>
> >> That seems like you ran that from the IDE. I'm interested how the output
> >> would look like on the cluster where you experienced your problems.
> >>
> >> Thanks,
> >> Max
> >>
> >> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:
> >>>
> >>> Hi Max,
> >>>
> >>> Here's the output:
> >>> -
> >>> Optional:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./
> target/beam-starter-0.2.jar
> >>> Absent:
> >>> file:/home/wayneco/workspace/beam-starter/beam-starter/./
> target/beam-starter-0.2.jar
> >>> -
> >>>
> >>> Thanks for your help!
> >>> Wayne
> >>>
> >>>
> >>>
> >>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
> 
>  Hi Wayne,
> 
>  Thanks for getting back to me. Could you compile a new version of your
>  job with the following in your main method?
> 
>  URL location1 =
> 
>  com.google.common.base.Optional.class.getProtectionDomain().
> getCodeSource().getLocation();
>  System.out.println("Optional: " + location1);
>  URL location2 =
> 
>  Class.forName("com.google.common.base.Optional").
> getProtectionDomain().getCodeSource().getLocation();
>  System.out.println("Absent: " + location2);
> 
> 
>  Could you run this on your cluster node with the flink command? This
>  should give us a hint from where the Guava library is bootstrapped.
> 
>  Thanks,
>  Max
> 
> 
>  On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins 
> wrote:
> >
> > Hi Max,
> >
> > Here is the result from the "flink run" launcher node (devbox):
> > ---
> > root@devbox:~# echo
> >
> > ${HADOOP_

Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Maximilian Michels
Hi Alexey,

I suspected that Hadoop's classpath which is added after the main
Flink classpath might sneak in an older version of Guava. However,
when your jar is executed, the Flink class loader will make sure that
your classes (including Beam's) will be loaded first. Normally,
Hadoop's Guava would not make it in that way because it comes later in
the classpath. However, there's probably some Hadoop code executed
before, that loads the Guava classes.

It is problematic that the Kafka module of Beam does not shade Guava
when you use it in combination with Hadoop which does not shade Guava
either. I couldn't manage to reproduce it on my cluster setup. The fix
which Alexey suggested, should make it work because it relocates your
Guava classes and thereby avoids a clash with Hadoop's version.

I would still be interested in the output of the code I posted earlier
on your cluster. Just to confirm the suspicion.

Thanks,
Max

On Mon, Dec 5, 2016 at 2:22 PM, Alexey Demin  wrote:
> Hi Max
>
> No, it's main problem in flink runner
> I investigate problem and found than AppClassLoader load this classes:
>
> com.google.common.base.Absent
> com.google.common.base.Function
> com.google.common.base.Optional
> com.google.common.base.Optional$Absent
> com.google.common.base.Optional$Present
>
> But Optional$Absent existed only in old version guava, after commit "Make
> Optional GWT serializable" 3e82ff72b19d4230eba795b2760036eb18dfd4ff guava
> split Optional on 3 different classes and made constructor for Optional
> package private. Optional$Absent and Absent can be loaded together from
> different versions of guava, but can't work.
>
> but because FlinkUserCodeClassLoader have parent on AppClassLoader we have
> problem with guava versions from hadoop classpath in cloudera distributions.
>
> My config:
> FlinkPipelineOptions options =
> PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> options.setRunner(FlinkRunner.class);
> options.setFlinkMaster("");
>
> and execution command: java -jar runner-shaded.jar
>
> Flink running on cloudera yarn.
>
> My small hotfix it's addition configuration shaded plugin for my
> application:
>
> 
> 
> com.google.common
> my.shaded.com.google.common
> 
> 
>
> It remove conflict with AppClassLoader version of guava and alway use
> correct version from my application.
>
> Thanks,
> Alexey Diomin
>
>
> 2016-12-05 14:50 GMT+04:00 Maximilian Michels :
>>
>> Hi Wayne,
>>
>> That seems like you ran that from the IDE. I'm interested how the output
>> would look like on the cluster where you experienced your problems.
>>
>> Thanks,
>> Max
>>
>> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:
>>>
>>> Hi Max,
>>>
>>> Here's the output:
>>> -
>>> Optional:
>>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
>>> Absent:
>>> file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
>>> -
>>>
>>> Thanks for your help!
>>> Wayne
>>>
>>>
>>>
>>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:

 Hi Wayne,

 Thanks for getting back to me. Could you compile a new version of your
 job with the following in your main method?

 URL location1 =

 com.google.common.base.Optional.class.getProtectionDomain().getCodeSource().getLocation();
 System.out.println("Optional: " + location1);
 URL location2 =

 Class.forName("com.google.common.base.Optional").getProtectionDomain().getCodeSource().getLocation();
 System.out.println("Absent: " + location2);


 Could you run this on your cluster node with the flink command? This
 should give us a hint from where the Guava library is bootstrapped.

 Thanks,
 Max


 On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:
>
> Hi Max,
>
> Here is the result from the "flink run" launcher node (devbox):
> ---
> root@devbox:~# echo
>
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :/etc/hadoop-conf:/etc/yarn-conf:
> ---
>
> Here is the result from one of the Cloudera YARN nodes as root:
> ---
> [root@hadoop0 ~]# echo
>
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :::
> ---
>
> Here is the result from one of the Cloudera YARN nodes as yarn:
> ---
> [yarn@hadoop0 ~]$ echo
>
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :::
> ---
>
>
> Note that both the yarn-session.sh and the flink run commands are run
> as
> root on devbox.
>
> Software version details:
> devbox has these versions of the client software:
> flink-1.1.2
> hadoop-2.6.0
> kafka_2.11-0.9.0.1
> (also reproduced the problem with kaf

Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Alexey Demin
Hi Max

No, it's main problem in flink runner
I investigate problem and found than AppClassLoader load this classes:

com.google.common.base.Absent
com.google.common.base.Function
com.google.common.base.Optional
com.google.common.base.Optional$Absent
com.google.common.base.Optional$Present

But Optional$Absent existed only in old version guava, after commit "Make
Optional GWT serializable" 3e82ff72b19d4230eba795b2760036eb18dfd4ff guava
split Optional on 3 different classes and made constructor for Optional
package private. Optional$Absent and Absent can be loaded together from
different versions of guava, but can't work.

but because FlinkUserCodeClassLoader have parent on AppClassLoader we have
problem with guava versions from hadoop classpath in cloudera distributions.

My config:
FlinkPipelineOptions options =
PipelineOptionsFactory.as(FlinkPipelineOptions.class);
options.setRunner(FlinkRunner.class);
options.setFlinkMaster("");

and execution command: java -jar runner-shaded.jar

Flink running on cloudera yarn.

My small hotfix it's addition configuration shaded plugin for my
application:



com.google.common
my.shaded.com.google.common



It remove conflict with AppClassLoader version of guava and alway use
correct version from my application.

Thanks,
Alexey Diomin


2016-12-05 14:50 GMT+04:00 Maximilian Michels :

> Hi Wayne,
>
> That seems like you ran that from the IDE. I'm interested how the output
> would look like on the cluster where you experienced your problems.
>
> Thanks,
> Max
>
> On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:
>
>> Hi Max,
>>
>> Here's the output:
>> -
>> Optional: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
>> get/beam-starter-0.2.jar
>> Absent: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
>> get/beam-starter-0.2.jar
>> -
>>
>> Thanks for your help!
>> Wayne
>>
>>
>>
>> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
>>
>>> Hi Wayne,
>>>
>>> Thanks for getting back to me. Could you compile a new version of your
>>> job with the following in your main method?
>>>
>>> URL location1 =
>>> com.google.common.base.Optional.class.getProtectionDomain().
>>> getCodeSource().getLocation();
>>> System.out.println("Optional: " + location1);
>>> URL location2 =
>>> Class.forName("com.google.common.base.Optional").getProtecti
>>> onDomain().getCodeSource().getLocation();
>>> System.out.println("Absent: " + location2);
>>>
>>>
>>> Could you run this on your cluster node with the flink command? This
>>> should give us a hint from where the Guava library is bootstrapped.
>>>
>>> Thanks,
>>> Max
>>>
>>>
>>> On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:
>>>
 Hi Max,

 Here is the result from the "flink run" launcher node (devbox):
 ---
 root@devbox:~# echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :/etc/hadoop-conf:/etc/yarn-conf:
 ---

 Here is the result from one of the Cloudera YARN nodes as root:
 ---
 [root@hadoop0 ~]# echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :::
 ---

 Here is the result from one of the Cloudera YARN nodes as yarn:
 ---
 [yarn@hadoop0 ~]$ echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}
 :::
 ---


 Note that both the yarn-session.sh and the flink run commands are run as
 root on devbox.

 Software version details:
 devbox has these versions of the client software:
 flink-1.1.2
 hadoop-2.6.0
 kafka_2.11-0.9.0.1
 (also reproduced the problem with kafka_2.10-0.9.0.1)

 The cluster (providing YARN) is:
 CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
 Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)

 Thanks for your help!
 Wayne



 On 2016-12-01 12:54 PM, Maximilian Michels wrote:

 What is the output of the following on the nodes? I have a suspision
 that something sneaks in from one of the classpath variables that
 Flink picks up:

 echo
 ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
 ASE_CONF_DIR}

 On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins 
 wrote:

 Hi Max,

 I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
 still seeing the following error message with the StreamWordCount demo
 code:

 Caused by: java.lang.IllegalAccessError: tried to access method
 com.google.common.base.Optional.()V from class
 com.google.common.base.Absent
  at com.google.common.base.Absent.(Absent.java:35)
  at com.google.common.base.Absent.(Absent.java:33)
  at sun.misc.Unsafe.ensureClassInitialized(Native Method)
 

Re: java.lang.IllegalAccessError with KafkaIO

2016-12-05 Thread Maximilian Michels
Hi Wayne,

That seems like you ran that from the IDE. I'm interested how the output
would look like on the cluster where you experienced your problems.

Thanks,
Max

On Fri, Dec 2, 2016 at 7:27 PM, Wayne Collins  wrote:

> Hi Max,
>
> Here's the output:
> -
> Optional: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
> get/beam-starter-0.2.jar
> Absent: file:/home/wayneco/workspace/beam-starter/beam-starter/./tar
> get/beam-starter-0.2.jar
> -
>
> Thanks for your help!
> Wayne
>
>
>
> On 2016-12-02 08:42 AM, Maximilian Michels wrote:
>
>> Hi Wayne,
>>
>> Thanks for getting back to me. Could you compile a new version of your
>> job with the following in your main method?
>>
>> URL location1 =
>> com.google.common.base.Optional.class.getProtectionDomain().
>> getCodeSource().getLocation();
>> System.out.println("Optional: " + location1);
>> URL location2 =
>> Class.forName("com.google.common.base.Optional").getProtecti
>> onDomain().getCodeSource().getLocation();
>> System.out.println("Absent: " + location2);
>>
>>
>> Could you run this on your cluster node with the flink command? This
>> should give us a hint from where the Guava library is bootstrapped.
>>
>> Thanks,
>> Max
>>
>>
>> On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:
>>
>>> Hi Max,
>>>
>>> Here is the result from the "flink run" launcher node (devbox):
>>> ---
>>> root@devbox:~# echo
>>> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
>>> ASE_CONF_DIR}
>>> :/etc/hadoop-conf:/etc/yarn-conf:
>>> ---
>>>
>>> Here is the result from one of the Cloudera YARN nodes as root:
>>> ---
>>> [root@hadoop0 ~]# echo
>>> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
>>> ASE_CONF_DIR}
>>> :::
>>> ---
>>>
>>> Here is the result from one of the Cloudera YARN nodes as yarn:
>>> ---
>>> [yarn@hadoop0 ~]$ echo
>>> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
>>> ASE_CONF_DIR}
>>> :::
>>> ---
>>>
>>>
>>> Note that both the yarn-session.sh and the flink run commands are run as
>>> root on devbox.
>>>
>>> Software version details:
>>> devbox has these versions of the client software:
>>> flink-1.1.2
>>> hadoop-2.6.0
>>> kafka_2.11-0.9.0.1
>>> (also reproduced the problem with kafka_2.10-0.9.0.1)
>>>
>>> The cluster (providing YARN) is:
>>> CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
>>> Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)
>>>
>>> Thanks for your help!
>>> Wayne
>>>
>>>
>>>
>>> On 2016-12-01 12:54 PM, Maximilian Michels wrote:
>>>
>>> What is the output of the following on the nodes? I have a suspision
>>> that something sneaks in from one of the classpath variables that
>>> Flink picks up:
>>>
>>> echo
>>> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HB
>>> ASE_CONF_DIR}
>>>
>>> On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins  wrote:
>>>
>>> Hi Max,
>>>
>>> I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
>>> still seeing the following error message with the StreamWordCount demo
>>> code:
>>>
>>> Caused by: java.lang.IllegalAccessError: tried to access method
>>> com.google.common.base.Optional.()V from class
>>> com.google.common.base.Absent
>>>  at com.google.common.base.Absent.(Absent.java:35)
>>>  at com.google.common.base.Absent.(Absent.java:33)
>>>  at sun.misc.Unsafe.ensureClassInitialized(Native Method)
>>> ...
>>>
>>>
>>> (snip)
>>>
>>
>


Re: java.lang.IllegalAccessError with KafkaIO

2016-12-02 Thread Wayne Collins

Hi Max,

Here's the output:
-
Optional: 
file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar
Absent: 
file:/home/wayneco/workspace/beam-starter/beam-starter/./target/beam-starter-0.2.jar

-

Thanks for your help!
Wayne



On 2016-12-02 08:42 AM, Maximilian Michels wrote:

Hi Wayne,

Thanks for getting back to me. Could you compile a new version of your
job with the following in your main method?

URL location1 =
com.google.common.base.Optional.class.getProtectionDomain().getCodeSource().getLocation();
System.out.println("Optional: " + location1);
URL location2 =
Class.forName("com.google.common.base.Optional").getProtectionDomain().getCodeSource().getLocation();
System.out.println("Absent: " + location2);


Could you run this on your cluster node with the flink command? This
should give us a hint from where the Guava library is bootstrapped.

Thanks,
Max


On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:

Hi Max,

Here is the result from the "flink run" launcher node (devbox):
---
root@devbox:~# echo
${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:/etc/hadoop-conf:/etc/yarn-conf:
---

Here is the result from one of the Cloudera YARN nodes as root:
---
[root@hadoop0 ~]# echo
${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---

Here is the result from one of the Cloudera YARN nodes as yarn:
---
[yarn@hadoop0 ~]$ echo
${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---


Note that both the yarn-session.sh and the flink run commands are run as
root on devbox.

Software version details:
devbox has these versions of the client software:
flink-1.1.2
hadoop-2.6.0
kafka_2.11-0.9.0.1
(also reproduced the problem with kafka_2.10-0.9.0.1)

The cluster (providing YARN) is:
CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)

Thanks for your help!
Wayne



On 2016-12-01 12:54 PM, Maximilian Michels wrote:

What is the output of the following on the nodes? I have a suspicion
that something sneaks in from one of the classpath variables that
Flink picks up:

echo
${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}

On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins  wrote:

Hi Max,

I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
still seeing the following error message with the StreamWordCount demo code:

Caused by: java.lang.IllegalAccessError: tried to access method
com.google.common.base.Optional.<init>()V from class
com.google.common.base.Absent
 at com.google.common.base.Absent.<init>(Absent.java:35)
 at com.google.common.base.Absent.<clinit>(Absent.java:33)
 at sun.misc.Unsafe.ensureClassInitialized(Native Method)
...


(snip)




Re: java.lang.IllegalAccessError with KafkaIO

2016-12-02 Thread Maximilian Michels
Hi Wayne,

Thanks for getting back to me. Could you compile a new version of your
job with the following in your main method?

URL location1 =
com.google.common.base.Optional.class.getProtectionDomain().getCodeSource().getLocation();
System.out.println("Optional: " + location1);
URL location2 =
Class.forName("com.google.common.base.Absent").getProtectionDomain().getCodeSource().getLocation();
System.out.println("Absent: " + location2);


Could you run this on your cluster node with the flink command? This
should give us a hint from where the Guava library is bootstrapped.

Thanks,
Max


On Thu, Dec 1, 2016 at 7:54 PM, Wayne Collins  wrote:
> Hi Max,
>
> Here is the result from the "flink run" launcher node (devbox):
> ---
> root@devbox:~# echo
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :/etc/hadoop-conf:/etc/yarn-conf:
> ---
>
> Here is the result from one of the Cloudera YARN nodes as root:
> ---
> [root@hadoop0 ~]# echo
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :::
> ---
>
> Here is the result from one of the Cloudera YARN nodes as yarn:
> ---
> [yarn@hadoop0 ~]$ echo
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
> :::
> ---
>
>
> Note that both the yarn-session.sh and the flink run commands are run as
> root on devbox.
>
> Software version details:
> devbox has these versions of the client software:
> flink-1.1.2
> hadoop-2.6.0
> kafka_2.11-0.9.0.1
> (also reproduced the problem with kafka_2.10-0.9.0.1)
>
> The cluster (providing YARN) is:
> CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
> Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)
>
> Thanks for your help!
> Wayne
>
>
>
> On 2016-12-01 12:54 PM, Maximilian Michels wrote:
>
> What is the output of the following on the nodes? I have a suspicion
> that something sneaks in from one of the classpath variables that
> Flink picks up:
>
> echo
> ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
>
> On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins  wrote:
>
> Hi Max,
>
> I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
> still seeing the following error message with the StreamWordCount demo code:
>
> Caused by: java.lang.IllegalAccessError: tried to access method
> com.google.common.base.Optional.<init>()V from class
> com.google.common.base.Absent
> at com.google.common.base.Absent.<init>(Absent.java:35)
> at com.google.common.base.Absent.<clinit>(Absent.java:33)
> at sun.misc.Unsafe.ensureClassInitialized(Native Method)
> ...
>
>
> (snip)


Re: java.lang.IllegalAccessError with KafkaIO

2016-12-01 Thread Wayne Collins

Hi Max,

Here is the result from the "flink run" launcher node (devbox):
---
root@devbox:~# echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:/etc/hadoop-conf:/etc/yarn-conf:
---

Here is the result from one of the Cloudera YARN nodes as root:
---
[root@hadoop0 ~]# echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---

Here is the result from one of the Cloudera YARN nodes as yarn:
---
[yarn@hadoop0 ~]$ echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}
:::
---


Note that both the yarn-session.sh and the flink run commands are run as 
root on devbox.


Software version details:
devbox has these versions of the client software:
flink-1.1.2
hadoop-2.6.0
kafka_2.11-0.9.0.1
(also reproduced the problem with kafka_2.10-0.9.0.1)

The cluster (providing YARN) is:
CDH5 - 5.8.2-1.cdh5.8.2.p0.3 (Hadoop 2.6.0)
Kafka - 2.0.2-1.2.0.2.p0.5 (Kafka 0.9.0)

Thanks for your help!
Wayne



On 2016-12-01 12:54 PM, Maximilian Michels wrote:

What is the output of the following on the nodes? I have a suspicion
that something sneaks in from one of the classpath variables that
Flink picks up:

echo ${HADOOP_CLASSPATH}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}:${HBASE_CONF_DIR}

On Tue, Nov 29, 2016 at 9:17 PM, Wayne Collins  wrote:

Hi Max,

I rebuilt my sandbox with Beam 0.3.0-incubating and Flink 1.1.2 and I'm
still seeing the following error message with the StreamWordCount demo code:

Caused by: java.lang.IllegalAccessError: tried to access method
com.google.common.base.Optional.<init>()V from class
com.google.common.base.Absent
 at com.google.common.base.Absent.<init>(Absent.java:35)
 at com.google.common.base.Absent.<clinit>(Absent.java:33)
 at sun.misc.Unsafe.ensureClassInitialized(Native Method)
...



(snip)


Re: java.lang.IllegalAccessError with KafkaIO

2016-12-01 Thread Maximilian Michels
+- net.jcip:jcip-annotations:jar:1.0:compile
> [INFO]   \- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted
> for conflict with 1.3.9)
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 2.945 s
> [INFO] Finished at: 2016-11-29T14:55:24-05:00
> [INFO] Final Memory: 18M/194M
> [INFO] ------------------------------------------------------------------------
>
> -
>
>
>
>
> On 2016-11-28 02:09 PM, Wayne Collins wrote:
>>
>> Hi Max,
>>
>> I pulled the binaries from the website but it looks like I pulled the
>> wrong version for 0.3.0 (Flink 1.1.3 instead of 1.1.2).
>> The message was the same for all combinations in testing my code but I
>> didn't try all the combinations with Emanuele's demo code in the sandbox.
>>
>> I'll rebuild my sandbox environment with just Beam 0.3.0 and Flink 1.1.2
>> and post the message and associated maven dependency tree.
>>
>> Thanks for your input!
>> Wayne
>>
>>
>> On 2016-11-28 01:35 PM, Maximilian Michels wrote:
>>>
>>> Hi Wayne,
>>>
>>> That seems like conflicting Guava versions in the classpath. Did you
>>> use Flink binaries from the website or did you compile Flink yourself?
>>> Is the error message the same across all Flink Beam combinations? If
>>> you use 0.2.0 the Flink version has to be 1.0.3. If you use 0.3.0, you
>>> will have to run it on Flink 1.1.2 because these are the Flink
>>> versions which the Beam releases are built upon. This will also be
>>> part of the upcoming documentation.
>>>
>>> For displaying conflicting versions, `mvn dependency:tree -Dverbose`
>>> provides a way to display conflicting versions of dependencies. It
>>> seems like Guava 18 and 19 are both used as dependencies which could
>>> result in a conflict (if shading is not set up properly). That's just
>>> a quick assessment but I would like to look further into the issue.
>>>
>>>
>>> -Max
>>>
>>>
>>> On Fri, Nov 25, 2016 at 4:48 PM, Wayne Collins  wrote:
>>>>
>>>> Hi Stephen,
>>>>
>>>> Thanks for the suggestions!
>>>> I had come to the same conclusion but haven't been able to shade or set
>>>> "provides" to work around it.
>>>> The "mvn dependency:tree" looks good to me...
>>>>
>>>> Can anyone share a pom dependencies/exclusions fragment that is working
>>>> for
>>>> them with KafkaIO on a Flink/Yarn cluster?
>>>>
>>>> Thanks,
>>>> Wayne
>>>>
>>>>
>>>> Subject:
>>>> Re: java.lang.IllegalAccessError with KafkaIO
>>>> From:
>>>> Stephan Ewen 
>>>> Date:
>>>> 2016-11-24 02:38 PM
>>>> To:
>>>> user@beam.incubator.apache.org, d...@beam.incubator.apache.org
>>>>
>>>> I think I have seen this kind of error before.
>>>> It is the JVM verifying, upon class loading (lazy linking), that a
>>>> private/protected access is valid.
>>>>
>>>> It could be that this violation is because there are multiple versions
>>>> of
>>>> that class loaded - some in the SystemClassLoader and some in the user
>>>> library jar file.
>>>>
>>>> Fixing this probably needs some dependency cleanup, for example properly
>>>> setting dependencies as "provided" that should not be packaged again
>>>> into
>>>> the user code jar. Also, reducing visible dependencies by shading them
>>>> away
>>>> helps.
>>>>
>>>>
>>>> (snip)
>>
>>
>


Re: java.lang.IllegalAccessError with KafkaIO

2016-11-29 Thread Wayne Collins
[INFO] | |  |  \- (org.slf4j:slf4j-api:jar:1.7.2:compile - omitted for conflict with 1.7.7)
[INFO] | |  +- (org.scala-lang:scala-library:jar:2.10.4:compile - omitted for duplicate)
[INFO] | |  +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate)
[INFO] | |  \- com.101tec:zkclient:jar:0.3:compile
[INFO] | | \- (org.apache.zookeeper:zookeeper:jar:3.3.1:compile - omitted for conflict with 3.4.6)
[INFO] | +- (com.google.code.findbugs:jsr305:jar:1.3.9:compile - omitted for duplicate)
[INFO] | +- (org.apache.commons:commons-lang3:jar:3.3.2:compile - omitted for duplicate)
[INFO] | +- (org.slf4j:slf4j-api:jar:1.7.7:compile - omitted for duplicate)
[INFO] | +- (org.slf4j:slf4j-log4j12:jar:1.7.7:compile - omitted for duplicate)
[INFO] | +- (log4j:log4j:jar:1.2.17:compile - omitted for duplicate)
[INFO] | \- (org.apache.flink:force-shading:jar:1.1.2:compile - omitted for duplicate)
[INFO] \- org.apache.beam:beam-sdks-java-io-kafka:jar:0.3.0-incubating:compile
[INFO]    +- (org.apache.beam:beam-sdks-java-core:jar:0.3.0-incubating:compile - omitted for duplicate)
[INFO]    +- org.apache.kafka:kafka-clients:jar:0.9.0.1:compile
[INFO]    |  +- (org.slf4j:slf4j-api:jar:1.7.6:compile - omitted for conflict with 1.7.7)
[INFO]    |  +- (org.xerial.snappy:snappy-java:jar:1.1.1.7:compile - omitted for conflict with 1.1.2.1)
[INFO]    |  \- net.jpountz.lz4:lz4:jar:1.2.0:compile
[INFO]    +- org.slf4j:slf4j-api:jar:1.7.14:compile
[INFO]    +- joda-time:joda-time:jar:2.4:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-annotations:jar:2.7.2:compile
[INFO]    +- com.google.guava:guava:jar:19.0:compile
[INFO]    \- com.google.code.findbugs:annotations:jar:3.0.1:compile
[INFO]       +- net.jcip:jcip-annotations:jar:1.0:compile
[INFO]       \- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for conflict with 1.3.9)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.945 s
[INFO] Finished at: 2016-11-29T14:55:24-05:00
[INFO] Final Memory: 18M/194M
[INFO] ------------------------------------------------------------------------

-



On 2016-11-28 02:09 PM, Wayne Collins wrote:

Hi Max,

I pulled the binaries from the website but it looks like I pulled the 
wrong version for 0.3.0 (Flink 1.1.3 instead of 1.1.2).
The message was the same for all combinations in testing my code but I 
didn't try all the combinations with Emanuele's demo code in the sandbox.


I'll rebuild my sandbox environment with just Beam 0.3.0 and Flink 
1.1.2 and post the message and associated maven dependency tree.


Thanks for your input!
Wayne


On 2016-11-28 01:35 PM, Maximilian Michels wrote:

Hi Wayne,

That seems like conflicting Guava versions in the classpath. Did you
use Flink binaries from the website or did you compile Flink yourself?
Is the error message the same across all Flink Beam combinations? If
you use 0.2.0 the Flink version has to be 1.0.3. If you use 0.3.0, you
will have to run it on Flink 1.1.2 because these are the Flink
versions which the Beam releases are built upon. This will also be
part of the upcoming documentation.

For displaying conflicting versions, `mvn dependency:tree -Dverbose`
provides a way to display conflicting versions of dependencies. It
seems like Guava 18 and 19 are both used as dependencies which could
result in a conflict (if shading is not set up properly). That's just
a quick assessment but I would like to look further into the issue.


-Max


On Fri, Nov 25, 2016 at 4:48 PM, Wayne Collins  wrote:

Hi Stephen,

Thanks for the suggestions!
I had come to the same conclusion but haven't been able to shade or set
"provides" to work around it.
The "mvn dependency:tree" looks good to me...

Can anyone share a pom dependencies/exclusions fragment that is working for
them with KafkaIO on a Flink/Yarn cluster?

Thanks,
Wayne


Subject:
Re: java.lang.IllegalAccessError with KafkaIO
From:
Stephan Ewen 
Date:
2016-11-24 02:38 PM
To:
user@beam.incubator.apache.org, d...@beam.incubator.apache.org

I think I have seen this kind of error before.
It is the JVM verifying, upon class loading (lazy linking), that a
private/protected access is valid.

It could be that this violation is because there are multiple versions of
that class loaded - some in the SystemClassLoader and some in the user
library jar file.

Fixing this probably needs some dependency cleanup, for example properly
setting dependencies as "provided" that should not be packaged again into
the user code jar. Also, reducing visible dependencies by shading them away
helps.


(snip)






Re: java.lang.IllegalAccessError with KafkaIO

2016-11-28 Thread Wayne Collins

Hi Max,

I pulled the binaries from the website but it looks like I pulled the 
wrong version for 0.3.0 (Flink 1.1.3 instead of 1.1.2).
The message was the same for all combinations in testing my code but I 
didn't try all the combinations with Emanuele's demo code in the sandbox.


I'll rebuild my sandbox environment with just Beam 0.3.0 and Flink 1.1.2 
and post the message and associated maven dependency tree.


Thanks for your input!
Wayne


On 2016-11-28 01:35 PM, Maximilian Michels wrote:

Hi Wayne,

That seems like conflicting Guava versions in the classpath. Did you
use Flink binaries from the website or did you compile Flink yourself?
Is the error message the same across all Flink Beam combinations? If
you use 0.2.0 the Flink version has to be 1.0.3. If you use 0.3.0, you
will have to run it on Flink 1.1.2 because these are the Flink
versions which the Beam releases are built upon. This will also be
part of the upcoming documentation.

For displaying conflicting versions, `mvn dependency:tree -Dverbose`
provides a way to display conflicting versions of dependencies. It
seems like Guava 18 and 19 are both used as dependencies which could
result in a conflict (if shading is not set up properly). That's just
a quick assessment but I would like to look further into the issue.


-Max


On Fri, Nov 25, 2016 at 4:48 PM, Wayne Collins  wrote:

Hi Stephen,

Thanks for the suggestions!
I had come to the same conclusion but haven't been able to shade or set
"provides" to work around it.
The "mvn dependency:tree" looks good to me...

Can anyone share a pom dependencies/exclusions fragment that is working for
them with KafkaIO on a Flink/Yarn cluster?

Thanks,
Wayne


Subject:
Re: java.lang.IllegalAccessError with KafkaIO
From:
Stephan Ewen 
Date:
2016-11-24 02:38 PM
To:
user@beam.incubator.apache.org, d...@beam.incubator.apache.org

I think I have seen this kind of error before.
It is the JVM verifying, upon class loading (lazy linking), that a
private/protected access is valid.

It could be that this violation is because there are multiple versions of
that class loaded - some in the SystemClassLoader and some in the user
library jar file.

Fixing this probably needs some dependency cleanup, for example properly
setting dependencies as "provided" that should not be packaged again into
the user code jar. Also, reducing visible dependencies by shading them away
helps.


(snip)




Re: java.lang.IllegalAccessError with KafkaIO

2016-11-28 Thread Maximilian Michels
Hi Wayne,

That seems like conflicting Guava versions in the classpath. Did you
use Flink binaries from the website or did you compile Flink yourself?
Is the error message the same across all Flink Beam combinations? If
you use 0.2.0 the Flink version has to be 1.0.3. If you use 0.3.0, you
will have to run it on Flink 1.1.2 because these are the Flink
versions which the Beam releases are built upon. This will also be
part of the upcoming documentation.

For displaying conflicting versions, `mvn dependency:tree -Dverbose`
provides a way to display conflicting versions of dependencies. It
seems like Guava 18 and 19 are both used as dependencies which could
result in a conflict (if shading is not set up properly). That's just
a quick assessment but I would like to look further into the issue.


-Max


On Fri, Nov 25, 2016 at 4:48 PM, Wayne Collins  wrote:
> Hi Stephen,
>
> Thanks for the suggestions!
> I had come to the same conclusion but haven't been able to shade or set
> "provides" to work around it.
> The "mvn dependency:tree" looks good to me...
>
> Can anyone share a pom dependencies/exclusions fragment that is working for
> them with KafkaIO on a Flink/Yarn cluster?
>
> Thanks,
> Wayne
>
>
> Subject:
> Re: java.lang.IllegalAccessError with KafkaIO
> From:
> Stephan Ewen 
> Date:
> 2016-11-24 02:38 PM
> To:
> user@beam.incubator.apache.org, d...@beam.incubator.apache.org
>
> I think I have seen this kind of error before.
> It is the JVM verifying, upon class loading (lazy linking), that a
> private/protected access is valid.
>
> It could be that this violation is because there are multiple versions of
> that class loaded - some in the SystemClassLoader and some in the user
> library jar file.
>
> Fixing this probably needs some dependency cleanup, for example properly
> setting dependencies as "provided" that should not be packaged again into
> the user code jar. Also, reducing visible dependencies by shading them away
> helps.
>
>
> (snip)


Re: java.lang.IllegalAccessError with KafkaIO

2016-11-25 Thread Wayne Collins

Hi Stephen,

Thanks for the suggestions!
I had come to the same conclusion but haven't been able to shade or set 
"provides" to work around it.

The "mvn dependency:tree" looks good to me...

Can anyone share a pom dependencies/exclusions fragment that is working 
for them with KafkaIO on a Flink/Yarn cluster?


Thanks,
Wayne
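For illustration of what such a fragment could look like, here is a hedged sketch that excludes the transitive Guava from a Flink dependency so that only one Guava version ends up in the job jar. The artifact coordinates come from the versions mentioned in this thread, but this exact exclusion was not confirmed as the fix (the thread ultimately resolved the clash by shading):

```xml
<!-- Sketch only: exclude the Guava pulled in transitively so a single
     version (the one the job declares) is packaged. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.10</artifactId>
  <version>1.1.2</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```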



Subject:
Re: java.lang.IllegalAccessError with KafkaIO
From:
Stephan Ewen 
Date:
2016-11-24 02:38 PM

To:
user@beam.incubator.apache.org, d...@beam.incubator.apache.org


I think I have seen this kind of error before.
It is the JVM verifying, upon class loading (lazy linking), that a
private/protected access is valid.


It could be that this violation is because there are multiple versions 
of that class loaded - some in the SystemClassLoader and some in the 
user library jar file.


Fixing this probably needs some dependency cleanup, for example 
properly setting dependencies as "provided" that should not be 
packaged again into the user code jar. Also, reducing visible 
dependencies by shading them away helps.




(snip)



Re: java.lang.IllegalAccessError with KafkaIO

2016-11-24 Thread Stephan Ewen
I think I have seen this kind of error before.
It is the JVM verifying, upon class loading (lazy linking), that a
private/protected access is valid.

It could be that this violation is because there are multiple versions of
that class loaded - some in the SystemClassLoader and some in the user
library jar file.

Fixing this probably needs some dependency cleanup, for example properly
setting dependencies as "provided" that should not be packaged again into
the user code jar. Also, reducing visible dependencies by shading them away
helps.
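The shading approach described above can be sketched with the maven-shade-plugin's relocation feature, which rewrites Guava's package inside the job jar so it cannot clash with the Guava already on the cluster classpath. The plugin version and the relocated package prefix below are illustrative assumptions, not taken from this thread:

```xml
<!-- Sketch: relocate Guava inside the shaded job jar. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```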



On Thu, Nov 24, 2016 at 8:18 PM, Wayne Collins  wrote:

> Hi,
>
> I have run into an issue with launching Beam applications that use KafkaIO
> on a flink cluster:
> "java.lang.IllegalAccessError: tried to access method
> com.google.common.base.Optional.<init>()V from class
> com.google.common.base.Absent"
> (full output, and pom.xml below)
>
> Other Beam applications launch correctly and previous versions that used
> the FlinkKafkaConsumer also worked correctly.
> Running directly from Eclipse works fine.
>
> I've reproduced the error using the code from Emanuele Cesena's excellent
> Beam/Flink demo: https://github.com/ecesena/beam-starter.
> His WordCount example runs correctly but StreamWordCount fails.
> This occurs with any combination of Beam 0.1.0, 0.2.0, and
> 0.3.0-incubating with Flink 1.0.3 and 1.1.3
>
> At first glance it looks like something needs to be shaded but so far no
> joy on that front.
> Can anyone point out what I've missed?
>
> Thanks,
> Wayne
>
>
> Here's the output from attempting to run the Beam app:
>
> ---
> # flink run -c com.dataradiant.beam.examples.StreamWordCount
> ./target/beam-starter-0.2.jar
> Found YARN properties file /tmp/.yarn-properties-root
> Using JobManager address from YARN properties
> hadoop1.dades.ca/192.168.124.101:39432
> 11/24/2016 13:56:36 Job execution switched to status RUNNING.
> 11/24/2016 13:56:36 Source: Read(UnboundedKafkaSource) ->
> AnonymousParDo -> AnonymousParDo -> Flat Map -> ParDo(ExtractWords) ->
> AnonymousParDo(1/1) switched to SCHEDULED
> 11/24/2016 13:56:36 Source: Read(UnboundedKafkaSource) ->
> AnonymousParDo -> AnonymousParDo -> Flat Map -> ParDo(ExtractWords) ->
> AnonymousParDo(1/1) switched to DEPLOYING
> 11/24/2016 13:56:36 GroupByWindowWithCombiner -> AnonymousParDo(1/1)
> switched to SCHEDULED
> 11/24/2016 13:56:36 GroupByWindowWithCombiner -> AnonymousParDo(1/1)
> switched to DEPLOYING
> 11/24/2016 13:56:36 GroupByWindowWithCombiner -> AnonymousParDo(1/1)
> switched to RUNNING
> 11/24/2016 13:56:36 Source: Read(UnboundedKafkaSource) ->
> AnonymousParDo -> AnonymousParDo -> Flat Map -> ParDo(ExtractWords) ->
> AnonymousParDo(1/1) switched to RUNNING
> 11/24/2016 13:56:37 Source: Read(UnboundedKafkaSource) ->
> AnonymousParDo -> AnonymousParDo -> Flat Map -> ParDo(ExtractWords) ->
> AnonymousParDo(1/1) switched to FAILED
> java.lang.IllegalAccessError: tried to access method
> com.google.common.base.Optional.<init>()V from class
> com.google.common.base.Absent
> at com.google.common.base.Absent.<init>(Absent.java:35)
> at com.google.common.base.Absent.<clinit>(Absent.java:33)
> at sun.misc.Unsafe.ensureClassInitialized(Native Method)
> at sun.reflect.UnsafeFieldAccessorFactory.newFieldAccessor(Unsa
> feFieldAccessorFactory.java:43)
> at sun.reflect.ReflectionFactory.newFieldAccessor(ReflectionFac
> tory.java:140)
> at java.lang.reflect.Field.acquireFieldAccessor(Field.java:1057)
> at java.lang.reflect.Field.getFieldAccessor(Field.java:1038)
> at java.lang.reflect.Field.getLong(Field.java:591)
> at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.
> java:1663)
> at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.ObjectStreamClass.(ObjectStreamClass.java:468)
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
> at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.
> java:602)
> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream
> .java:1622)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.
> java:1517)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
> am.java:1771)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java
> :1350)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStrea
> m.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.
> java:1915)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
> am.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java
> :1350)
> a