[
https://issues.apache.org/jira/browse/BEAM-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650811#comment-16650811
]
Luke Cwik commented on BEAM-5710:
---------------------------------
There were a large number of dependency updates from 2.6.0 to 2.7.0 related to
gRPC and its transitive dependencies, including Netty. It's likely the 2.7.0
worker shaded these dependencies incorrectly, and that is why you're hitting
this problem.
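
One way to check which copy of these classes a job actually sees is to print the
jar each class is loaded from. The following is only a minimal sketch, not
something taken from the Beam or Dataflow code base: the class name ClassOrigin
and the helper locate are hypothetical, and the three class names are simply the
ones that appear in the stack trace quoted below. A worker that relocates shaded
classes under a different package would need those relocated names instead. The
classes are loaded with initialization disabled, so the static initializer of
OpenSsl (the frame that leads to the crash) should not run.

```java
import java.security.CodeSource;

final class ClassOrigin {
    /** Returns where the named class was loaded from, without running its static initializer. */
    static String locate(String className) {
        try {
            // initialize=false: load the class but do not run <clinit>.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable)";
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace in the issue description.
        String[] names = {
            "io.netty.handler.ssl.OpenSsl",
            "io.netty.internal.tcnative.Library",
            "io.grpc.netty.GrpcSslContexts",
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the netty and tcnative classes resolve to different jars than expected (for
example a worker jar rather than the job's own dependencies), that would be
consistent with a shading or classpath conflict, though it would not prove it.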
> Compatibility issues with netty 4.1.28 + tcnative 2.0.12 in beam 2.7.0 on
> dataflow
> ----------------------------------------------------------------------------------
>
> Key: BEAM-5710
> URL: https://issues.apache.org/jira/browse/BEAM-5710
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.7.0
> Reporter: Steve Niemitz
> Assignee: Boyuan Zhang
> Priority: Major
>
> I have a beam job that runs in dataflow. The job uses BigtableIO to read
> from bigtable. Transitively, it pulls in netty 4.1.28 and netty-tcnative
> 2.0.12. This has worked fine in the past (beam 2.4.0 and 2.6.0), however in
> beam 2.7.0, the job now crashes the JVM when BigtableIO attempts to
> initialize GCP, which attempts to initialize netty-tcnative.
> I found a very similar bug reported to netty here:
> [https://github.com/netty/netty/issues/8337]
> Also, the problem seems to go away if I downgrade our tcnative version to
> 2.0.10.
>
> The error is a segfault attempting to call aprMajorVersion() in tcnative:
> {code:java}
> Stack: [0x00007fda4b8f2000,0x00007fda4b9f3000], sp=0x00007fda4b9efd78, free space=1015k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C 0x00007fda4aac1db0
> j io.netty.internal.tcnative.Library.initialize(Ljava/lang/String;Ljava/lang/String;)Z+31
> j io.netty.handler.ssl.OpenSsl.initializeTcNative(Ljava/lang/String;)Z+3
> j io.netty.handler.ssl.OpenSsl.<clinit>()V+225
> v ~StubRoutines::call_stub
> V [libjvm.so+0x672446] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056
> V [libjvm.so+0x624827] InstanceKlass::call_class_initializer_impl(instanceKlassHandle, Thread*)+0xd7
> V [libjvm.so+0x626e3c] InstanceKlass::initialize_impl(instanceKlassHandle, Thread*)+0x1ac
> V [libjvm.so+0x627201] InstanceKlass::initialize(Thread*)+0x41
> V [libjvm.so+0x7ad066] LinkResolver::resolve_static_call(CallInfo&, KlassHandle&, Symbol*, Symbol*, KlassHandle, bool, bool, Thread*)+0x246
> V [libjvm.so+0x7ad2ef] LinkResolver::resolve_invokestatic(CallInfo&, constantPoolHandle, int, Thread*)+0x23f
> V [libjvm.so+0x7ae3a1] LinkResolver::resolve_invoke(CallInfo&, Handle, constantPoolHandle, int, Bytecodes::Code, Thread*)+0x4f1
> V [libjvm.so+0x66bf72] InterpreterRuntime::resolve_invoke(JavaThread*, Bytecodes::Code)+0x1b2
> j io.grpc.netty.GrpcSslContexts.defaultSslProvider()Lio/netty/handler/ssl/SslProvider;+0
> {code}
> Additionally, if I run my job using the beam 2.6.0 container
> (--workerHarnessContainerImage=dataflow.gcr.io/v1beta3/beam-java-batch:beam-2.6.0),
> it also succeeds.