Steve Niemitz created BEAM-6249:
-----------------------------------

             Summary: Vendored gRPC doesn't seem to work with dataflow
                 Key: BEAM-6249
                 URL: https://issues.apache.org/jira/browse/BEAM-6249
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.9.0
            Reporter: Steve Niemitz
            Assignee: Tyler Akidau


I attempted to migrate an existing pipeline that worked in 2.8.0 to 2.9.0.
The pipeline uses the experimental streaming engine
(--experiments=enable_streaming_engine).

The pipeline fails to start with these logs:
{code:java}
D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64', trying other loading mechanism.
D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64 cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp
D  Unable to load the library '/tmp/liborg_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_646918605450681921540.so', trying other loading mechanism.
D  Unable to load the library 'netty_tcnative_linux_x86_64', trying next name...
D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora', trying other loading mechanism.
D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp
D  Unable to load the library 'netty_tcnative_linux_x86_64_fedora', trying next name...
D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64', trying other loading mechanism.
D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64 cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp
D  Unable to load the library 'netty_tcnative_x86_64', trying next name...
D  Unable to load the library 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative', trying other loading mechanism.
D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative cannot be loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp
D  Unable to load the library 'netty_tcnative', trying next name...
D  Failed to load netty-tcnative; OpenSslEngine will be unavailable, unless the application has already loaded the symbols by some other means. See http://netty.io/wiki/forked-tomcat-native.html for more information.
D  Failed to initialize netty-tcnative; OpenSslEngine will be unavailable. See http://netty.io/wiki/forked-tomcat-native.html for more information.
I  netty-tcnative unavailable (this may be normal)
I  Conscrypt not found (this may be normal)
I  Jetty ALPN unavailable (this may be normal)
E  Uncaught exception in main thread. Exiting with status code 1.
W  Please use a logger instead of System.out or System.err.
Please switch to using org.slf4j.Logger.
See: https://cloud.google.com/dataflow/pipelines/logging
E  Uncaught exception in main thread. Exiting with status code 1.
E  java.lang.IllegalStateException: Could not find TLS ALPN provider; no working netty-tcnative, Conscrypt, or Jetty NPN/ALPN available
E       at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.defaultSslProvider(GrpcSslContexts.java:256)
E       at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:171)
E       at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:120)
E       at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.remoteChannel(GrpcWindmillServer.java:343)
E       at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.initializeWindmillService(GrpcWindmillServer.java:312)
{code}
 

The interesting part is the netty load failure. Its stack trace is:
{code:java}
exception: "java.lang.UnsatisfiedLinkError
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:276)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:187)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadFirstAvailable(NativeLibraryLoader.java:85)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.handler.ssl.OpenSsl.loadTcNative(OpenSsl.java:430)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.handler.ssl.OpenSsl.<clinit>(OpenSsl.java:97)
 at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.defaultSslProvider(GrpcSslContexts.java:242)
 at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:171)
 at org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:120)
 at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.remoteChannel(GrpcWindmillServer.java:343)
 at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.initializeWindmillService(GrpcWindmillServer.java:312)
 at org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.setWindmillServiceEndpoints(GrpcWindmillServer.java:192)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getConfigFromDataflowService(StreamingDataflowWorker.java:1528)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getConfig(StreamingDataflowWorker.java:1583)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.getGlobalConfig(StreamingDataflowWorker.java:1568)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.schedulePeriodicGlobalConfigRequests(StreamingDataflowWorker.java:1543)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.start(StreamingDataflowWorker.java:704)
 at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.main(StreamingDataflowWorker.java:228)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
 at java.security.AccessController.doPrivileged(Native Method)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
 ... 17 more
Caused by: java.lang.NoClassDefFoundError: org/apache/beam/vendor/grpc/v1/13/1/io/netty/internal/tcnative/Library
 at java.lang.ClassLoader$NativeLibrary.load(Native Method)
 at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
 at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
 at java.lang.Runtime.load0(Runtime.java:809)
 at java.lang.System.load(System.java:1086)
 at org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
 ... 24 more
Caused by: java.lang.ClassNotFoundException: org.apache.beam.vendor.grpc.v1.13.1.io.netty.internal.tcnative.Library
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 30 more
{code}
 

Notice that the class the loader tries to find is

org.apache.beam.vendor.grpc.v1.13.1.io.netty.internal.tcnative.Library, but
the jar actually defines it as
org.apache.beam.vendor.grpc.v1_13_1.io.netty.internal.tcnative.Library.

I traced this back to the jni interop code in tcnative:

[https://github.com/netty/netty-tcnative/blob/master/openssl-dynamic/src/main/c/jnilib.c#L266]

That code replaces every _ in the package prefix with /, which breaks here
because the vendored package component v1_13_1 itself contains underscores.
The fix would seem to be to repackage the vendored gRPC under a prefix that
doesn't contain underscores.
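
To make the mismatch concrete, here is a minimal Java sketch of the substitution described above. It only imitates the behaviour as I understand it from reading jnilib.c; it is not the actual tcnative code. The class name PrefixManglingDemo, the method derivedClassName, and the underscore-free prefix v1p13p1 are all made up for illustration.

```java
// Sketch only: imitates the '_' -> '/' substitution described above.
// This is NOT the real C code from netty-tcnative's jnilib.c.
final class PrefixManglingDemo {
    // Class that the tcnative JNI code looks up behind the package prefix.
    private static final String TCNATIVE_LIBRARY = "io/netty/internal/tcnative/Library";

    // Derives the JNI class name from a library-name prefix by replacing every '_' with '/'.
    static String derivedClassName(String libraryPrefix) {
        return libraryPrefix.replace('_', '/') + TCNATIVE_LIBRARY;
    }

    public static void main(String[] args) {
        // Prefix taken from the library names in the log above.
        String shaded = "org_apache_beam_vendor_grpc_v1_13_1_";
        // Name the class really has inside the vendored jar.
        String packaged = "org/apache/beam/vendor/grpc/v1_13_1/" + TCNATIVE_LIBRARY;

        String derived = derivedClassName(shaded);
        System.out.println(derived);
        // prints org/apache/beam/vendor/grpc/v1/13/1/io/netty/internal/tcnative/Library
        System.out.println("matches packaged name: " + derived.equals(packaged));
        // prints matches packaged name: false

        // Hypothetical prefix whose version component contains no underscores.
        String hypothetical = "org_apache_beam_vendor_grpc_v1p13p1_";
        System.out.println(derivedClassName(hypothetical));
        // prints org/apache/beam/vendor/grpc/v1p13p1/io/netty/internal/tcnative/Library
    }
}
```

The first derived name is the one reported in the NoClassDefFoundError above. With an underscore-free version component, the derived name would line up with the relocated package, assuming the relocation uses the same spelling.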

I'm curious how this ever worked though?  Maybe the streaming engine is the 
only thing using this vendored gRPC code?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
