Adrian Muraru created SPARK-27610:
-------------------------------------

             Summary: Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL
                 Key: SPARK-27610
                 URL: https://issues.apache.org/jira/browse/SPARK-27610
             Project: Spark
          Issue Type: Improvement
          Components: Shuffle
    Affects Versions: 2.4.2
            Reporter: Adrian Muraru

Enabling Netty epoll mode in the YARN shuffle service ({{spark.shuffle.io.mode=EPOLL}}) causes the NodeManager (NM) to abort on startup.

Checking the stack trace, it seems that while the {{io.netty}} package is shaded, the native libraries provided by {{netty-all}} are not, so the shaded loader asks for a prefixed library name that cannot be found:

{noformat}
Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so
{noformat}

*Full stack trace:*

{noformat}
2019-04-24 23:14:46,372 ERROR [main] nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(639)) - Error starting NodeManager
java.lang.UnsatisfiedLinkError: failed to load the required native library
    at org.spark_project.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoop.<clinit>(EpollEventLoop.java:55)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
    at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
    at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
    at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
    at org.spark_project.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
    at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
    at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:52)
    at org.apache.spark.network.server.TransportServer.init(TransportServer.java:95)
    at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:75)
    at org.apache.spark.network.TransportContext.createServer(TransportContext.java:108)
    at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:186)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:147)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:268)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
Caused by: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll_x86_64
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
    at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
    at org.spark_project.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
    at org.spark_project.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
    ... 26 more
Suppressed: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
    at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:210)
    ... 28 more
Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll.so
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
    ... 29 more
Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
    ... 29 more
Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
    ... 30 more
Caused by: java.io.FileNotFoundException: META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:161)
    ... 29 more
Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll_x86_64 in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:124)
    ... 29 more
Suppressed: java.lang.UnsatisfiedLinkError: no org_spark_project_netty_transport_native_epoll_x86_64 in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at org.spark_project.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
    at org.spark_project.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
{noformat}
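
One way to check the suspected mismatch between the shaded package and the bundled native library is to list the {{.so}} entries in the shuffle service jar and compare them with the name the shaded loader requests. The following is a minimal sketch, not part of Spark: the function names are hypothetical, the jar path is supplied by the user, and the mangling rule (shade prefix with dots replaced by underscores, prepended to the library name) is an assumption inferred from the names in the stack trace.

```python
import sys
import zipfile

NATIVE_DIR = "META-INF/native/"
# Library base name as it appears in the stack trace, without the shade prefix.
EPOLL_LIB = "netty_transport_native_epoll_x86_64"


def shaded_native_name(shade_prefix, lib=EPOLL_LIB):
    """Return the resource path a relocated loader would request.

    Assumption: dots in the shade prefix become underscores and the result
    is prepended to the library name, as the names in the trace suggest.
    """
    mangled = shade_prefix.replace(".", "_")
    return "{}lib{}{}.so".format(NATIVE_DIR, mangled, lib)


def find_native_libs(jar_path):
    """List every .so entry under META-INF/native/ in the given jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.startswith(NATIVE_DIR) and name.endswith(".so")
        )


def check_jar(jar_path, shade_prefix="org.spark_project."):
    """Return (found, expected_name, bundled_names) for the jar."""
    expected = shaded_native_name(shade_prefix)
    bundled = find_native_libs(jar_path)
    return expected in bundled, expected, bundled


# Hypothetical command-line use: pass the path of the jar to inspect.
if __name__ == "__main__" and len(sys.argv) > 1:
    found, expected, bundled = check_jar(sys.argv[1])
    print("expected:", expected)
    print("bundled: ", bundled)
    print("OK" if found else "MISSING: bundled name does not match the shaded package")
```

If the jar only bundles an unprefixed library, {{check_jar}} reports the expected name as missing, which would be consistent with the {{FileNotFoundException}} in the trace.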