[jira] [Assigned] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-05-07 Thread Marcelo Vanzin (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin reassigned SPARK-27610:
--

Assignee: Adrian Muraru

> Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL
> -
>
> Key: SPARK-27610
> URL: https://issues.apache.org/jira/browse/SPARK-27610
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 2.4.2
> Reporter: Adrian Muraru
> Assignee: Adrian Muraru
> Priority: Minor
> Fix For: 3.0.0
>
>
> Enabling Netty epoll mode in the YARN shuffle service 
> ({{spark.shuffle.io.mode=EPOLL}}) causes the YARN NodeManager to abort.
> Checking the stack trace, it seems that while the io.netty package is 
> shaded, the native libraries provided by netty-all are not:
>   
> {noformat}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so{noformat}
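The file name in the exception above follows from the relocation: the shaded loader class lives under org.spark_project, so it appears to look for a native library whose name carries the same prefix, while netty-all would only ship the unprefixed file. A minimal sketch of that name derivation follows, assuming the prefix is simply the relocation package with dots replaced by underscores. The class ShadedNativeLibName and its methods are hypothetical helpers written for illustration; this is not Netty's actual NativeLibraryLoader code.

```java
// Simplified illustration only; NOT Netty's actual NativeLibraryLoader code.
final class ShadedNativeLibName {

    // Assumed unshaded name of the loader class (its relocated form appears in the trace).
    static final String UNSHADED_LOADER = "io.netty.util.internal.NativeLibraryLoader";

    /** Turns the relocation prefix of the loader class into a library-name prefix. */
    static String packagePrefix(String loaderClassName) {
        if (!loaderClassName.endsWith(UNSHADED_LOADER)) {
            throw new IllegalArgumentException("Unexpected loader class: " + loaderClassName);
        }
        // Everything in front of the unshaded name is the relocation package, e.g. "org.spark_project."
        String relocation = loaderClassName.substring(
                0, loaderClassName.length() - UNSHADED_LOADER.length());
        return relocation.replace('.', '_');
    }

    /** Builds the classpath resource the loader would look for under these assumptions. */
    static String resourcePath(String loaderClassName, String baseName, String arch) {
        return "META-INF/native/lib" + packagePrefix(loaderClassName)
                + baseName + "_" + arch + ".so";
    }

    public static void main(String[] args) {
        String base = "netty_transport_native_epoll";
        // Unshaded loader: no prefix is added to the library name.
        System.out.println(resourcePath(UNSHADED_LOADER, base, "x86_64"));
        // Loader relocated under org.spark_project: the prefix becomes part of the file name.
        System.out.println(resourcePath("org.spark_project." + UNSHADED_LOADER, base, "x86_64"));
    }
}
```

Under these assumptions the second call yields the path reported in the FileNotFoundException, which suggests the shaded jar would need the .so renamed to match (or the native library loaded some other way).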
> *Full stack trace:*
> {noformat}
> 2019-04-24 23:14:46,372 ERROR [main] nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(639)) - Error starting NodeManager
> java.lang.UnsatisfiedLinkError: failed to load the required native library
>     at org.spark_project.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:55)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
>     at org.spark_project.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
>     at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:52)
>     at org.apache.spark.network.server.TransportServer.init(TransportServer.java:95)
>     at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:75)
>     at org.apache.spark.network.TransportContext.createServer(TransportContext.java:108)
>     at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:186)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:147)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:268)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> Caused by: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll_x86_64
>     at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>     at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
>     at org.spark_project.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
>     at org.spark_project.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
>     ... 26 more
>     Suppressed: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll
>         at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>         at
> 

[jira] [Assigned] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27610:


Assignee: Apache Spark

> Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL
> -
>
> Key: SPARK-27610
> URL: https://issues.apache.org/jira/browse/SPARK-27610
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 2.4.2
> Reporter: Adrian Muraru
> Assignee: Apache Spark
> Priority: Minor
>
> Enabling Netty epoll mode in the YARN shuffle service 
> ({{spark.shuffle.io.mode=EPOLL}}) causes the YARN NodeManager to abort.
> Checking the stack trace, it seems that while the io.netty package is 
> shaded, the native libraries provided by netty-all are not:
>   
> {noformat}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so{noformat}
> *Full stack trace:*
> {noformat}
> 2019-04-24 23:14:46,372 ERROR [main] nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(639)) - Error starting NodeManager
> java.lang.UnsatisfiedLinkError: failed to load the required native library
>     at org.spark_project.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:55)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
>     at org.spark_project.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
>     at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:52)
>     at org.apache.spark.network.server.TransportServer.init(TransportServer.java:95)
>     at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:75)
>     at org.apache.spark.network.TransportContext.createServer(TransportContext.java:108)
>     at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:186)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:147)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:268)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> Caused by: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll_x86_64
>     at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>     at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
>     at org.spark_project.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
>     at org.spark_project.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
>     ... 26 more
>     Suppressed: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll
>         at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>         at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:210)
> 

[jira] [Assigned] (SPARK-27610) Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL

2019-04-30 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27610:


Assignee: (was: Apache Spark)

> Yarn external shuffle service fails to start when spark.shuffle.io.mode=EPOLL
> -
>
> Key: SPARK-27610
> URL: https://issues.apache.org/jira/browse/SPARK-27610
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 2.4.2
> Reporter: Adrian Muraru
> Priority: Minor
>
> Enabling Netty epoll mode in the YARN shuffle service 
> ({{spark.shuffle.io.mode=EPOLL}}) causes the YARN NodeManager to abort.
> Checking the stack trace, it seems that while the io.netty package is 
> shaded, the native libraries provided by netty-all are not:
>   
> {noformat}
> Caused by: java.io.FileNotFoundException: 
> META-INF/native/liborg_spark_project_netty_transport_native_epoll_x86_64.so{noformat}
> *Full stack trace:*
> {noformat}
> 2019-04-24 23:14:46,372 ERROR [main] nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(639)) - Error starting NodeManager
> java.lang.UnsatisfiedLinkError: failed to load the required native library
>     at org.spark_project.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:55)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
>     at org.spark_project.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
>     at org.spark_project.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
>     at org.spark_project.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
>     at org.apache.spark.network.util.NettyUtils.createEventLoop(NettyUtils.java:52)
>     at org.apache.spark.network.server.TransportServer.init(TransportServer.java:95)
>     at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:75)
>     at org.apache.spark.network.TransportContext.createServer(TransportContext.java:108)
>     at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:186)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:147)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:268)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> Caused by: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll_x86_64
>     at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>     at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
>     at org.spark_project.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
>     at org.spark_project.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
>     ... 26 more
>     Suppressed: java.lang.UnsatisfiedLinkError: could not load a native library: org_spark_project_netty_transport_native_epoll
>         at org.spark_project.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:205)
>         at org.spark_project.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:210)
>         ... 28 more
>