Re: spark job submission on yarn-cluster mode failing
Hi,

I am now facing the error message below. Please help me.

2016-01-21 16:06:14,123 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /xxx.xx.xx.xx:50010 for block, add to deadNodes and continue. java.nio.channels.ClosedByInterruptException
java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:658)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
        at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
        at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
        at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
        at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
        at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1460)
        at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:773)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:84)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:265)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Thanks
Soniya
Re: spark job submission on yarn-cluster mode failing
Can you look in the executor logs and see why the SparkContext is being shut down? A similar discussion happened here previously:
http://apache-spark-user-list.1001560.n3.nabble.com/RECEIVED-SIGNAL-15-SIGTERM-td23668.html

Thanks
Best Regards
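[Editor's note: the executor logs for a finished yarn-cluster application can be pulled with the `yarn logs` CLI and filtered for the real failure. A minimal sketch, assuming log aggregation is enabled; the application id is a placeholder, and the sample lines standing in for a real dump are taken from the log excerpt in this thread:]

```shell
# With log aggregation enabled, fetch every container's log for the app
# (the application id below is a placeholder - take yours from the RM UI):
#   yarn logs -applicationId application_1453370000000_0001 > app.log
# Then filter for the lines that explain the shutdown. Two sample lines
# from the excerpt in this thread stand in for a real dump:
printf '%s\n' \
  '16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started' \
  '16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM' \
  > app.log
# Keep only errors, exceptions, and kill signals:
grep -E 'ERROR|Exception|SIGTERM' app.log
```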
Re: spark job submission on yarn-cluster mode failing
Please also check the AppMaster log.

Thanks
spark job submission on yarn-cluster mode failing
Hi Friends,

My Spark job runs successfully in local mode but fails in yarn-cluster mode. Below is the error message I am getting; can anyone help?

16/01/21 16:38:07 INFO twitter4j.TwitterStreamImpl: Establishing connection.
16/01/21 16:38:07 INFO twitter.TwitterReceiver: Twitter receiver started
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Called receiver onStart
16/01/21 16:38:07 INFO receiver.ReceiverSupervisorImpl: Waiting for receiver to be stopped
16/01/21 16:38:10 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/01/21 16:38:10 INFO streaming.StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
16/01/21 16:38:10 INFO scheduler.ReceiverTracker: Sent stop signal to all 1 receivers
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Received stop signal
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopping receiver with message: Stopped by driver:
16/01/21 16:38:10 INFO twitter.TwitterReceiver: Twitter receiver stopped
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Called receiver onStop
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Deregistering receiver 0
16/01/21 16:38:10 ERROR scheduler.ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver
16/01/21 16:38:10 INFO receiver.ReceiverSupervisorImpl: Stopped receiver 0
16/01/21 16:38:10 INFO receiver.BlockGenerator: Stopping BlockGenerator
16/01/21 16:38:10 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...

Thanks
Soniya
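[Editor's note: "RECEIVED SIGNAL 15: SIGTERM" in the ApplicationMaster log means YARN terminated the container; one common cause is a container exceeding its memory allocation, which the NodeManager handles by killing it. Below is a hedged sketch of a yarn-cluster submission with the relevant memory knobs spelled out. The class name, jar name, and values are illustrative placeholders, not a confirmed fix for this thread:]

```shell
# All names and sizes below are placeholders. The memoryOverhead settings
# (in MB, Spark 1.x property names) give each JVM off-heap headroom so the
# NodeManager is less likely to kill the container for exceeding its limit,
# which surfaces in the AM log as "RECEIVED SIGNAL 15: SIGTERM".
spark-submit \
  --master yarn-cluster \
  --class com.example.TwitterStreamJob \
  --driver-memory 2g \
  --executor-memory 2g \
  --num-executors 2 \
  --conf spark.yarn.driver.memoryOverhead=512 \
  --conf spark.yarn.executor.memoryOverhead=512 \
  twitter-stream-job.jar
```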
Re: spark job submission on yarn-cluster mode failing
The exception you posted is at WARN level. Can you check the health of HDFS? Which Hadoop version are you using? If the job actually failed, there should be another, fatal error somewhere in the logs.

Cheers
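[Editor's note: as a follow-up on the HDFS health question, the standard administrative checks are sketched below. These require shell access to a cluster node and appropriate HDFS permissions; the host is the placeholder from the WARN message, not a real address:]

```shell
# Cluster-wide state: live/dead datanodes, capacity, under-replicated blocks.
hdfs dfsadmin -report

# Filesystem integrity; -list-corruptfileblocks names any corrupt files.
hdfs fsck / -list-corruptfileblocks

# Spot-check connectivity to the datanode the client could not reach
# (50010 is the default datanode data-transfer port; substitute the real host):
#   telnet xxx.xx.xx.xx 50010
```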