Found these WARN and ERROR messages in the driver log file:

2016-07-04 04:49:50,106 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-api-jdo-4.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-api-jdo-4.2.1.jar."
2016-07-04 04:49:50,115 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-core-4.1.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-core-4.1.6.jar."
2016-07-04 04:49:50,136 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-rdbms-4.1.7.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-rdbms-4.1.7.jar."
2016-07-04 04:49:57,387 [main] WARN org.apache.hadoop.hive.metastore.ObjectStore- Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
2016-07-04 04:49:57,563 [main] WARN org.apache.hadoop.hive.metastore.ObjectStore- Failed to get database default, returning NoSuchObjectException
2016-07-04 04:50:32,046 [main] WARN org.apache.spark.SparkConf- The configuration key 'spark.kryoserializer.buffer.max.mb' has been deprecated as of Spark 1.4 and may be removed in the future. Please use the new key 'spark.kryoserializer.buffer.max' instead.
2016-07-04 04:50:32,048 [main] WARN org.apache.spark.SparkConf- The configuration key 'spark.kryoserializer.buffer.max.mb' has been deprecated as of Spark 1.4 and may be removed in the future. Please use the new key 'spark.kryoserializer.buffer.max' instead.
2016-07-04 04:50:32,170 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-api-jdo-4.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-api-jdo-4.2.1.jar."
2016-07-04 04:50:32,173 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-core-4.1.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-core-4.1.6.jar."
2016-07-04 04:50:32,183 [main] WARN DataNucleus.General- Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/hive/hive-1.2/lib/datanucleus-rdbms-4.1.7.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/spark/spark-1.6.1/lib/datanucleus-rdbms-4.1.7.jar."
2016-07-04 04:50:35,678 [main] WARN org.apache.hadoop.hive.metastore.ObjectStore- Failed to get database default, returning NoSuchObjectException
2016-07-04 05:46:50,052 [main] WARN org.apache.spark.SparkConf- The configuration key 'spark.kryoserializer.buffer.max.mb' has been deprecated as of Spark 1.4 and may be removed in the future. Please use the new key 'spark.kryoserializer.buffer.max' instead.
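As an aside, the repeated SparkConf warning above is only about the key name: the old 'spark.kryoserializer.buffer.max.mb' took a bare number of megabytes, while its replacement 'spark.kryoserializer.buffer.max' takes a size with a unit suffix. A minimal sketch of the rename (the 512m value is a placeholder, not taken from this thread):

```
# spark-defaults.conf
# old key, deprecated since Spark 1.4 (value was a bare megabyte count):
#   spark.kryoserializer.buffer.max.mb   512
# new key, same meaning, value now carries a unit:
spark.kryoserializer.buffer.max   512m
```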
2016-07-04 05:50:09,944 [Thread-27-EventThread] WARN com.mapr.util.zookeeper.ZKDataRetrieval- ZK Reset due to SessionExpiration for ZK: hostname:5181,hostname:5181,hostname:5181
2016-07-04 05:11:53,972 [dispatcher-event-loop-0] ERROR org.apache.spark.scheduler.LiveListenerBus- Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
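The DataNucleus "already registered" warnings at the top are classpath noise: the same plugin jars sit in both the Hive and the Spark lib directories. A small shell sketch to confirm which jar names are duplicated (the two paths are the ones from the warnings; adjust for your install):

```shell
# Print datanucleus jar basenames that appear in more than one of the
# given lib directories -- each duplicate triggers one of the warnings.
find_dupes() {
  for d in "$@"; do
    for j in "$d"/datanucleus-*.jar; do
      # the glob stays literal when nothing matches, so guard with -e
      [ -e "$j" ] && basename "$j"
    done
  done | sort | uniq -d
}

find_dupes /opt/mapr/hive/hive-1.2/lib /opt/mapr/spark/spark-1.6.1/lib
```

Removing (or excluding from the driver classpath) one copy of each duplicated jar silences the warnings; they are noisy but usually harmless when both copies are the same version, as here.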
On Tue, Jul 5, 2016 at 4:45 AM, kishore kumar <akishore...@gmail.com> wrote:
> Find the log from the RM below; there are no earlier errors in the driver log before the FNFE:
>
> 16/07/04 00:27:56 INFO mapreduce.TableInputFormatBase: Input split length: 0 bytes.
> 16/07/04 00:27:56 INFO executor.Executor: Executor is trying to kill task 56.0 in stage 2437.0 (TID 328047)
> 16/07/04 00:27:56 INFO executor.Executor: Executor killed task 266.0 in stage 2433.0 (TID 328005)
> 16/07/04 00:27:56 INFO executor.Executor: Executor killed task 206.0 in stage 2433.0 (TID 327977)
> 16/07/04 00:27:56 INFO executor.Executor: Executor killed task 318.0 in stage 2433.0 (TID 328006)
> 16/07/04 00:27:57 INFO executor.Executor: Executor killed task 56.0 in stage 2437.0 (TID 328047)
> 16/07/04 00:27:57 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
> 16/07/04 00:27:57 INFO storage.MemoryStore: MemoryStore cleared
> 16/07/04 00:27:57 INFO storage.BlockManager: BlockManager stopped
> 16/07/04 00:27:57 WARN executor.CoarseGrainedExecutorBackend: An unknown (driver.domain.com:56055) driver disconnected.
> 16/07/04 00:27:57 ERROR executor.CoarseGrainedExecutorBackend: Driver xx:xx:xx:xx:56055 disassociated! Shutting down.
> 16/07/04 00:27:57 INFO util.ShutdownHookManager: Shutdown hook called
> 16/07/04 00:27:57 INFO util.ShutdownHookManager: Deleting directory /opt/mapr/tmp/hadoop-tmp/hadoop-mapr/nm-local-dir/usercache/user/appcache/application_1467474162580_29353/spark-9c0bfccc-74c3-4541-a2fd-19101e47b49a
> End of LogType:stderr
>
> On Mon, Jul 4, 2016 at 4:21 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Can you share some stats from the Web UI just before the failure? Any earlier errors before the FNFE?
>>
>> Jacek
>> On 4 Jul 2016 12:34 p.m., "kishore kumar" <akishore...@gmail.com> wrote:
>>
>>> @jacek: It is running in yarn-client mode; our code doesn't support running in yarn-cluster mode, and the job runs for around an hour before throwing the exception.
>>>
>>> @karhi: The YARN application status is successful; the ResourceManager logs gave no failure info except:
>>> 16/07/04 00:27:57 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown
>>> 16/07/04 00:27:57 INFO storage.MemoryStore: MemoryStore cleared
>>> 16/07/04 00:27:57 INFO storage.BlockManager: BlockManager stopped
>>> 16/07/04 00:27:57 WARN executor.CoarseGrainedExecutorBackend: An unknown (slave1.domain.com:56055) driver disconnected.
>>> 16/07/04 00:27:57 ERROR executor.CoarseGrainedExecutorBackend: Driver 173.36.88.26:56055 disassociated! Shutting down.
>>> 16/07/04 00:27:57 INFO util.ShutdownHookManager: Shutdown hook called
>>> 16/07/04 00:27:57 INFO util.ShutdownHookManager: Deleting directory /opt/mapr/tmp/hadoop-tmp/hadoop-mapr/nm-local-dir/usercache/user/appcache/application_1467474162580_29353/spark-9c0bfccc-74c3-4541-a2fd-19101e47b49a
>>> End of LogType:stderr
>>>
>>> On Mon, Jul 4, 2016 at 3:20 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>
>>>> Hi,
>>>>
>>>> You seem to be using YARN. Is this cluster or client deploy mode? Have you seen any other exceptions before? How long did the application run before the exception?
>>>>
>>>> Pozdrawiam,
>>>> Jacek Laskowski
>>>> ----
>>>> https://medium.com/@jaceklaskowski/
>>>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>
>>>> On Mon, Jul 4, 2016 at 10:57 AM, kishore kumar <akishore...@gmail.com> wrote:
>>>> > We've upgraded the Spark version from 1.2 to 1.6, still the same problem:
>>>> >
>>>> > Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 286 in stage 2397.0 failed 4 times, most recent failure: Lost task 286.3 in stage 2397.0 (TID 314416, salve-06.domain.com): java.io.FileNotFoundException: /opt/mapr/tmp/hadoop-tmp/hadoop-mapr/nm-local-dir/usercache/user1/appcache/application_1467474162580_29353/blockmgr-bd075392-19c2-4cb8-8033-0fe54d683c8f/12/shuffle_530_286_0.index.c374502a-4cf2-4052-abcf-42977f1623d0 (No such file or directory)
>>>> >
>>>> > Kindly help me get rid of this.
>>>> >
>>>> > On Sun, Jun 5, 2016 at 9:43 AM, kishore kumar <akishore...@gmail.com> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> Could anyone help me with this error? Why does it occur?
>>>> >>
>>>> >> Thanks,
>>>> >> KishoreKumar.
>>>> >>
>>>> >> On Fri, Jun 3, 2016 at 9:12 PM, kishore kumar <akishore...@gmail.com> wrote:
>>>> >>>
>>>> >>> Hi Jeff Zhang,
>>>> >>>
>>>> >>> Thanks for the response; could you explain why this error occurs?
>>>> >>>
>>>> >>> On Fri, Jun 3, 2016 at 6:15 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>> >>>>
>>>> >>>> One quick solution is to use Spark 1.6.1.
>>>> >>>>
>>>> >>>> On Fri, Jun 3, 2016 at 8:35 PM, kishore kumar <akishore...@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> Could anyone help me on this issue?
>>>> >>>>>
>>>> >>>>> On Tue, May 31, 2016 at 8:00 PM, kishore kumar <akishore...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi,
>>>> >>>>>>
>>>> >>>>>> We installed Spark 1.2.1 on a single node and are running a job in yarn-client mode on YARN which loads data into HBase and Elasticsearch.
>>>> >>>>>>
>>>> >>>>>> The error we are encountering is:
>>>> >>>>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 38 in stage 26800.0 failed 4 times, most recent failure: Lost task 38.3 in stage 26800.0 (TID 4990082, hdprd-c01-r04-03): java.io.FileNotFoundException: /opt/mapr/tmp/hadoop-tmp/hadoop-mapr/nm-local-dir/usercache/sparkuser/appcache/application_1463194314221_211370/spark-3cc37dc7-fa3c-4b98-aa60-0acdfc79c725/28/shuffle_8553_38_0.index (No such file or directory)
>>>> >>>>>>
>>>> >>>>>> Any idea about this error?
>>>> >>>>>>
>>>> >>>>>> --
>>>> >>>>>> Thanks,
>>>> >>>>>> Kishore.
>>>> >>>>
>>>> >>>> --
>>>> >>>> Best Regards
>>>> >>>>
>>>> >>>> Jeff Zhang

--
Thanks,
Kishore.
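For anyone hitting this thread later: a missing shuffle_*.index file under an executor's local dir typically means the executor that wrote it is gone (crashed, or killed by YARN for exceeding its memory limits), so tasks fetching from it fail with FNFE; the "Executor killed task" and "Driver commanded a shutdown" lines above fit that pattern. A hedged starting point, not a confirmed fix for this particular job (both keys are standard Spark 1.6 settings; the values are placeholders):

```
# spark-defaults.conf
# serve shuffle files from the NodeManager so they survive executor loss
spark.shuffle.service.enabled        true
# extra off-heap headroom so YARN is less likely to kill executors
spark.yarn.executor.memoryOverhead   1024
```

Note that enabling the external shuffle service also requires the spark_shuffle aux-service to be configured on each NodeManager; see the "Running Spark on YARN" docs for your distribution.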