Hi,

We have integrated Spark with a YARN cluster. To test long-running behaviour, we ran the script below to launch the application in a loop; the example program it submits is given at the bottom.

# Script which runs the example infinitely
times=0
while true
do
  echo "start $times"
  /opt/ficlient/Spark/spark/bin/spark-submit --class com.example.cfgtest.SparkPi \
    --master yarn-client \
    --driver-java-options '-Dlog4j.configuration=file:"./log4j.properties" -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com' \
    --executor-memory 1G --num-executors 3 --driver-memory 1G --executor-cores 5 \
    --queue QueueD /opt/example/example.jar 100
  echo "finish $times"
  let ++times
done

After running for one or two days, the JVM crashes and we get an error like:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (sharedRuntime.cpp:834), pid=1325, tid=0x00007f599f312700
#  fatal error: exception happened outside interpreter, nmethods and vtable stubs at pc 0x00007f59c36b16b1
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Core dump written. Default location: /opt/ashok/crash_test/core or core.1325 (max size 1 kB). To ensure a full core dump, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x00007f59ac0f4000):  JavaThread "SparkListenerBus" daemon [_thread_in_Java, id=7566, stack(0x00007f599f212000,0x00007f599f313000)]

Stack: [0x00007f599f212000,0x00007f599f313000],  sp=0x00007f599f30fb10,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xac826a]  VMError::report_and_die()+0x2ba
V  [libjvm.so+0x4fd089]  report_fatal(char const*, int, char const*)+0x59
V  [libjvm.so+0x9c391a]  SharedRuntime::continuation_for_implicit_exception(JavaThread*, unsigned char*, SharedRuntime::ImplicitExceptionKind)+0x33a
V  [libjvm.so+0x92bbfa]  JVM_handle_linux_signal+0x48a
V  [libjvm.so+0x921e13]  signalHandler(int, siginfo*, void*)+0x43
C  [libpthread.so.0+0xf850]
j  org.apache.spark.serializer.KryoSerializerInstance.borrowKryo()Lcom/esotericsoftware/kryo/Kryo;+11
We have hit this issue many times, and the crashing thread is not always in
org.apache.spark.serializer.KryoSerializerInstance; the location where the
JVM crashes varies from run to run.
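The crash report notes the core file is truncated (max size 1 kB), so for the next run we plan to raise the core dump limit before starting the loop, as the JVM message suggests. A minimal sketch (the wrapper name run_loop.sh is just a placeholder for the loop script above):

# Sketch: allow a full core dump for the next crash, per the JVM hint.
# "run_loop.sh" is a placeholder for the spark-submit loop shown above.
ulimit -c unlimited
./run_loop.sh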

Linux environment: SUSE 11.4
Java version: JDK 1.8.0_131

Below is the example program we ran.

import org.apache.commons.logging.LogFactory

import scala.math.random

import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  val LOG = LogFactory.getLog("SparkPi")

  def main(args: Array[String]) {
    System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)

    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow

    val rdd = spark.parallelize(1 until n, slices)
    val shuffleRdd = rdd.repartition(200)

    val count = shuffleRdd.map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)

    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
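For reference, the example enables Kryo through a system property inside main(); passing the same setting on the command line should be equivalent. A sketch only, reusing the paths from our loop script:

# Sketch: set spark.serializer via spark-submit --conf instead of
# System.setProperty; other arguments are as in the loop script above.
/opt/ficlient/Spark/spark/bin/spark-submit \
  --class com.example.cfgtest.SparkPi \
  --master yarn-client \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  /opt/example/example.jar 100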

Has anybody faced this issue? Any suggestions to resolve it?


