Logs would be helpful to diagnose this. Could you attach the logs? A couple of pointers on collecting them are below.
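In a standalone deployment the executor logs normally end up under the worker's work directory, so on your setup (assuming the SPARK_WORKER_DIR=/tmp/spark/work from the spark-env.sh you posted, and assuming you haven't redirected logging with a custom log4j.properties) something along these lines should grab the interesting files from a node that has a hung task; the file names are only illustrative:

    # on a worker/data node that hosts a hung executor
    ls /tmp/spark/work/                          # one app-<timestamp>-<id> directory per application
    tail -n 200 /tmp/spark/work/app-*/*/stderr   # executor log4j output
    tail -n 200 /tmp/spark/work/app-*/*/stdout   # the -verbose:gc output from your extraJavaOptions goes here
    jstack <executor-pid> > executor-jstack.txt  # thread dump of the hung executor process

The same stdout/stderr files are also linked from each worker's page in the master web UI (port 8081 in your config), if that is easier than pulling them off the nodes.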
On Wed, Aug 12, 2015 at 5:19 AM, java8964 <java8...@hotmail.com> wrote:

> The executor memory is overridden by "--executor-memory 24G" for spark-shell.
>
> The one in spark-env.sh is just the default setting.
>
> I can confirm from the Spark UI that the executor heap is set to 24G.
>
> Thanks
>
> Yong
>
> ------------------------------
> From: igor.ber...@gmail.com
> Date: Tue, 11 Aug 2015 23:31:59 +0300
> Subject: Re: Spark Job Hangs on our production cluster
> To: java8...@hotmail.com
> CC: user@spark.apache.org
>
> How do you want to process 1T of data when you set your executor memory to 2g?
> Look at the Spark UI and the task metrics, if any.
> Look at the Spark logs on the executor machines under the work dir (unless you configured log4j).
>
> I think your executors are thrashing or spilling to disk. Check the memory metrics/swapping.
>
> On 11 August 2015 at 23:19, java8964 <java8...@hotmail.com> wrote:
>
> Currently we have an IBM BigInsights cluster with 1 namenode + 1 JobTracker + 42 data/task nodes, running BigInsights V3.0.0.2, which corresponds to Hadoop 2.2.0 with MR1.
>
> Since IBM BigInsights doesn't come with Spark, we built Spark 1.2.2 with Hadoop 2.2.0 + Hive 0.12 ourselves and deployed it on the same cluster.
>
> IBM BigInsights comes with IBM JDK 1.7, but in our experience on the staging environment Spark works better with the Oracle JVM, so we run Spark under Oracle JDK 1.7.0_79.
>
> Now on production we are facing an issue we have never faced before and cannot reproduce on our staging cluster.
>
> We are using a Spark Standalone cluster, and here is the basic configuration:
>
> more spark-env.sh
> export JAVA_HOME=/opt/java
> export PATH=$JAVA_HOME/bin:$PATH
> export HADOOP_CONF_DIR=/opt/ibm/biginsights/hadoop-conf/
> export SPARK_CLASSPATH=/opt/ibm/biginsights/IHC/lib/ibm-compression.jar:/opt/ibm/biginsights/hive/lib/db2jcc4-10.6.jar
> export SPARK_LOCAL_DIRS=/data1/spark/local,/data2/spark/local,/data3/spark/local
> export SPARK_MASTER_WEBUI_PORT=8081
> export SPARK_MASTER_IP=host1
> export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=42"
> export SPARK_WORKER_MEMORY=24g
> export SPARK_WORKER_CORES=6
> export SPARK_WORKER_DIR=/tmp/spark/work
> export SPARK_DRIVER_MEMORY=2g
> export SPARK_EXECUTOR_MEMORY=2g
>
> more spark-defaults.conf
> spark.master                      spark://host1:7077
> spark.eventLog.enabled            true
> spark.eventLog.dir                hdfs://host1:9000/spark/eventLog
> spark.serializer                  org.apache.spark.serializer.KryoSerializer
> spark.executor.extraJavaOptions   -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>
> We use the AVRO file format a lot, and we have these two datasets: one is about 96G, and the other is a little over 1T. Since we are using AVRO, we also built spark-avro at commit "a788c9fce51b0ec1bb4ce88dc65c1d55aaa675b8 <https://github.com/databricks/spark-avro/tree/a788c9fce51b0ec1bb4ce88dc65c1d55aaa675b8>", which is the latest version supporting Spark 1.2.x.
>
> Here is the problem we are facing on our production cluster: even the following simple spark-shell commands will hang:
>
> import org.apache.spark.sql.SQLContext
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> import com.databricks.spark.avro._
> val bigData = sqlContext.avroFile("hdfs://namenode:9000/bigData/")
> bigData.registerTempTable("bigData")
> bigData.count()
>
> From the console, we see the following:
> [Stage 0:>                       (44 + 42) / 7800]
>
> with no update for more than 30 minutes, sometimes longer.
>
> The big dataset (a little over 1T) should generate about 7800 HDFS blocks, but Spark's HDFS client seems to have trouble reading them. Since we are running Spark on the data nodes, all of the Spark tasks run at the "NODE_LOCAL" locality level.
>
> If I go to a data/task node where the Spark tasks hang and use jstack to dump the threads, I get the following at the top:
>
> 2015-08-11 15:38:38
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode):
>
> "Attach Listener" daemon prio=10 tid=0x00007f0660589000 nid=0x1584d waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "org.apache.hadoop.hdfs.PeerCache@4a88ec00" daemon prio=10 tid=0x00007f06508b7800 nid=0x13302 waiting on condition [0x00007f060be94000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.PeerCache.run(PeerCache.java:252)
>         at org.apache.hadoop.hdfs.PeerCache.access$000(PeerCache.java:39)
>         at org.apache.hadoop.hdfs.PeerCache$1.run(PeerCache.java:135)
>         at java.lang.Thread.run(Thread.java:745)
>
> "shuffle-client-1" daemon prio=10 tid=0x00007f0650687000 nid=0x132fc runnable [0x00007f060d198000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         - locked <0x000000067bf47710> (a io.netty.channel.nio.SelectedSelectionKeySet)
>         - locked <0x000000067bf374e8> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x000000067bf373d0> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
>
> Meanwhile, I can confirm that our Hadoop/HDFS cluster works fine: MapReduce jobs run without any problem, and "hadoop fs" commands work fine in BigInsights.
>
> I attached the jstack output to this email, but I don't know what the root cause could be.
>
> The same spark-shell commands work fine if I point them at the small dataset instead of the big one. The small dataset has around 800 HDFS blocks, and Spark finishes without any problem.
>
> Here are some facts I know:
>
> 1) Since BigInsights runs on the IBM JDK, I also ran Spark under the same JDK; the big dataset shows the same problem.
> 2) I even changed "--total-executor-cores" to 42, which makes each executor run with one core (as we have 42 Spark workers), to avoid any multithreading, but still no luck.
> 3) The hang when scanning the 1T dataset does NOT happen 100% of the time. Sometimes I don't see it, but I see it more than 50% of the time when I try.
> 4) We have never hit this issue on our staging cluster, but that cluster has only 1 namenode + 1 jobtracker + 3 data/task nodes, and the same dataset is only 160G there.
> 5) While the Spark Java processes are hanging, I don't see any exceptions or issues in the HDFS data node logs.
>
> Does anyone have any clue about this?
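(One more thing that might help narrow it down while you gather the logs: since only the 1T scan hangs and the PeerCache / shuffle-client threads above look idle, it could be worth taking a few thread dumps of the same executor a minute or so apart and comparing the "Executor task launch worker" threads, to see whether they are making progress through the read path or sitting in the same frames. A rough sketch, assuming the standalone executor process shows up in jps as CoarseGrainedExecutorBackend and that there is one executor per node; adjust as needed:

    # on a node with hung tasks: snapshot the executor a few times
    EXEC_PID=$(jps | awk '/CoarseGrainedExecutorBackend/ {print $1}')
    for i in 1 2 3; do
      jstack "$EXEC_PID" > "executor-jstack-$i.txt"
      sleep 60
    done
    # then compare what the task threads are doing across the snapshots
    grep -A 25 '"Executor task launch worker' executor-jstack-1.txt

If those threads stay in the same HDFS read frames across snapshots, that would point more at the HDFS client side than at Spark's scheduling.)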
>
> Thanks
>
> Yong


--
Best Regards

Jeff Zhang