I'm seeing an RPC timeout with the Hadoop1, Scala 2.11 build, but not the Hadoop1, Scala 2.10 build. The following session, with two uses of sc.parallelize, triggers it almost every time. Occasionally I don't see the stack trace, and I don't see it with just a single sc.parallelize, even the bigger, second one. When the error occurs, the shell pauses for about two minutes with no output before the stack trace appears. I elided some output; why all the non-log4j warnings appear at startup is another question:
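For quick reference, here are the two calls distilled from the transcript below; sc is the SparkContext the shell creates:

    // Distilled repro from the transcript below; run in ./bin/spark-shell.
    sc.parallelize((1 to 100000), 10000).count()    // completes normally
    sc.parallelize((1 to 1000000), 100000).count()  // usually hits the heartbeat RPC timeout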
$ pwd
/Users/deanwampler/projects/spark/spark-1.6.0-bin-hadoop1-scala2.11
$ ./bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.security.Groups).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Spark context available as sc.
15/11/23 13:01:45 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/23 13:01:45 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/23 13:01:49 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/23 13:01:49 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/23 13:01:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/23 13:01:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/23 13:01:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0-SNAPSHOT
      /_/

Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.parallelize((1 to 100000), 10000).count()
[Stage 0:>                                             (0 + 0) / 10000]
[Stage 0:==>                                         (420 + 4) / 10000]
[Stage 0:===>                                        (683 + 4) / 10000]
... elided ...
[Stage 0:==========================================> (8264 + 4) / 10000]
[Stage 0:==============================================> (8902 + 6) / 10000]
[Stage 0:=================================================> (9477 + 4) / 10000]
res0: Long = 100000

scala> sc.parallelize((1 to 1000000), 100000).count()
[Stage 1:>                                            (0 + 0) / 100000]
[Stage 1:>                                            (0 + 0) / 100000]
[Stage 1:>                                            (0 + 0) / 100000]
15/11/23 13:04:09 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@7f9d659c,BlockManagerId(driver, localhost, 55188))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds].
This timeout is controlled by spark.rpc.askTimeout
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:452)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:472)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:472)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:472)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1708)
    at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:472)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    ... 15 more
15/11/23 13:04:56 WARN NettyRpcEndpointRef: Ignore message Success(HeartbeatResponse(false))
[Stage 1:=>                                        (2204 + 6) / 100000]
[Stage 1:=>                                        (2858 + 4) / 100000]
[Stage 1:=>                                        (3616 + 5) / 100000]
... elided ...
[Stage 1:=================================================>(98393 + 4) / 100000]
[Stage 1:=================================================>(99347 + 4) / 100000]
[Stage 1:=================================================>(99734 + 4) / 100000]
res1: Long = 1000000

scala>
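In case it helps anyone keep experimenting while this is looked into: the trace says the timeout is controlled by spark.rpc.askTimeout, so one possible workaround is to raise that setting when launching the shell. This is an untested sketch on my part, and it only masks the two-minute stall rather than explaining it:

    # Untested sketch: raise the ask timeout the stack trace names (default 120s).
    $ ./bin/spark-shell --conf spark.rpc.askTimeout=600s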
Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Sun, Nov 22, 2015 at 4:21 PM, Michael Armbrust <mich...@databricks.com> wrote:

> In order to facilitate community testing of Spark 1.6.0, I'm excited to
> announce the availability of an early preview of the release. This is not a
> release candidate, so there is no voting involved. However, it'd be awesome
> if community members can start testing with this preview package and report
> any problems they encounter.
>
> This preview package contains all the commits to branch-1.6
> <https://github.com/apache/spark/tree/branch-1.6> till commit
> 308381420f51b6da1007ea09a02d740613a226e0
> <https://github.com/apache/spark/tree/v1.6.0-preview2>.
>
> The staging maven repository for this preview build can be found here:
> https://repository.apache.org/content/repositories/orgapachespark-1162
>
> Binaries for this preview build can be found here:
> http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-bin/
>
> A build of the docs can also be found here:
> http://people.apache.org/~pwendell/spark-releases/spark-v1.6.0-preview2-docs/
>
> The full change log for this release can be found on JIRA
> <https://issues.apache.org/jira/browse/SPARK-11908?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.6.0>.
>
> *== How can you help? ==*
>
> If you are a Spark user, you can help us test this release by taking a
> Spark workload and running on this preview release, then reporting any
> regressions.
>
> *== Major Features ==*
>
> When testing, we'd appreciate it if users could focus on areas that have
> changed in this release. Some notable new features include:
>
> SPARK-11787 <https://issues.apache.org/jira/browse/SPARK-11787> *Parquet
> Performance* - Improve Parquet scan performance when using flat schemas.
> SPARK-10810 <https://issues.apache.org/jira/browse/SPARK-10810> *Session
> Management* - Multiple users of the thrift (JDBC/ODBC) server now have
> isolated sessions including their own default database (i.e. USE mydb),
> even on shared clusters.
> SPARK-9999 <https://issues.apache.org/jira/browse/SPARK-9999> *Dataset
> API* - A new, experimental type-safe API (similar to RDDs) that performs
> many operations on serialized binary data and code generation (i.e. Project
> Tungsten).
> SPARK-10000 <https://issues.apache.org/jira/browse/SPARK-10000> *Unified
> Memory Management* - Shared memory for execution and caching instead of
> exclusive division of the regions.
> SPARK-10978 <https://issues.apache.org/jira/browse/SPARK-10978> *Datasource
> API Avoid Double Filter* - When implementing a datasource with filter
> pushdown, developers can now tell Spark SQL to avoid double evaluating a
> pushed-down filter.
> SPARK-2629 <https://issues.apache.org/jira/browse/SPARK-2629> *New
> improved state management* - trackStateByKey, a DStream transformation
> for stateful stream processing, supersedes updateStateByKey in
> functionality and performance.
>
> Happy testing!
>
> Michael
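P.S. For anyone picking a starting point for the Dataset API testing Michael calls out above (SPARK-9999), here is a minimal sketch based on my reading of the 1.6 preview docs. The API is experimental, so treat this as a sketch rather than gospel; it assumes the sqlContext the shell provides:

    // Minimal sketch of the experimental Dataset API in the 1.6 preview.
    // Assumes a spark-shell session like the one above, where sqlContext is predefined.
    import sqlContext.implicits._

    val ds = Seq(1, 2, 3).toDS()    // create a Dataset[Int] from a local Seq
    ds.map(_ + 1).collect()         // expected: Array(2, 3, 4)
    ds.filter(_ % 2 == 1).count()   // expected: 2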