I'm seeing the same failure, but manifesting itself as a StackOverflowError, on
various operating systems and architectures (RHEL 7.1, CentOS 7.2, SUSE 12,
Ubuntu 14.04 and 16.04 LTS).
Build and test options:

Build:
mvn -T 1C -Psparkr -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package

Test:
mvn -Pyarn -Phadoop-2.7 -Phive -Phive-thriftserver -Dtest.exclude.tags=org.apache.spark.tags.DockerTest -fn test

JVM options: -Xss2048k -Dspark.buffer.pageSize=1048576 -Xmx4g
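
(For anyone poking at this outside the Maven run: below is a minimal, hypothetical Scala sketch of applying the same spark.buffer.pageSize value programmatically to a local SparkSession and running a trivial job with it. It is my own illustration, not part of the build or of the failing test.)

import org.apache.spark.sql.SparkSession

// Hypothetical local harness (not the actual Maven test invocation):
// apply the same 1 MiB page-size setting used above and run a trivial job.
object PageSizeCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("page-size-check")
      .config("spark.buffer.pageSize", "1048576") // same value as the -D flag above
      .getOrCreate()

    // Confirm the setting made it into the SparkConf before re-running the suite.
    println(spark.sparkContext.getConf.get("spark.buffer.pageSize"))
    println(spark.range(0L, 1000L).count()) // trivial job
    spark.stop()
  }
}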
Stacktrace (this is with IBM's latest SDK for Java 8):

scala> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): com.google.common.util.concurrent.ExecutionError: java.lang.StackOverflowError: operating system stack overflow
  at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
  at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
  at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
  at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:849)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:833)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:830)
  at org.apache.spark.sql.execution.ObjectOperator$.deserializeRowToObject(objects.scala:137)
  ... omitted the rest for brevity
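
In case it helps with reproduction, here is a rough Scala sketch of the kind of Dataset code the SPARK-18189 test appears to drive, pieced together from the interpreter output quoted further down (keyValueGrouped / mapGroups / broadcasted / dataset). This is my approximation for spark-shell, not the actual test source:

import spark.implicits._

// Approximation only: groupByKey + mapGroups, with a broadcast variable
// referenced from a later closure, run from spark-shell against this RC.
val ds = Seq((1, 1), (2, 2), (3, 3)).toDS()

val keyValueGrouped = ds.groupByKey(_._1)
val mapGroups = keyValueGrouped.mapGroups((k, vs) => (k, vs.map(_._2).sum))

val broadcasted = spark.sparkContext.broadcast(1)
val dataset = mapGroups.map { case (k, v) => k + v + broadcasted.value }

// Presumably an action like this collect is what triggers the failure
// seen in the quoted output (codegen path via the Guava loading cache).
dataset.collect().foreach(println)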
It would also be worth including this small change, which looks to have only
just missed the cut: https://github.com/apache/spark/pull/14409
From: Reynold Xin <[email protected]>
To: Dongjoon Hyun <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: 02/11/2016 18:37
Subject: Re: [VOTE] Release Apache Spark 2.0.2 (RC2)
Looks like there is an issue with Maven (likely just the test itself
though). We should look into it.
On Wed, Nov 2, 2016 at 11:32 AM, Dongjoon Hyun <[email protected]>
wrote:
Hi, Sean.
The same failure blocks me, too.
- SPARK-18189: Fix serialization issue in KeyValueGroupedDataset ***
FAILED ***
I used `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver -Dsparkr` on CentOS 7 / OpenJDK 1.8.0_111.
Dongjoon.
On 2016-11-02 10:44 (-0700), Sean Owen <[email protected]> wrote:
> Sigs, license, etc are OK. There are no Blockers for 2.0.2, though here are
> the 4 issues still open:
>
> SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc
> SPARK-17957 Calling outer join and na.fill(0) and then inner join will miss rows
> SPARK-17981 Incorrectly Set Nullability to False in FilterExec
> SPARK-18160 spark.files & spark.jars should not be passed to driver in yarn mode
>
> Running with Java 8, -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 on
> Ubuntu 16, I am seeing consistent failures in this test below. I think we
> very recently changed this so it could be legitimate. But does anyone else
> see something like this? I have seen other failures in this test due to OOM
> but my MAVEN_OPTS allows 6g of heap, which ought to be plenty.
>
>
> - SPARK-18189: Fix serialization issue in KeyValueGroupedDataset *** FAILED ***
> isContain was true Interpreter output contained 'Exception':
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/ '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
>       /_/
>
> Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_102)
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala>
> scala> keyValueGrouped:
> org.apache.spark.sql.KeyValueGroupedDataset[Int,(Int, Int)] =
> org.apache.spark.sql.KeyValueGroupedDataset@70c30f72
>
> scala> mapGroups: org.apache.spark.sql.Dataset[(Int, Int)] = [_1: int,
> _2: int]
>
> scala> broadcasted: org.apache.spark.broadcast.Broadcast[Int] =
> Broadcast(0)
>
> scala>
> scala>
> scala> dataset: org.apache.spark.sql.Dataset[Int] = [value: int]
>
> scala> org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in
> stage 0.0 (TID 0, localhost):
> com.google.common.util.concurrent.ExecutionError:
> java.lang.ClassCircularityError:
> io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
> at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
> at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:841)
> at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:188)
> at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:36)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:825)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:822)
> at org.apache.spark.sql.execution.ObjectOperator$.deserializeRowToObject(objects.scala:137)
> at org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:251)
> at org.apache.spark.sql.execution.AppendColumnsExec$$anonfun$9.apply(objects.scala:250)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
> at org.apache.spark.scheduler.Task.run(Task.scala:86)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCircularityError:
> io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62)
> at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54)
> at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42)
> at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78)
> at io.netty.handler.codec.MessageToMessageEncoder.<init>(MessageToMessageEncoder.java:60)
> at org.apache.spark.network.protocol.MessageEncoder.<init>(MessageEncoder.java:34)
> at org.apache.spark.network.TransportContext.<init>(TransportContext.java:78)
> at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354)
> at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)
> at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90)
> at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
> at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
> at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:161)
> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:62)
> at io.netty.util.internal.JavassistTypeParameterMatcherGenerator.generate(JavassistTypeParameterMatcherGenerator.java:54)
> at io.netty.util.internal.TypeParameterMatcher.get(TypeParameterMatcher.java:42)
> at io.netty.util.internal.TypeParameterMatcher.find(TypeParameterMatcher.java:78)
> at io.netty.handler.codec.MessageToMessageEncoder.<init>(MessageToMessageEncoder.java:60)
> at org.apache.spark.network.protocol.MessageEncoder.<init>(MessageEncoder.java:34)
> at org.apache.spark.network.TransportContext.<init>(TransportContext.java:78)
> at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:354)
> at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)
> at org.apache.spark.repl.ExecutorClassLoader.org$apache$spark$repl$ExecutorClassLoader$$getClassFileInputStreamFromSparkRPC(ExecutorClassLoader.scala:90)
> at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
> at org.apache.spark.repl.ExecutorClassLoader$$anonfun$1.apply(ExecutorClassLoader.scala:57)
> at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:161)
> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:80)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:78)
> at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:254)
> at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:6893)
> at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:5331)
> at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:5207)
> at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:5188)
> at org.codehaus.janino.UnitCompiler.access$12600(UnitCompiler.java:185)
> at org.codehaus.janino.UnitCompiler$16.visitReferenceType(UnitCompiler.java:5119)
> at org.codehaus.janino.Java$ReferenceType.accept(Java.java:2880)
> at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:5159)
> at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:5414)
> at org.codehaus.janino.UnitCompiler.access$12400(UnitCompiler.java:185)
> at org.codehaus.janino.UnitCompiler$16.visitArrayType(UnitCompiler.java:5117)
> at org.codehaus.janino.Java$ArrayType.accept(Java.java:2954)
> at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:5159)
> at org.codehaus.janino.UnitCompiler.access$16700(UnitCompiler.java:185)
> at org.codehaus.janino.UnitCompiler$31.getParameterTypes2(UnitCompiler.java:8533)
> at org.codehaus.janino.IClass$IInvocable.getParameterTypes(IClass.java:835)
> at org.codehaus.janino.IClass$IMethod.getDescriptor2(IClass.java:1063)
> at org.codehaus.janino.IClass$IInvocable.getDescriptor(IClass.java:849)
> at org.codehaus.janino.IClass.getIMethods(IClass.java:211)
> at org.codehaus.janino.IClass.getIMethods(IClass.java:199)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:409)
> at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:393)
> at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:185)
> at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:347)
> at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1139)
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:354)
> at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:322)
> at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:383)
> at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:315)
> at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:233)
> at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:192)
> at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:84)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:887)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:950)
> at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:947)
> at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
> at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
> at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
> at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
> ... 26 more
>
>
>
> On Wed, Nov 2, 2016 at 4:52 AM Reynold Xin <[email protected]> wrote:
>
> > Please vote on releasing the following candidate as Apache Spark version
> > 2.0.2. The vote is open until Fri, Nov 4, 2016 at 22:00 PDT and passes if a
> > majority of at least 3+1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 2.0.2
> > [ ] -1 Do not release this package because ...
> >
> >
> > The tag to be voted on is v2.0.2-rc2
> > (a6abe1ee22141931614bf27a4f371c46d8379e33)
> >
> > This release candidate resolves 84 issues:
> > https://s.apache.org/spark-2.0.2-jira
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-bin/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1210/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.2-rc2-docs/
> >
> >
> > Q: How can I help test this release?
> > A: If you are a Spark user, you can help us test this release by taking an
> > existing Spark workload and running on this release candidate, then
> > reporting any regressions from 2.0.1.
> >
> > Q: What justifies a -1 vote for this release?
> > A: This is a maintenance release in the 2.0.x series. Bugs already present
> > in 2.0.1, missing features, or bugs related to new features will not
> > necessarily block this release.
> >
> > Q: What fix version should I use for patches merging into branch-2.0 from
> > now on?
> > A: Please mark the fix version as 2.0.3, rather than 2.0.2. If a new RC
> > (i.e. RC3) is cut, I will change the fix version of those patches to 2.0.2.
> >
>