[
https://issues.apache.org/jira/browse/SPARK-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739367#comment-14739367
]
Josh Rosen commented on SPARK-10251:
------------------------------------
I don't think that we properly test Spark with
`spark.kryo.registrationRequired` enabled, so people keep finding unregistered
classes. The right way to detect and prevent these problems is to run all of
Spark's tests with Kryo registration required. Given that requiring
registration is strictly stricter than not requiring it, I'm wondering if we
should turn it on by default for all of our tests; this shouldn't risk
regressions for the non-required case, AFAIK.
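
A rough sketch of what such a test-wide setting could look like, assuming test suites build their context from one shared SparkConf. The helper name `StrictKryoConf`, the master URL, and the app name are hypothetical and are not part of Spark or its test harness; only the two configuration keys are real Spark settings.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper, not part of Spark: one place where tests obtain a
// SparkConf that makes Kryo reject any class that was never registered.
object StrictKryoConf {
  def apply(): SparkConf =
    new SparkConf()
      .setMaster("local[2]")          // assumed local test master
      .setAppName("strict-kryo-test") // hypothetical app name
      // Use Kryo instead of Java serialization.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Fail with "Class is not registered" instead of silently falling back
      // to writing class names.
      .set("spark.kryo.registrationRequired", "true")
}
```

With a configuration like this, a test that serializes an unregistered class should fail with the same IllegalArgumentException quoted in the issue below, rather than passing silently.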
> Some internal spark classes are not registered with kryo
> --------------------------------------------------------
>
> Key: SPARK-10251
> URL: https://issues.apache.org/jira/browse/SPARK-10251
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.4.1
> Reporter: Soren Macbeth
> Assignee: Ram Sriharsha
> Fix For: 1.6.0
>
>
> When running a job using Kryo serialization and setting
> `spark.kryo.registrationRequired=true`, some internal classes are not
> registered, causing the job to die. This is still a problem when this setting
> is false (which is the default), because Kryo then has to store full class
> names for unregistered classes, which makes serialized objects much more
> expensive in terms of runtime and storage space, in memory or on disk.
> {code}
> 15/08/25 20:28:21 WARN spark.scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, a.b.c.d): java.lang.IllegalArgumentException: Class is not registered: scala.Tuple2[]
> Note: To register this class use: kryo.register(scala.Tuple2[].class);
> at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:442)
> at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:79)
> at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:472)
> at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:565)
> at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:250)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:236)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
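
Until a release contains the fix, a job can register the class named in the error on its own side. The following is a minimal sketch of that workaround, not the change made in Spark for this issue; the names `Tuple2ArrayRegistrator` and `RegistratorConf` are hypothetical, while `KryoRegistrator` and the `spark.kryo.registrator` setting are Spark's public API.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.{KryoRegistrator, KryoSerializer}

// Hypothetical registrator: registers the one class from the stack trace.
class Tuple2ArrayRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Array[Tuple2[Any, Any]] erases to scala.Tuple2[] at runtime, which is
    // the class the error message asks to have registered.
    kryo.register(classOf[Array[Tuple2[Any, Any]]])
  }
}

// Hypothetical helper that wires the registrator into a SparkConf.
object RegistratorConf {
  def apply(): SparkConf =
    new SparkConf()
      .set("spark.serializer", classOf[KryoSerializer].getName)
      .set("spark.kryo.registrationRequired", "true")
      // Spark instantiates this class by name when it creates a Kryo instance.
      .set("spark.kryo.registrator", classOf[Tuple2ArrayRegistrator].getName)
}
```

This only covers `scala.Tuple2[]`; with registration required, other unregistered internal classes may surface one at a time and would each need the same treatment.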