Kousuke Saruta has posted comments on this change.
Change subject: KuduRDD.collect fails because of NoSerializableException
......................................................................
Patch Set 4:
Hi Dan,
Thank you for the review! I have a question and a comment.
> I'm not keen on adding io.Serializable to the java client classes
> due to compatibility concerns with the Java Serializable API.
What compatibility issues could we run into? Do you have any examples?
> looked into this issue, and it seems our biggest spark users are
> using Kryo serialization instead of Java serialization, since it
> provides much better performance (and it should be compatible with
> KuduRDD). Is it an option to use Kryo? Using it should be as
> simple as setting the "spark.serializer" option on the SparkConf:
>
> new SparkConf().set("spark.serializer",
> "org.apache.spark.serializer.KryoSerializer")
As you mentioned, Kryo is more efficient than the Java serializer, but
unfortunately Kryo can't serialize/deserialize those classes either.
When we try to use Kryo with those classes, we get an exception like the
following.
```
16/12/15 15:40:58 ERROR TaskResultGetter: Exception while getting task result
com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
columnsByIndex (org.apache.kudu.Schema)
schema (org.apache.kudu.client.RowResult)
rowResult (org.apache.kudu.spark.kudu.KuduRow)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:396)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:307)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:327)
    at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:88)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:72)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$1.apply(TaskResultGetter.scala:63)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:62)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException
    at org.apache.kudu.client.shaded.com.google.common.collect.ImmutableCollection.add(ImmutableCollection.java:96)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    ... 21 more
```
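We get this exception because Kryo can't serialize/deserialize Guava's
ImmutableList, which is the type of columnsByIndex in Schema. As the
"Caused by" part of the trace shows, the failure happens on read:
CollectionSerializer.read calls add() on the collection, and
ImmutableCollection.add throws UnsupportedOperationException.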
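To illustrate the mechanism without Kryo or Guava on the classpath, here is a
minimal JDK-only sketch. The class ImmutableReadSketch and the helpers
readByAdding and failsOnAdd are hypothetical names made up for this example;
they only imitate the "create the collection, then add() each element" pattern
that the trace suggests, and are not Kryo's actual implementation.
Collections.unmodifiableList stands in for Guava's ImmutableList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class ImmutableReadSketch {

    // Hypothetical stand-in for a generic collection deserializer: obtain an
    // empty instance of the target collection, then append every decoded
    // element with add(). Not Kryo's real code.
    static <T> Collection<T> readByAdding(Supplier<Collection<T>> emptyInstance,
                                          List<T> decoded) {
        Collection<T> target = emptyInstance.get();
        for (T element : decoded) {
            // Immutable collections reject this call.
            target.add(element);
        }
        return target;
    }

    // Returns true when rebuilding the collection via add() fails with
    // UnsupportedOperationException, false when it succeeds.
    static <T> boolean failsOnAdd(Supplier<Collection<T>> emptyInstance,
                                  List<T> decoded) {
        try {
            readByAdding(emptyInstance, decoded);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> decoded = Arrays.asList("key", "value");

        // A mutable target accepts add(), so the rebuild succeeds.
        System.out.println("mutable target fails: "
                + failsOnAdd(() -> new ArrayList<String>(), decoded));

        // An unmodifiable target (playing the role of ImmutableList here)
        // throws UnsupportedOperationException on the first add().
        System.out.println("immutable target fails: "
                + failsOnAdd(() -> Collections.unmodifiableList(new ArrayList<String>()),
                             decoded));
    }
}
```

The mutable case goes through, while the unmodifiable case fails on the first
add(), which is the same kind of failure as the "Caused by" section above.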
--
To view, visit http://gerrit.cloudera.org:8080/5496
Gerrit-MessageType: comment
Gerrit-Change-Id: If0463424481a33c66fd7464345c305062420cfe9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Kousuke Saruta <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kousuke Saruta <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No