Hi Josh,
Classes that do not implement Serializable can be serialized automatically by
Kryo. It is configured as follows when creating the SparkConf object:
val conf = new SparkConf()
  .setAppName(detail)
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.driver.maxResultSize", arguments.get("maxResultSize").get)
  .registerKryoClasses(Array(classOf[org.apache.accumulo.core.data.Key]))
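For reference, a rough sketch of how a conf like this is typically used with
AccumuloInputFormat to build the RDD (the user, password, instance name,
ZooKeeper hosts, and table name below are placeholders, not our exact setup):

import org.apache.accumulo.core.client.ClientConfiguration
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat
import org.apache.accumulo.core.client.security.tokens.PasswordToken
import org.apache.accumulo.core.data.{Key, Value}
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.SparkContext

val sc = new SparkContext(conf)

// The Hadoop Job object only carries the InputFormat configuration
val job = Job.getInstance(sc.hadoopConfiguration)
AccumuloInputFormat.setConnectorInfo(job, "user", new PasswordToken("password"))
AccumuloInputFormat.setZooKeeperInstance(job,
  ClientConfiguration.loadDefault().withInstance("instance").withZkHosts("zkhost:2181"))
AccumuloInputFormat.setInputTableName(job, "mytable")

// Key/Value pairs come back as an RDD; Kryo handles shipping them between nodes
val rdd = sc.newAPIHadoopRDD(job.getConfiguration,
  classOf[AccumuloInputFormat], classOf[Key], classOf[Value])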
Thanks
Madhvi
On Tuesday 28 April 2015 11:32 PM, Josh Elser wrote:
Thanks for the report back, Vaibhav.
To clarify, did you have to write code to do the serialization with
Kryo, or was this something that Kryo could do automatically?
If you could expand on the steps you took, that'd be awesome as I
assume it will be extremely helpful in the mail archives for others
who run into this problem.
vaibhav thapliyal wrote:
Hi Josh,
We solved it using the Kryo serializer library to serialize the Key
class.
Thanks
vaibhav
On 28-Apr-2015 11:14 pm, "Josh Elser" <[email protected]> wrote:
Hi Madhvi,
Thanks for posting this. I'm not super familiar, but my hunch is
that Spark requires objects that it works with to implement the Java
Serializable interface.
Accumulo deals with Key (and Value) through Hadoop's Writable
interface (technically WritableComparable, but still stems from
Writable). I'm not sure if there's a way that you can inform Spark
to use the Writable interface methods instead of Serializable.
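(For what it's worth, Kryo does let you plug in a custom Serializer per
class, so one could delegate to the Writable methods along these lines.
This is an untested sketch with made-up class names, not something I've run:)

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import java.io.{DataInputStream, DataOutputStream}
import org.apache.accumulo.core.data.Key
import org.apache.spark.serializer.KryoRegistrator

// Serializes Key by delegating to its Writable write()/readFields() methods
class KeyWritableSerializer extends Serializer[Key] {
  override def write(kryo: Kryo, output: Output, key: Key): Unit = {
    val out = new DataOutputStream(output)
    key.write(out)
    out.flush()
  }
  override def read(kryo: Kryo, input: Input, clazz: Class[Key]): Key = {
    val key = new Key()
    key.readFields(new DataInputStream(input))
    key
  }
}

class AccumuloKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit =
    kryo.register(classOf[Key], new KeyWritableSerializer)
}

// then: conf.set("spark.kryo.registrator", classOf[AccumuloKryoRegistrator].getName)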
If there isn't a way to do this, I don't see any reason why we
couldn't make Key (and Value) also implement Serializable for this
use case. Please open an issue on JIRA for this -- we can track the
investigation there.
Thanks!
- Josh
madhvi wrote:
Hi,
While connecting to Accumulo through Spark by making a Spark RDD, I am
getting the following error:
object not serializable (class: org.apache.accumulo.core.data.Key)
This is due to Accumulo's Key class, which does not implement the
Serializable interface. How can this be solved so that Accumulo can be
used with Spark?
Thanks
Madhvi