Itsuki Toyota created JENA-1233:
-----------------------------------
Summary: Make RDF primitives Serializable
Key: JENA-1233
URL: https://issues.apache.org/jira/browse/JENA-1233
Project: Apache Jena
Issue Type: Improvement
Components: Elephas
Affects Versions: Jena 3.1.0
Reporter: Itsuki Toyota
I always use Jena when I handle RDF data with Apache Spark.
However, when I want to store the resulting RDD data (e.g. RDD[Triple]) in binary
format, I can't call the RDD.saveAsObjectFile method.
That's because RDD.saveAsObjectFile requires the RDD's elements to implement
the java.io.Serializable interface.
See the following code:
https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L1469
https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L79-L86
You can see that:
1) RDD.saveAsObjectFile calls the Utils.serialize method.
2) Utils.serialize requires the objects wrapped in the RDD to implement the
java.io.Serializable interface. For example, if you want to save an RDD[Triple]
object, Triple must implement java.io.Serializable.
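
To illustrate the two points above without pulling in Spark or Jena, here is a minimal sketch in plain Java. It assumes that the serialize helper linked above writes the object through a java.io.ObjectOutputStream, which is how standard Java serialization works. PlainTriple and SerializableTriple are hypothetical stand-ins for a triple class, not Jena's actual Triple, and SerializationDemo is just a name chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationDemo {
    // Hypothetical stand-in for a class that does NOT implement Serializable.
    static class PlainTriple {
        final String subject, predicate, object;
        PlainTriple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Same fields, but marked Serializable (all fields are Strings,
    // which are themselves Serializable).
    static class SerializableTriple implements Serializable {
        private static final long serialVersionUID = 1L;
        final String subject, predicate, object;
        SerializableTriple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Standard Java serialization of a single object to a byte array.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        }
        return bos.toByteArray();
    }

    // Returns false if the object is rejected with NotSerializableException.
    static boolean canSerialize(Object value) {
        try {
            serialize(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PlainTriple("s", "p", "o")));        // prints "false"
        System.out.println(canSerialize(new SerializableTriple("s", "p", "o"))); // prints "true"
    }
}
```

The only difference between the two classes is the `implements Serializable` marker, which is what this issue asks for on the RDF primitives.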
So why not have the RDF primitives implement java.io.Serializable?
I think it would improve Jena's usability with Apache Spark.