[ 
https://issues.apache.org/jira/browse/JENA-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474177#comment-15474177
 ] 

Paul Houle commented on JENA-1233:
----------------------------------

In theory the identity of blank nodes between different RDF graphs is 
undefined.  For instance, in standard SPARQL, a blank node that is local to 
a triple store is replaced in the results by a fresh blank node that is local 
to those results; you can't go back and ask more questions about the original 
blank node.

In practice, most triple stores (including those shipped with Jena) have 
vendor-specific functions to get blank node IDs and to synthesize blank node 
references from them.  With these functions it is possible to "snip" a 
JSON-like structure out of one triple store and into another.
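For illustration, here is a minimal sketch of that round trip. The `ToyStore`, `vendorBNodeId`, and `createBNodeRef` names are invented stand-ins for whatever a given store exposes (in Jena's core API the analogous calls are along the lines of Node.getBlankNodeLabel() and NodeFactory.createBlankNode(label)); this is a model of the idea, not any store's real API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a triple store that exposes vendor blank node IDs.
class ToyStore {
    // blank-node ID -> (predicate -> object value)
    final Map<String, Map<String, String>> bnodes = new HashMap<>();

    // Vendor-specific: look up a blank node's internal ID (here, its label).
    String vendorBNodeId(String label) { return label; }

    // Vendor-specific: synthesize a reference to a blank node from an ID.
    Map<String, String> createBNodeRef(String id) {
        return bnodes.computeIfAbsent(id, k -> new HashMap<>());
    }
}

public class SnipExample {
    public static void main(String[] args) {
        ToyStore source = new ToyStore();
        source.createBNodeRef("b0").put("ex:name", "Alice");

        // "Snip" the structure into another store, preserving the node's
        // identity via the vendor ID instead of minting a fresh anonymous node.
        ToyStore target = new ToyStore();
        String id = source.vendorBNodeId("b0");
        target.createBNodeRef(id).putAll(source.bnodes.get("b0"));

        System.out.println(target.bnodes.get("b0").get("ex:name")); // Alice
    }
}
```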

In a perfect world, the identity of blank nodes would be application-specific. 
At least from my point of view, modelling legacy data structures such as 
relational and JSON data in RDF is essential, because semantic systems need to 
understand and control legacy systems.  If blank nodes are being used to model 
relational rows, for instance, you would want to treat two blank nodes that 
share the same primary key as the same blank node.
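One way to get that behaviour is to derive the blank node label deterministically from the primary key, so two rows with the same key collapse onto one node. A minimal sketch; the `row-<table>-<pk>` labelling scheme here is an invented convention, not anything Jena prescribes:

```java
import java.util.HashMap;
import java.util.Map;

public class RowNodeLabels {
    // Intern table: (table, primary key) -> canonical blank node label.
    private final Map<String, String> labels = new HashMap<>();

    // Derive one stable label per (table, primaryKey) pair.
    public String labelFor(String table, String primaryKey) {
        return labels.computeIfAbsent(table + "/" + primaryKey,
                key -> "row-" + key.replace('/', '-'));
    }

    public static void main(String[] args) {
        RowNodeLabels scheme = new RowNodeLabels();
        String a = scheme.labelFor("employee", "42");
        String b = scheme.labelFor("employee", "42");  // same row seen again
        String c = scheme.labelFor("employee", "43");
        System.out.println(a.equals(b)); // true: same primary key, same node
        System.out.println(a.equals(c)); // false: different rows stay distinct
    }
}
```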

None of this is standardized, though; to usefully export blank nodes 
you need to configure some solution that works for your particular application.

> Make RDF primitives Serializable
> --------------------------------
>
>                 Key: JENA-1233
>                 URL: https://issues.apache.org/jira/browse/JENA-1233
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Elephas
>    Affects Versions: Jena 3.1.0
>            Reporter: Itsuki Toyota
>
> I always use Jena when I handle RDF data with Apache Spark.
> However, when I want to store the resulting RDD data (e.g. RDD[Triple]) in 
> binary format, I can't call the RDD.saveAsObjectFile method, 
> because RDD.saveAsObjectFile requires the java.io.Serializable interface.
> See the following code. 
> https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L1469
> https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L79-L86
> You can see that 
> 1) RDD.saveAsObjectFile calls the Utils.serialize method
> 2) Utils.serialize requires that the object wrapped by the RDD implement the 
> java.io.Serializable interface. For example, if you want to save an 
> RDD[Triple] object, Triple must implement java.io.Serializable.
> So why not implement java.io.Serializable?
> I think it will improve the usability in Apache Spark.
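For context, Spark's Utils.serialize writes objects through a plain java.io.ObjectOutputStream, which throws NotSerializableException for anything not implementing java.io.Serializable; that is the wall Triple currently hits. A stand-alone sketch of the round trip with a stand-in SerTriple class (Jena's actual Triple is not used here):

```java
import java.io.*;

// Stand-in for an RDF triple that opts in to Java serialization,
// as the issue proposes Jena's Triple should.
class SerTriple implements Serializable {
    final String s, p, o;
    SerTriple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
}

public class SerializableDemo {
    public static void main(String[] args) throws Exception {
        SerTriple t = new SerTriple("ex:s", "ex:p", "ex:o");

        // Roughly what Spark's Utils.serialize / Utils.deserialize do internally.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(t);  // would throw NotSerializableException without the interface
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            SerTriple back = (SerTriple) ois.readObject();
            System.out.println(back.s + " " + back.p + " " + back.o); // ex:s ex:p ex:o
        }
    }
}
```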



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
