On 08/11/16 16:58, Stian Soiland-Reyes wrote:
Looks like an OK solutions. Could the S classes be made "final"..?There are some security concerns with readObject() that can expose remote code execution, by for instance building a sorted collection with comparable objects, where one of those objects allow custom comparison code (e.g. a groovy script)
Indeed there can be but I don't understand what you claiming in this message. It is all too vague and hints of "maybe".
Are you claiming there is a security issue? Something specific to this code? or Java serializable in general? or whenever data is being read in?
As S* are injectable, I don't see that final makes a difference one way of the other. I can add it - does no harm - but it does not relate to the serialiablity of Node because S* are not mandated and enforced.
Note that the serialization object and the UID comes from the injected code and not from class Node/Triple/Quad. And it's written into publicly access source code.
SNode, STriple, SQuad are not composite Serialiable nor collections. They do not pull in any other Serializable.
STriple, SQuad do not use SNode.SNode does not use any other serializable. It directly encoding it's own format into the stream. This is intentional to avoid recursively invoking Serializable
Andy
https://www.owasp.org/index.php/Deserialization_of_untrusted_data The problem is that it is possible to deserialize anything on the class path, not just the type you cast to in the end. If an evil object stream contains such collections then the damage might have already been done. And this could happen when an application is deserializing non-Jena objects but have Jena on the class path. It would be good checking if any other measures are needed. E.g. is it possible to do similar nasty things in the thrift stream (beyond just having massively large strings)? I would think not, but worth double checking. On 6 Nov 2016 8:17 pm, "Andy Seaborne (JIRA)" <[email protected]> wrote:[ https://issues.apache.org/jira/browse/JENA-1233?page= com.atlassian.jira.plugin.system.issuetabpanels:comment- tabpanel&focusedCommentId=15642341#comment-15642341 ] Andy Seaborne commented on JENA-1233: ------------------------------------- A plan: Java serialization has a mechanism to allow an object to be serialized by another object. This plan uses that to put the serialization code into ARQ. https://docs.oracle.com/javase/8/docs/platform/ serialization/spec/serialTOC.html We need to decouple {{Node}} and {{Triple}} serialization otherwise we limit the possible serialization implementation to what is available to jena-core. In order to make {{Node}} and {{Triple}} serializable, use {{writeReplace}} and {{readResolve}} to produce a serializable wrapper. These work by inserting a different object into the serialization stream. The {{SNode}}/{{STriple}}/{{SQuad}} in the explicit wrapper. {{Node}}/{{Triple}}/{{Quad}} have a function called in {{writeReplace}} injected so the serialization is not fixed. The binary form using Thrift will be injected by ARQ when Jena initializes. Sketch (a better injection mechanism is needed to avoid cluttering the API of {{Node}}): {noformat} public abstract class Node implements Serializable { // Returned Object must provide "readResolve()" that returns a Node. public static Function<Node, Object> replacement = null ; protected Object writeReplace() throws ObjectStreamException { return replacement.apply(this); } ... {noformat} NB No "serialVersionUID" here - it is given in the replacement Object so different serializations will not get mixed up. This means that jena-core, on its own without ARQ, does not support serialization of {{Node}} and {{Triple}}. As running jena-core without jena-arq is to be discouraged anyway, that is no bad thing. A string based form could be provided, but not supporting quad. Alt plan: have two injected functions for {{writeObject}} and {{readObject}}, then the serialVersionUID comes from {{Node}} and is the same for all implementations. I don't see sufficient advantage of this. It looks more like a "normal" implementation, rather than the {{writeReplace}}/{{readResolve}} dance, and does not create the short lived, intermediate object but with the impact of same serialVersionUID across implementations.Make RDF primitives Serializable -------------------------------- Key: JENA-1233 URL: https://issues.apache.org/jira/browse/JENA-1233 Project: Apache Jena Issue Type: Improvement Components: Elephas Affects Versions: Jena 3.1.0 Reporter: Itsuki Toyota I always use Jena when I handle RDF data with Apache Spark. However, when I want to store resulting RDD data (ex. RDD[Triple]) inbinary format, I can't call RDD.saveAsObjectFile method.It's because RDD.saveAsObjectFile requires java.io.Serializableinterface.See the following code. https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L1469https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/util/Utils.scala#L79-L86You can see that 1) RDD.saveAsObjectFile calls Util.serialize method 2) Util.serialize method requires the RDD-wrapped object implementingjava.io.Serializable interface. For example, if you want to save a RDD[Triple] object, Triple must implements java.io.Serializable.So why not implement java.io.Serializable ? I think it will improve the usability in Apache Spark.-- This message was sent by Atlassian JIRA (v6.3.4#6332)
