On 08/11/16 16:58, Stian Soiland-Reyes wrote:
Looks like an OK solutions. Could the S classes be made "final"..?

There are some security concerns with readObject() that can expose remote
code execution, by for instance building a sorted collection with
comparable objects, where one of those objects allow custom comparison code
(e.g. a groovy script)

Indeed there can be but I don't understand what you claiming in this message. It is all too vague and hints of "maybe".

Are you claiming there is a security issue? Something specific to this code? or Java serializable in general? or whenever data is being read in?

As S* are injectable, I don't see that final makes a difference one way of the other. I can add it - does no harm - but it does not relate to the serialiablity of Node because S* are not mandated and enforced.

Note that the serialization object and the UID comes from the injected code and not from class Node/Triple/Quad. And it's written into publicly access source code.

SNode, STriple, SQuad are not composite Serialiable nor collections. They do not pull in any other Serializable.

STriple, SQuad do not use SNode.

SNode does not use any other serializable. It directly encoding it's own format into the stream. This is intentional to avoid recursively invoking Serializable

        Andy

https://www.owasp.org/index.php/Deserialization_of_untrusted_data

The problem is that it is possible to deserialize anything on the class
path, not just the type you cast to in the end. If an evil object stream
contains such collections then the damage might have already been done.

And this could happen when an application is deserializing non-Jena objects
but have Jena on the class path.

It would be good checking if any other measures are needed. E.g. is it
possible to do similar nasty things in the thrift stream (beyond just
having massively large strings)? I would think not, but worth double
checking.

On 6 Nov 2016 8:17 pm, "Andy Seaborne (JIRA)" <[email protected]> wrote:


    [ https://issues.apache.org/jira/browse/JENA-1233?page=
com.atlassian.jira.plugin.system.issuetabpanels:comment-
tabpanel&focusedCommentId=15642341#comment-15642341 ]

Andy Seaborne commented on JENA-1233:
-------------------------------------

A plan:

Java serialization has a mechanism to allow an object to be serialized by
another object. This plan uses that to put the serialization code into ARQ.

https://docs.oracle.com/javase/8/docs/platform/
serialization/spec/serialTOC.html

We need to decouple {{Node}} and {{Triple}} serialization otherwise we
limit the possible serialization implementation to what is available to
jena-core.

In order to make {{Node}} and {{Triple}} serializable, use
{{writeReplace}} and {{readResolve}} to produce a serializable wrapper.
These work by inserting a different object into the serialization stream.
The {{SNode}}/{{STriple}}/{{SQuad}} in the explicit wrapper.

{{Node}}/{{Triple}}/{{Quad}} have a function called in {{writeReplace}}
injected so the serialization is not fixed.  The binary form using Thrift
will be injected by ARQ when Jena initializes.

Sketch (a better injection mechanism is needed to avoid cluttering the API
of {{Node}}):
{noformat}
public abstract class Node implements Serializable {
    // Returned Object must provide "readResolve()" that returns a Node.
    public static Function<Node, Object> replacement = null ;
    protected Object writeReplace() throws ObjectStreamException {
        return replacement.apply(this);
    }
...
{noformat}
NB No "serialVersionUID" here - it is given in the replacement Object so
different serializations will not get mixed up.

This means that jena-core, on its own without ARQ, does not support
serialization of {{Node}} and {{Triple}}. As running jena-core without
jena-arq is to be discouraged anyway, that is no bad thing. A string based
form could be provided, but not supporting quad.

Alt plan: have two injected functions for {{writeObject}} and
{{readObject}}, then the serialVersionUID comes from {{Node}} and is the
same for all implementations.

I don't see sufficient advantage of this. It looks more like a "normal"
implementation, rather than the {{writeReplace}}/{{readResolve}} dance,
and does not create the short lived, intermediate object but with the
impact of same serialVersionUID across implementations.


Make RDF primitives Serializable
--------------------------------

                Key: JENA-1233
                URL: https://issues.apache.org/jira/browse/JENA-1233
            Project: Apache Jena
         Issue Type: Improvement
         Components: Elephas
   Affects Versions: Jena 3.1.0
           Reporter: Itsuki Toyota

I always use Jena when I handle RDF data with Apache Spark.
However, when I want to store resulting RDD data (ex. RDD[Triple]) in
binary format, I can't call RDD.saveAsObjectFile method.
It's because RDD.saveAsObjectFile requires java.io.Serializable
interface.
See the following code.
https://github.com/apache/spark/blob/v1.6.0/core/src/
main/scala/org/apache/spark/rdd/RDD.scala#L1469
https://github.com/apache/spark/blob/v1.6.0/core/src/
main/scala/org/apache/spark/util/Utils.scala#L79-L86
You can see that
1) RDD.saveAsObjectFile calls Util.serialize method
2) Util.serialize method requires the RDD-wrapped object implementing
java.io.Serializable interface. For example, if you want to save a
RDD[Triple] object, Triple must implements java.io.Serializable.
So why not implement java.io.Serializable ?
I think it will improve the usability in Apache Spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Reply via email to