Wonder what a bit of winter air does.. I checked deeper - I think the
code should be safe.
The new Jena code doesn't make it's own ObjectInputStream and as your
wrapper classes basically hijack the java.io.ObjectOutputStream and
java.io.ObjectInputStream (given to Thrift as pure
Input/OutputStream), then .readObject() won't be called there at all.
As well you have added the guard exception throws by Node/Triple/Quads
n readObject / writeObject() if a serializer has not been set, so
they should also be safe.
I've suggested to add "transient" to the wrapped Quad/Triple/Node
fields of the S* classes to clarify for future developers that they
are not actually serialized in the ObjectOutputStream, but I don't
think it is needed for security reasons.
${insertImNotASecurityDisclaimerHere}
On 9 November 2016 at 14:49, Stian Soiland-Reyes <[email protected]> wrote:
> Just raising it as something to check.. I don't think your code looks
> particularly vulnerable, as the main job is done by Thrift.
>
>
> About "final" - good practice is to use 'final' unless a class is
> meant to be extended. As the S classes are just serialisation capsules
> I think this would be the case here:
>
> http://www.yegor256.com/2014/11/20/seven-virtues-of-good-object.html#7-his-class-is-either-final-or-abstract
>
> (Warning: quite opinionated author :)
>
>
> There's been quite a few CVEs raised because of unchecked
> Serialization usage in Java within libraries - where unknowing API
> users (who for some reason decide to do untrusted Java deserialization
> of their own code) get an additional drive-by serialization attack
> vector open via their dependencies.
>
> Here are some guidance:
>
> http://www.oracle.com/technetwork/java/seccodeguide-139067.html#8
> https://christian-schneider.net/JavaDeserializationSecurityFAQ.html#main
>
>
> I liked these slides, by Christian Schneider who helped me fix
> CVE-2016-2510 in Beanshell:
>
> http://www.slideshare.net/cschneider4711/owasp-benelux-day-2016-serial-killer-silently-pwning-your-java-endpoints
>
>
> I'm afraid I don't understand Java serialization enough to look
> further into this further without a fair amount of effort.
>
> On 9 November 2016 at 09:39, Andy Seaborne <[email protected]> wrote:
>> On 08/11/16 16:58, Stian Soiland-Reyes wrote:
>>>
>>> Looks like an OK solutions. Could the S classes be made "final"..?
>>>
>>> There are some security concerns with readObject() that can expose remote
>>> code execution, by for instance building a sorted collection with
>>> comparable objects, where one of those objects allow custom comparison
>>> code
>>> (e.g. a groovy script)
>>
>>
>> Indeed there can be but I don't understand what you claiming in this
>> message. It is all too vague and hints of "maybe".
>>
>> Are you claiming there is a security issue? Something specific to this code?
>> or Java serializable in general? or whenever data is being read in?
>>
>> As S* are injectable, I don't see that final makes a difference one way of
>> the other. I can add it - does no harm - but it does not relate to the
>> serialiablity of Node because S* are not mandated and enforced.
>>
>> Note that the serialization object and the UID comes from the injected code
>> and not from class Node/Triple/Quad. And it's written into publicly access
>> source code.
>>
>> SNode, STriple, SQuad are not composite Serialiable nor collections. They do
>> not pull in any other Serializable.
>>
>> STriple, SQuad do not use SNode.
>>
>> SNode does not use any other serializable. It directly encoding it's own
>> format into the stream. This is intentional to avoid recursively invoking
>> Serializable
>>
>> Andy
>>
>>
>>> https://www.owasp.org/index.php/Deserialization_of_untrusted_data
>>>
>>> The problem is that it is possible to deserialize anything on the class
>>> path, not just the type you cast to in the end. If an evil object stream
>>> contains such collections then the damage might have already been done.
>>>
>>> And this could happen when an application is deserializing non-Jena
>>> objects
>>> but have Jena on the class path.
>>>
>>> It would be good checking if any other measures are needed. E.g. is it
>>> possible to do similar nasty things in the thrift stream (beyond just
>>> having massively large strings)? I would think not, but worth double
>>> checking.
>>>
>>> On 6 Nov 2016 8:17 pm, "Andy Seaborne (JIRA)" <[email protected]> wrote:
>>>
>>>>
>>>> [ https://issues.apache.org/jira/browse/JENA-1233?page=
>>>> com.atlassian.jira.plugin.system.issuetabpanels:comment-
>>>> tabpanel&focusedCommentId=15642341#comment-15642341 ]
>>>>
>>>> Andy Seaborne commented on JENA-1233:
>>>> -------------------------------------
>>>>
>>>> A plan:
>>>>
>>>> Java serialization has a mechanism to allow an object to be serialized by
>>>> another object. This plan uses that to put the serialization code into
>>>> ARQ.
>>>>
>>>> https://docs.oracle.com/javase/8/docs/platform/
>>>> serialization/spec/serialTOC.html
>>>>
>>>> We need to decouple {{Node}} and {{Triple}} serialization otherwise we
>>>> limit the possible serialization implementation to what is available to
>>>> jena-core.
>>>>
>>>> In order to make {{Node}} and {{Triple}} serializable, use
>>>> {{writeReplace}} and {{readResolve}} to produce a serializable wrapper.
>>>> These work by inserting a different object into the serialization stream.
>>>> The {{SNode}}/{{STriple}}/{{SQuad}} in the explicit wrapper.
>>>>
>>>> {{Node}}/{{Triple}}/{{Quad}} have a function called in {{writeReplace}}
>>>> injected so the serialization is not fixed. The binary form using Thrift
>>>> will be injected by ARQ when Jena initializes.
>>>>
>>>> Sketch (a better injection mechanism is needed to avoid cluttering the
>>>> API
>>>> of {{Node}}):
>>>> {noformat}
>>>> public abstract class Node implements Serializable {
>>>> // Returned Object must provide "readResolve()" that returns a Node.
>>>> public static Function<Node, Object> replacement = null ;
>>>> protected Object writeReplace() throws ObjectStreamException {
>>>> return replacement.apply(this);
>>>> }
>>>> ...
>>>> {noformat}
>>>> NB No "serialVersionUID" here - it is given in the replacement Object so
>>>> different serializations will not get mixed up.
>>>>
>>>> This means that jena-core, on its own without ARQ, does not support
>>>> serialization of {{Node}} and {{Triple}}. As running jena-core without
>>>> jena-arq is to be discouraged anyway, that is no bad thing. A string
>>>> based
>>>> form could be provided, but not supporting quad.
>>>>
>>>> Alt plan: have two injected functions for {{writeObject}} and
>>>> {{readObject}}, then the serialVersionUID comes from {{Node}} and is the
>>>> same for all implementations.
>>>>
>>>> I don't see sufficient advantage of this. It looks more like a "normal"
>>>> implementation, rather than the {{writeReplace}}/{{readResolve}} dance,
>>>> and does not create the short lived, intermediate object but with the
>>>> impact of same serialVersionUID across implementations.
>>>>
>>>>
>>>>> Make RDF primitives Serializable
>>>>> --------------------------------
>>>>>
>>>>> Key: JENA-1233
>>>>> URL: https://issues.apache.org/jira/browse/JENA-1233
>>>>> Project: Apache Jena
>>>>> Issue Type: Improvement
>>>>> Components: Elephas
>>>>> Affects Versions: Jena 3.1.0
>>>>> Reporter: Itsuki Toyota
>>>>>
>>>>> I always use Jena when I handle RDF data with Apache Spark.
>>>>> However, when I want to store resulting RDD data (ex. RDD[Triple]) in
>>>>
>>>> binary format, I can't call RDD.saveAsObjectFile method.
>>>>>
>>>>> It's because RDD.saveAsObjectFile requires java.io.Serializable
>>>>
>>>> interface.
>>>>>
>>>>> See the following code.
>>>>> https://github.com/apache/spark/blob/v1.6.0/core/src/
>>>>
>>>> main/scala/org/apache/spark/rdd/RDD.scala#L1469
>>>>>
>>>>> https://github.com/apache/spark/blob/v1.6.0/core/src/
>>>>
>>>> main/scala/org/apache/spark/util/Utils.scala#L79-L86
>>>>>
>>>>> You can see that
>>>>> 1) RDD.saveAsObjectFile calls Util.serialize method
>>>>> 2) Util.serialize method requires the RDD-wrapped object implementing
>>>>
>>>> java.io.Serializable interface. For example, if you want to save a
>>>> RDD[Triple] object, Triple must implements java.io.Serializable.
>>>>>
>>>>> So why not implement java.io.Serializable ?
>>>>> I think it will improve the usability in Apache Spark.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.3.4#6332)
>>>>
>>>
>>
>
>
>
> --
> Stian Soiland-Reyes
> http://orcid.org/0000-0001-9842-9718
--
Stian Soiland-Reyes
http://orcid.org/0000-0001-9842-9718