Re: RDF Thrift for Jena

Stian Soiland-Reyes Mon, 01 Sep 2014 11:58:21 -0700

Sounds proper enough :) With a binary format one obviously has to be very
careful about any changes, but I was thinking more of the versioning of the
Apache Thrift API that your module would use through its dependencies.


If I were to use Jena 1.14.0 depending on Apache Thrift, say, 0.6.0, but
also depended on (something that depends on) a newer Apache Thrift 0.9.0,
has that project committed to semantic versioning so that this would still
work, at least in theory? E.g. not deleting or breaking existing API
signatures (adding is OK).
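
To make the question concrete, here is a rough sketch of the check I have in
mind. It is only an illustration: the class and method names are made up, it
has nothing to do with Thrift's or Jena's actual code, and it assumes plain
MAJOR.MINOR.PATCH version strings with no suffixes. Note that under semantic
versioning a 0.x series promises nothing about compatibility, so the sketch
only accepts an exact match there.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class SemVerSketch {

    /** Parses "MAJOR.MINOR.PATCH" into three ints; suffixes are not handled. */
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** True if code built against 'required' should work with 'provided'. */
    static boolean satisfies(String required, String provided) {
        int[] r = parse(required);
        int[] p = parse(provided);
        if (r[0] != p[0]) {
            return false;                 // different major: may be breaking
        }
        if (r[0] == 0) {
            return Arrays.equals(r, p);   // 0.x: no guarantees, exact match only
        }
        if (p[1] != r[1]) {
            return p[1] > r[1];           // newer minor only adds API
        }
        return p[2] >= r[2];              // same minor: need at least the same patch
    }

    public static void main(String[] args) {
        System.out.println(satisfies("0.6.0", "0.9.0"));
        System.out.println(satisfies("1.2.0", "1.4.1"));
    }
}
```

So by this reading, building against 0.6.0 and running with 0.9.0 is not
something semantic versioning alone would cover.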

In theory it should not make anything fall over unless you tried to use the
Jena Thrift serialization, but that depends on how it is wired in. In RIOT
the standard language serializers are hardcoded somewhere, right?

On 1 Sep 2014 09:35, "Andy Seaborne" <[email protected]> wrote:

> On 31/08/14 19:03, Stian Soiland-Reyes wrote:
>
>> How have you tested this for IRIs and international characters in
>> literals?
>> Sorry, I am out travelling and have not checked the code yet. :)
>>
>
> Yes.
>
> Thrift encodes strings as UTF-8.
>
> The wire form of an IRI is a tagged string:
> http://afs.github.io/rdf-thrift/rdf-binary-thrift.html
>
> struct RDF_IRI {
> 1: required string iri
> }
>
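
As a quick sanity check of the UTF-8 point, a small round trip along these
lines could be used. This is only a sketch: it does not use the Thrift
library, the class and method names are invented, and the length-prefixed
layout is a simplification chosen for the example rather than a description
of any particular Thrift protocol.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a string field; not Thrift-generated code.
final class Utf8RoundTripSketch {

    /** Encodes a string as a 4-byte length followed by its UTF-8 bytes. */
    static byte[] encode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + utf8.length)
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    /** Reads back a string written by encode(). */
    static String decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] utf8 = new byte[buf.getInt()];
        buf.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // An IRI with non-ASCII characters, written with escapes so the
        // source file encoding does not matter.
        String iri = "http://example.org/r\u00e9sum\u00e9/\u65e5\u672c\u8a9e";
        System.out.println(iri.equals(decode(encode(iri))));
    }
}
```

The same kind of test with real IRIs and literals through the actual Thrift
encoder would be the thing to look for in the test suite.
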
>> The new dependency on Apache Thrift would be my main concern if this is
>> not in a separate module. How stable are Thrift APIs? E.g. do they follow
>> semantic versioning so that a Jena build will work with a newer Thrift
>> version (with same major)?
>>
>
> Stronger than that - Thrift cares a lot about wire/storage format
> compatibility because of the large scale of deployments in which it's used.
>
> A system wide, cross-language change of format simply isn't practical. It
> would have to be a parallel evolution.
>
> See their discussion of adding the union type - on the wire it's a struct
> of one element (i.e. each element is 'optional') and union-ness is provided
> by the encode/decode.  Old implementations that are not aware of union
> still work.
>
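
The union point can be pictured roughly as follows. This is a toy model and
not Thrift code: the field ids, the map standing in for the wire struct, and
the method names are all assumptions made for the example. The data is the
same in both cases; only the union-aware reader adds the "exactly one field"
check.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model: a "wire struct" is a map from field id to value,
// where every field is optional. Hypothetical names throughout.
final class UnionAsStructSketch {

    static final int IRI = 1;
    static final int BNODE = 2;
    static final int LITERAL = 3;

    /** Union-unaware reader: just looks up an optional field (null if absent). */
    static String readAsStruct(Map<Integer, String> fields, int fieldId) {
        return fields.get(fieldId);
    }

    /** Union-aware reader: same data, plus the check that exactly one field is set. */
    static Map.Entry<Integer, String> readAsUnion(Map<Integer, String> fields) {
        if (fields.size() != 1) {
            throw new IllegalArgumentException(
                    "A union must have exactly one field set, found " + fields.size());
        }
        return fields.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<Integer, String> wire = new LinkedHashMap<>();
        wire.put(IRI, "http://example.org/a");

        System.out.println(readAsStruct(wire, IRI));     // old reader still works
        System.out.println(readAsUnion(wire).getKey());  // new reader sees which field is set
    }
}
```
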
> What is open (but closing) is whether the RDF encoding is the right one.
> Evidence from real use is always going to be valuable.
>
>         Andy
>
>> On 31 Aug 2014 15:37, "Andy Seaborne" <[email protected]> wrote:
>>
>>> On 26/08/14 21:20, Andy Seaborne wrote:
>>>
>>>> I've been working on a binary format for RDF and SPARQL result sets:
>>>>
>>>> http://afs.github.io/rdf-thrift/
>>>>
>>>> This is now ready to go if everyone is OK with that.
>>>>
>>>> I'm flagging this up for passive consensus because it adds a new
>>>> dependency (for Apache Thrift).
>>>>
>>>> And of course any questions or comments.
>>>>
>>>> Summary, as an RDF syntax:
>>>>
>>>> + x3 faster to parse than N-Triples
>>>> + same size as N-Triples, and same compression effects with gzip (x8-10
>>>> compression).
>>>> + Not much additional work to add because Thrift does most of the work.
>>>>
>>>>       Andy
>>>>
>>>>
>>> Migration done (JENA-774).  Some cleaning up to do (putting classes in
>>> more logical places mostly) but tests in and passing.
>>>
>>>          Andy
>>>
>>>
>>>
>>
>
