On 30/06/14 14:07, Rob Vesse wrote:
Setup and code?

https://github.com/afs/rdf-thrift

(Caution: I have since swapped the encoding scheme to see whether a different one is better or worse, and I haven't rerun the timing tests.)

There are a couple of scripts: rdf2thrift (writes Thrift) and thrift2rdf (reads Thrift).

In theory, calling LangThrift.init() now wires it into RIOT, but I ran out of time to test that properly.

I don't know what the writing speed is yet. It should be much better than for string-based formats such as N-Triples.

        Andy


I'd be interested in seeing how the internal binary RDF stuff we have
compares.

Rob

On 21/06/2014 22:19, "Andy Seaborne" <[email protected]> wrote:

First-pass results for parsing from a file to a null sink, with no tuning
or profiling. Jena Java-level Triple objects and all nodes are created.
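
For readers who want to reproduce the shape of this measurement without Jena, here is a minimal Python sketch of a "read to a null sink and report TPS" loop. It is not the harness used for the figures in this message: it only counts N-Triples statement lines rather than parsing them into Triple objects, and the function names and the default 128K buffer size are assumptions made for illustration.

```python
import gzip
import io
import time


def count_statements(path, buffer_size=128 * 1024):
    """Count non-blank, non-comment lines in an N-Triples file.

    Stand-in for a real parser: N-Triples puts one statement per line,
    so counting lines approximates counting triples. Files ending in
    ".gz" are decompressed on the fly.
    """
    if path.endswith(".gz"):
        raw = gzip.open(path, "rb")
    else:
        raw = open(path, "rb", buffering=0)  # unbuffered; we add our own buffer
    count = 0
    # The explicit BufferedReader is where the IO buffer size is chosen.
    with io.BufferedReader(raw, buffer_size=buffer_size) as reader:
        for line in reader:
            stripped = line.strip()
            if stripped and not stripped.startswith(b"#"):
                count += 1  # "null sink": the statement is counted and dropped
    return count


def triples_per_second(path, buffer_size=128 * 1024):
    """Return (statement count, statements per second) for one pass over the file."""
    start = time.perf_counter()
    count = count_statements(path, buffer_size)
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate
```

Because nothing is parsed, absolute rates from this sketch are not comparable with the RIOT or RDF Thrift numbers; it only illustrates how a TPS figure is derived from a count and an elapsed time.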

RIOT (128K IO buffer)
bsbm-25m.nt.gz : 127,082 Triples per second (TPS)
bsbm-25m.nt:     133,104 TPS

RDF Thrift (32K IO buffer)
bsbm-25m.rt:     357,101 TPS  x2.8
bsbm-25m.rt.gz:  390,578 TPS  x2.9

RDF Thrift (128K IO buffer)
bsbm-25m.rt:     409,788 TPS  x3.2
bsbm-25m.rt.gz:  389,969 TPS  x2.9

and best
gzip -d bsbm-25m.rt.gz | thrift2rdf (128K IO buffer)
   490,138 TPS
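
The multipliers can be rechecked from the posted rates. The sketch below is an illustration only: the dictionary labels and function names are made up here, and which RIOT figure each posted multiplier was measured against is not stated in this message, so both baselines are computed.

```python
# Parse rates in triples per second (TPS), copied from the figures above.
RIOT_TPS = {
    "bsbm-25m.nt.gz": 127_082,
    "bsbm-25m.nt": 133_104,
}
THRIFT_TPS = {
    "rt, 32K buffer": 357_101,
    "rt.gz, 32K buffer": 390_578,
    "rt, 128K buffer": 409_788,
    "rt.gz, 128K buffer": 389_969,
    "gzip -d pipe, 128K buffer": 490_138,
}


def speedup(tps, baseline_tps):
    """Return how many times faster `tps` is than `baseline_tps`."""
    return tps / baseline_tps


def speedup_table(baseline_tps):
    """Speedup of every RDF Thrift run against one baseline, to one decimal place."""
    return {
        label: round(speedup(tps, baseline_tps), 1)
        for label, tps in THRIFT_TPS.items()
    }


if __name__ == "__main__":
    for name, baseline in RIOT_TPS.items():
        print(name, speedup_table(baseline))
```

Against the 127,082 TPS baseline the two uncompressed .rt runs come out at 2.8 and 3.2, matching the posted multipliers. The two .rt.gz runs come out at 2.9 only against the 133,104 TPS baseline, so the posted multipliers may mix baselines.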

File sizes:
bsbm-25m.nt:     6,505,289,318 bytes (6.1G)
bsbm-25m.nt.gz:    691,429,780 bytes (660M)

bsbm-25m.rt:     6,684,543,995 bytes (6.3G)
bsbm-25m.rt.gz:    700,639,242 bytes (669M)
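
The size figures can be compared the same way. This is a small sketch over the byte counts listed above; the function names are invented for the example.

```python
# File sizes in bytes, copied from the listing above.
SIZES = {
    "bsbm-25m.nt": 6_505_289_318,
    "bsbm-25m.nt.gz": 691_429_780,
    "bsbm-25m.rt": 6_684_543_995,
    "bsbm-25m.rt.gz": 700_639_242,
}


def compression_ratio(name):
    """Uncompressed size divided by gzipped size for one file."""
    return SIZES[name] / SIZES[name + ".gz"]


def overhead_percent(thrift_name, ntriples_name):
    """How much larger the Thrift file is than the N-Triples file, in percent."""
    return (SIZES[thrift_name] / SIZES[ntriples_name] - 1.0) * 100.0


if __name__ == "__main__":
    for name in ("bsbm-25m.nt", "bsbm-25m.rt"):
        print(f"{name}: gzip ratio {compression_ratio(name):.2f}")
    print(f"rt vs nt:       {overhead_percent('bsbm-25m.rt', 'bsbm-25m.nt'):.2f}% larger")
    print(f"rt.gz vs nt.gz: {overhead_percent('bsbm-25m.rt.gz', 'bsbm-25m.nt.gz'):.2f}% larger")
```

On these numbers gzip shrinks both formats by a factor of roughly 9.4 to 9.5, and the Thrift file is about 2.8% larger than N-Triples uncompressed and about 1.3% larger when gzipped. The speed difference therefore does not come from reading fewer bytes.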

        Andy




