[ 
https://issues.apache.org/jira/browse/JENA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418752#comment-17418752
 ] 

Andy Seaborne commented on JENA-2167:
-------------------------------------

Some initial figures.

Parsing BSBM 25 million (which is large enough to get stable timing figures 
after warm up):

Thrift: 1 million triples per second (TPS)
Protobuf: 918k TPS
N-Triples: 245k TPS

The Thrift rate is faster than the last time I ran it: same hardware, same 
code, newer Java (this is Java 17-ea).

Suspicion: Protobuf is slightly slower because Protobuf does not provide 
length-delimited objects, whereas the Thrift encoding is self-contained. The 
encoding of a graph writes triples in a streaming fashion, each triple a 
Protobuf message. The Protobuf way is to add a block length into the stream, 
and the extra decoding of this is slightly inefficient (it creates two Java 
objects per triple, rather than reusing existing objects).
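
To illustrate what "add a block length into the stream" means, here is a 
minimal sketch of length-prefixed framing in plain Java. It is not the Jena 
code and does not use the Protobuf library: the class and method names 
(LengthDelimitedSketch, writeDelimited, readDelimited, roundTrip) are 
hypothetical, and strings stand in for encoded triple messages. In 
protobuf-java the equivalent calls are writeDelimitedTo and 
parseDelimitedFrom, which use a varint length prefix in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of length-delimited framing; not Jena's implementation.
class LengthDelimitedSketch {

    // Write the payload length as a base-128 varint, then the payload bytes.
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int len = payload.length;
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80);   // low 7 bits, continuation bit set
            len >>>= 7;
        }
        out.write(len);                        // final byte, continuation bit clear
        out.write(payload);
    }

    // Read one framed payload; returns null at a clean end of stream.
    static byte[] readDelimited(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) return null;
        int len = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            if (shift > 28) throw new IOException("length prefix too long");
            b = in.read();
            if (b < 0) throw new EOFException("truncated length prefix");
            len |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (len < 0) throw new IOException("invalid length prefix");
        // A fresh array is allocated for every frame in this sketch.
        byte[] payload = in.readNBytes(len);
        if (payload.length != len) throw new EOFException("truncated payload");
        return payload;
    }

    // Frame each string into one stream, then read the frames back in order.
    static List<String> roundTrip(List<String> items) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (String s : items) {
                writeDelimited(out, s.getBytes(StandardCharsets.UTF_8));
            }
            InputStream in = new ByteArrayInputStream(out.toByteArray());
            List<String> result = new ArrayList<>();
            byte[] p;
            while ((p = readDelimited(in)) != null) {
                result.add(new String(p, StandardCharsets.UTF_8));
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> triples = List.of("<s1> <p> <o1> .", "<s2> <p> \"x\" .");
        System.out.println(roundTrip(triples).equals(triples));
    }
}
```

The reader has to decode the length and set up a bounded view of the payload 
before it can decode the message itself. The sketch only shows that this is 
an extra step per triple; it says nothing about how large the cost is.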


> Provide an RDF Binary format using Protobuf
> -------------------------------------------
>
>                 Key: JENA-2167
>                 URL: https://issues.apache.org/jira/browse/JENA-2167
>             Project: Apache Jena
>          Issue Type: New Feature
>    Affects Versions: Jena 4.2.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Major
>
> To go alongside the RDF Thrift encoding.
> Sometimes, apps want protobuf encoded RDF, e.g. for use with gRPC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
