[ 
https://issues.apache.org/jira/browse/TINKERPOP-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667028#comment-16667028
 ] 

Jorge Bay commented on TINKERPOP-1942:
--------------------------------------

I've just pushed a proof of concept of the new binary serialization, focused on 
deserialization, along with some benchmarks.

{code:java}
Benchmark                                         Mode  Cnt        Score        Error  Units
SerializationBenchmark.testReadMessage1Binary    thrpt   20  4223528.852 ± 101080.280  ops/s
SerializationBenchmark.testReadMessage1GraphSON  thrpt   20    36766.877 ±   2057.289  ops/s
SerializationBenchmark.testReadMessage2Binary    thrpt   20   806403.628 ±  27573.698  ops/s
SerializationBenchmark.testReadMessage2GraphSON  thrpt   20    26880.121 ±    420.503  ops/s
{code}

As you can see, the difference is significant: between 1 and 2 orders of 
magnitude (roughly 115x for message 1 and 30x for message 2). Note that this 
covers only deserializing the request; the difference will be amplified because 
serialization and deserialization are performed on both the client side and the 
server side.

Also note that the current JSON-based format is limiting throughput to around 
36K ops/s on a single core (YRMV) for the simplest of traversals, just by 
deserializing (!).

The main differentiator between the binary format and the JSON-based format 
is the computational complexity:
 * Forward-only index-based reading in the new format vs open/close nested 
trees in JSON
 * Direct conversion of binary to object/type vs binary -> utf-8 string -> 
object.
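
To illustrate the two read paths listed above, here is a minimal sketch. It 
is not the proof-of-concept code and not the actual wire format: the class 
name, the type code and the one-byte-type-plus-eight-byte-value layout are 
assumptions made up for this example, and the text reader is a deliberately 
simplified stand-in for a real JSON parser. It only shows that the binary 
path reads fixed-size fields in order, while the text path has to decode a 
string, scan for delimiters and parse digits.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ReadPathSketch {

    // Hypothetical type code, chosen for this example only.
    static final byte TYPE_LONG = 0x02;

    // Hypothetical binary layout: [1-byte type code][8-byte big-endian long].
    static byte[] writeBinary(long value) {
        return ByteBuffer.allocate(9).put(TYPE_LONG).putLong(value).array();
    }

    // Forward-only read: consume the type code, then the fixed-size value.
    // No intermediate string and no delimiter scanning.
    static long readBinary(byte[] payload) {
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        byte type = buffer.get();
        if (type != TYPE_LONG) {
            throw new IllegalArgumentException("unexpected type code " + type);
        }
        return buffer.getLong();
    }

    // GraphSON-style typed value encoded as UTF-8 text.
    static byte[] writeText(long value) {
        String json = "{\"@type\":\"g:Int64\",\"@value\":" + value + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Text route: bytes -> UTF-8 string -> locate the value -> parse digits.
    // A real JSON parser does more work than this (nesting, escapes, type lookup).
    static long readText(byte[] payload) {
        String json = new String(payload, StandardCharsets.UTF_8);
        String key = "\"@value\":";
        int start = json.indexOf(key) + key.length();
        int end = json.indexOf('}', start);
        return Long.parseLong(json.substring(start, end).trim());
    }

    public static void main(String[] args) {
        long value = 1234567890123L;
        System.out.println(readBinary(writeBinary(value)));
        System.out.println(readText(writeText(value)));
    }
}
```

Both readers return the same value; the difference is in how many 
intermediate steps and allocations each one needs to get there.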

Besides possible improvements we could make to GraphSON on the server/clients 
(if any), the complexity of GraphSON serialization can't be reduced.

I think this is something we should tackle sooner rather than later, to prevent 
further investment in GraphSON from us and providers.

> Binary serialization format
> ---------------------------
>
>                 Key: TINKERPOP-1942
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1942
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: io
>            Reporter: Jorge Bay
>            Priority: Major
>
> We should provide a binary serialization format designed to reduce 
> serialization overhead and minimize the size of the payload that is 
> transmitted over the wire.
> It could be implemented in a very similar way as Kryo support but with 
> interoperability in mind and ultimately we could fade Gryo out, as now with 
> the GLVs it doesn't have a role to play.
> The main benefit would be the performance improvement, making serialization 
> and deserialization processing time negligible on both the server and the 
> client.
> Background: 
> https://lists.apache.org/thread.html/13e70235591853801bab16ed457ee4f56f3dfe2d1c5817c34a036408@%3Cdev.tinkerpop.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
