[ 
https://issues.apache.org/jira/browse/HTRACE-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523633#comment-14523633
 ] 

Colin Patrick McCabe commented on HTRACE-164:
---------------------------------------------

Part of the motivation for using msgpack here is that we can serialize and 
deserialize incrementally.  That is, we can send an array of 1000 spans without 
a "big bang" step that translates all of them between bytes and objects at 
once.  Instead, we can handle one span at a time in a streaming fashion.  This 
will help alleviate GC pressure, especially in the Java client, since our Span 
objects will stay short-lived.

The protobuf native library has been troublesome in Hadoop.  We had a lot of 
pain going from 2.4.1 -> 2.5.0, since the two versions are incompatible (i.e. 
they can't both be installed, or else things will fail mysteriously).  Since 
most of the potential clients of our C library use protobuf (e.g. Impala), this 
would continue to be a sore point if we went in that direction.  And then 
there's the fact that Google removed 2.4.1 from the downloads page.  In 
contrast, with msgpack there is a very simple and small MIT-licensed C library 
that's just two files, which we can easily include in the C client, avoiding 
all the hassle of shared libraries.  The Google protobuf implementation is also 
C++, whereas our client is written in C.  There is a C protobuf implementation, 
but it's not officially supported by Google.

I would also prefer to keep the canonical span definitions in span.go, 
Span.java, and span.c rather than having to copy back and forth between 
auto-generated structures and internal structures, as we'd have to do with 
protobuf.  msgpack allows this.

I think in general our strategy for Python, Ruby, Node, Rust, etc. support will 
be to wrap the C library.  Frankly, this will give better performance than 
porting the client to those languages, and it will reduce the amount of code we 
need to maintain.  Even excluding RPC, which is a pretty small part of the 
client, there is a fair amount of client code that we probably want to avoid 
rewriting.  [~abec] has been talking about creating a Python client.  This 
might be interesting for something like OpenStack.

> htrace hrpc: use msgpack for serialization
> ------------------------------------------
>
>                 Key: HTRACE-164
>                 URL: https://issues.apache.org/jira/browse/HTRACE-164
>             Project: HTrace
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> htrace HRPC should use msgpack for serialization.  Messages serialized using 
> msgpack use less space on the wire and use less CPU time to encode.  The CMP 
> library allows us to include msgpack support easily in the htrace C client.  
> There is also good Java and Golang support available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
