[ https://issues.apache.org/jira/browse/HTRACE-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523633#comment-14523633 ]

Colin Patrick McCabe commented on HTRACE-164:
---------------------------------------------
Part of the motivation for using msgpack here is that we can do incremental
serialization. That is, we can send an array of 1000 spans without having to do
a "big bang" serialization where we translate them all from bytes -> objects at
once. We can handle them one at a time in a streaming fashion. This should
especially help alleviate GC pressure in the Java client, since our Span
objects will stay short-lived.
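
To illustrate the streaming idea, here is a minimal sketch of the decoding side in Python (chosen only for brevity; this is not HTrace code). It assumes each span is reduced to a short description string, and the helper names (read_array_header, read_str, iter_spans, encode_spans) are made up for this example. Only the tag bytes come from the msgpack format spec.

```python
import io
import struct


def read_array_header(stream):
    """Read a msgpack array header and return the element count."""
    tag = stream.read(1)[0]
    if 0x90 <= tag <= 0x9F:      # fixarray: count is in the low 4 bits
        return tag & 0x0F
    if tag == 0xDC:              # array 16: big-endian uint16 count
        return struct.unpack(">H", stream.read(2))[0]
    if tag == 0xDD:              # array 32: big-endian uint32 count
        return struct.unpack(">I", stream.read(4))[0]
    raise ValueError("not a msgpack array: 0x%02x" % tag)


def read_str(stream):
    """Read one msgpack string (fixstr or str 8 only, for this sketch)."""
    tag = stream.read(1)[0]
    if 0xA0 <= tag <= 0xBF:      # fixstr: length is in the low 5 bits
        length = tag & 0x1F
    elif tag == 0xD9:            # str 8: one length byte follows
        length = stream.read(1)[0]
    else:
        raise ValueError("unsupported string tag: 0x%02x" % tag)
    return stream.read(length).decode("utf-8")


def iter_spans(stream):
    """Yield spans one at a time instead of building a list of all of them."""
    count = read_array_header(stream)
    for _ in range(count):
        yield read_str(stream)


def encode_spans(descriptions):
    """Hypothetical sender: array 16 header followed by fixstr elements."""
    out = bytearray(b"\xdc" + struct.pack(">H", len(descriptions)))
    for text in descriptions:
        raw = text.encode("utf-8")
        if len(raw) >= 32:
            raise ValueError("fixstr holds at most 31 bytes")
        out.append(0xA0 | len(raw))
        out += raw
    return bytes(out)


if __name__ == "__main__":
    payload = encode_spans(["span-%d" % i for i in range(1000)])
    for span in iter_spans(io.BytesIO(payload)):
        pass  # each span can be handled and dropped before the next is read
```

Because iter_spans is a generator, only one decoded span is alive at a time, which is the property that should keep objects short-lived on a garbage-collected client.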

The protobuf native library has been troublesome in Hadoop. We had a lot of
pain going from 2.4.1 -> 2.5.0, since the two versions are incompatible (i.e.
they can't both be installed or else things will fail mysteriously). Since most
of the potential clients of our C library (e.g. Impala) use protobuf, this
would continue to be a sore point if we went that direction. And then there's
the fact that Google removed 2.4.1 from the downloads page. In contrast, with
msgpack there is a very simple and small MIT-licensed C library that's just two
files, which we can easily include in the C client, avoiding all the hassle of
shared libraries. Google's protobuf implementation is also written in C++,
whereas our client is C. There is a C protobuf implementation, but it's not
officially supported by Google.

I would also prefer to keep the canonical span definitions in span.go,
Span.java, and span.c rather than having to copy back and forth between
auto-generated structures and internal structures, as we'd have to do with
protobuf. msgpack allows this.
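
As a rough sketch of what "keeping the canonical definition" could look like, the hand-written span type below writes itself straight to msgpack, with no generated message class in between. It is in Python for brevity and is not the HTrace wire format: the field names, the one-letter map keys, and the helpers pack_str and pack_uint64 are all assumptions made for this example.

```python
import struct
from dataclasses import dataclass


def pack_str(text):
    """Encode a string as msgpack fixstr or str 8 (enough for this sketch)."""
    raw = text.encode("utf-8")
    if len(raw) < 32:
        return bytes([0xA0 | len(raw)]) + raw       # fixstr
    if len(raw) < 256:
        return b"\xd9" + bytes([len(raw)]) + raw     # str 8
    raise ValueError("string too long for this sketch")


def pack_uint64(value):
    """Encode an unsigned integer as msgpack uint 64 (tag 0xcf)."""
    return b"\xcf" + struct.pack(">Q", value)


@dataclass
class Span:
    # Hypothetical fields; the real definitions live in span.go, Span.java
    # and span.c.
    span_id: int
    begin_ms: int
    end_ms: int
    description: str

    def to_msgpack(self):
        """Serialize directly from the fields as a 4-entry fixmap (0x84)."""
        return b"".join([
            b"\x84",
            pack_str("s"), pack_uint64(self.span_id),
            pack_str("b"), pack_uint64(self.begin_ms),
            pack_str("e"), pack_uint64(self.end_ms),
            pack_str("d"), pack_str(self.description),
        ])


if __name__ == "__main__":
    span = Span(span_id=1, begin_ms=10, end_ms=20, description="get")
    print(span.to_msgpack().hex())
```

The point is only that the internal type is the single definition; adding a field means touching this one class rather than a schema file plus the code that copies between generated and internal structures.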

I think in general our strategy for supporting Python, Ruby, Node, Rust, etc.
will be to wrap the C library. Frankly, this will give better performance than
porting to those languages, and reduce the amount of code we need to maintain.
Even excluding RPC, which is a pretty small part of the client, there is a fair
amount of client code that we probably want to avoid rewriting. [~abec] has
been talking about creating a Python client. This might be interesting for
something like OpenStack.

> htrace hrpc: use msgpack for serialization
> ------------------------------------------
>
> Key: HTRACE-164
> URL: https://issues.apache.org/jira/browse/HTRACE-164
> Project: HTrace
> Issue Type: Bug
> Affects Versions: 3.2.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
>
> htrace HRPC should use msgpack for serialization. Messages serialized using
> msgpack use less space on the wire and use less CPU time to encode. The CMP
> library allows us to include msgpack support easily in the htrace C client.
> There is also good Java and Golang support available.