When I started writing it, I was not yet aware of zmq, so the connection
layer is just written using boost. It would be quite simple to replace
that with zmq; I just have not gotten around to it yet.
> i really don't see the need for people reinventing the low level rpc
> stuff over and over again.
I totally agree! What I like about protobufs, however, is that you can
write your service definition directly in the .proto file, which the
protocol buffer compiler parses for you, and you can access the parsed
definitions via protoc plugins, for example. With the plugins and the
protoc compiler nicely integrated into a cross-language build system,
having clients in different languages is not minimal effort, it is zero
effort. It also takes a little bit of code to support multiple services
on the same port, to list all services, to retrieve service definitions,
etc. Basically, I have just taken care of that part.
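For illustration, such a service definition in a .proto file might look
like this (the message and service names here are made up, not from my
actual code):

```proto
// greeter.proto -- hypothetical example (proto2 syntax)

package demo;

message HelloRequest {
  required string name = 1;
}

message HelloReply {
  required string greeting = 1;
}

// protoc parses this definition; a plugin can then generate client
// stubs and server skeletons for each target language from it.
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}
```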
> for rpc the comparison of speed is a bit of a moot point, since all
> the latency will be in the communication, not so much in the
> serialization, i suspect.
Yes, the communication of course adds latency, but I am talking about
throughput here. If you believe the comparison of speed is a bit of a
moot point, then you must be one of the fortunate people who never had
to use SOAP ;) You also need to properly hide the latency and make sure
you minimize copying data around in memory, etc. But I am certain
zeromq does a much better job there than my boost implementation ;) I
haven't benchmarked it yet, though.
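Hiding latency basically means keeping many requests in flight instead
of waiting for each reply before sending the next. A minimal sketch in
Python (the fake RPC call and its 10 ms "latency" are simulated, just
to show the effect):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_rpc(x):
    """Stand-in for a remote call: the sleep simulates network round-trip latency."""
    time.sleep(0.01)
    return x * x

requests = list(range(20))

# Sequential: every call pays the full round-trip latency.
start = time.time()
sequential = [fake_rpc(x) for x in requests]
t_sequential = time.time() - start

# Pipelined: many requests in flight at once, so the latencies overlap
# and throughput is no longer bounded by the round-trip time.
start = time.time()
with ThreadPoolExecutor(max_workers=20) as pool:
    pipelined = list(pool.map(fake_rpc, requests))
t_pipelined = time.time() - start

print(f"sequential: {t_sequential:.2f}s, pipelined: {t_pipelined:.2f}s")
```

The same idea applies whether the overlap comes from threads,
asynchronous clients, or zmq's pipelining.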
> but once you communicate using protobuf it also becomes really
> tempting to store in hadoop using protobuf instead of
> writables/sequencefiles, and from what i have heard (i have not tested
> this myself) it is a good deal slower in that situation.
What exactly do you mean by just using protobuf instead of
writables/sequencefiles? I.e., assuming you just use some
ProtobufToWritable adapter, I don't see how that would be much slower
than using writables; writables and protobufs really do the same job,
do they not? Except that protobufs are available in other languages,
are defined via the proto language, etc. Whether you use writables or
protobufs, you can most likely serialize faster than you can write to
disk or to the network. At least that is my feeling so far from using
protobufs to store stuff in hbase or raw hfiles, though I have to
admit I have not properly benchmarked this. What kind of file format
would you write serialized protobufs to that would make it so slow? I
guess in the end, one just needs to benchmark everything :)
TL;DR
.proto files plus protocol buffer plugins for generating RPC clients
and servers are really handy. Whether writables or protobufs are faster
needs to be benchmarked, but probably both serialize faster than one
can write to disk or network.
On 23.09.2011 15:40, Koert Kuipers wrote:
did you build it on top of zmq? i really don't see the need for people
reinventing the low level rpc stuff over and over again. zmq comes
with baked in request-response, pub-sub, and pipeline (distributed
processing) communication. once you rely on protobuf + zmq for the rpc
it is trivial to add clients in other languages, i had java, R and
python talking to each other with minimal effort.
for rpc the comparison of speed is a bit of a moot point, since all
the latency will be in the communication, not so much in the
serialization, i suspect. but once you communicate using protobuf it
also becomes really tempting to store in hadoop using protobuf instead
of writables/sequencefiles, and from what i have heard (i have not
tested this myself) it is a good deal slower in that situation.
On Fri, Sep 23, 2011 at 8:14 AM, Stephan Gammeter
<gamme...@vision.ee.ethz.ch> wrote:
I don't think protobufs are actually slower than writables; they do
really well in speed. I actually wrote some RPC code in C++ for
protocol buffers and some swig wrappers to have clients in Java. A
simple C++ server can easily handle about 20k qps in that setup, and
this is just with a naive implementation where some excess data
copies still happen during the processing of requests. If I have time
I would like to open-source it, but I would need some help to get it
running properly in other languages, so that it can be truly
cross-language. (Right now servers are only supported in C++; clients
are synchronous and asynchronous in C++; in Java only synchronous
clients are supported.)
On 21.09.2011 22:59, Koert Kuipers wrote:
i would love an IDL, plus modern serialization frameworks such as
protobuf/thrift support versioning (although i still have issues with
different versions of thrift not working nicely together, argh, why is
that). the only downside is perhaps that they are a little slower than
writables.
On Wed, Sep 21, 2011 at 3:12 AM, Uma Maheswara Rao G 72686
<mahesw...@huawei.com> wrote:
Hadoop has its own RPC mechanism, mainly Writables, to overcome
some of the disadvantages of normal serialization.
For more info:
http://www.lexemetech.com/2008/07/rpc-and-serialization-with-hadoop.html
Regards,
Uma
----- Original Message -----
From: jie_zhou <jie_z...@xa.allyes.com>
Date: Wednesday, September 21, 2011 12:12 pm
Subject: A question about RPC
To: hdfs-user@hadoop.apache.org
> Dear:
>
> Nice to meet you!
>
> I am a beginner with Hadoop. Recently I have been reading the source
> of Hadoop's RPC, but now I have a question. As we know, Hadoop RPC
> makes use of the dynamic proxy mechanism, but
>
> why not use an IDL such as CORBA, or AIDL of Android?
>
> Thanks for your early reply.
>
> Best Regards,
>
> jie