Andy
I just tried to explain that the problem is not the number of
connections but the implementation of the RPC itself.
As for the RPC options you mentioned, I would probably choose a pure
Java implementation. It should be much easier to learn and debug, I think.
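
As a rough illustration of the multiplexing idea from the quoted
message below, here is a minimal sketch. It is not Hadoop RPC, HBase,
or Thrift code: the class names (MultiplexedRpcSketch, Connection,
Frame), the in-memory queues standing in for a TCP connection, and the
echo server are all assumptions made up for the example. Each call gets
an id, the "connection" is occupied only while the request is sent, and
a reader thread routes each response to the caller waiting on that id.

```java
import java.util.Map;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class MultiplexedRpcSketch {
    /** One message on the simulated wire: a call id plus a payload. */
    static final class Frame {
        final long id; final String payload;
        Frame(long id, String payload) { this.id = id; this.payload = payload; }
    }

    interface Loop { void run() throws Exception; }

    /** Hypothetical client plus in-process fake server sharing one "connection". */
    static final class Connection implements AutoCloseable {
        private final BlockingQueue<Frame> toServer = new LinkedBlockingQueue<>();
        private final BlockingQueue<Frame> fromServer = new LinkedBlockingQueue<>();
        private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
        private final AtomicLong nextId = new AtomicLong();
        private final ExecutorService handlers = Executors.newCachedThreadPool();
        private final Thread acceptor;
        private final Thread reader;

        Connection(long handleMillis) {
            // Fake server: hand each incoming request to its own handler thread.
            acceptor = startDaemon(() -> {
                while (true) {
                    Frame req = toServer.take();
                    handlers.submit(() -> {
                        Thread.sleep(handleMillis); // simulated request handling
                        fromServer.put(new Frame(req.id, "echo:" + req.payload));
                        return null;
                    });
                }
            });
            // Client-side reader: route each response to the caller waiting on its id.
            reader = startDaemon(() -> {
                while (true) {
                    Frame resp = fromServer.take();
                    CompletableFuture<String> waiter = pending.remove(resp.id);
                    if (waiter != null) waiter.complete(resp.payload);
                }
            });
        }

        /** Sends a request and blocks only the calling thread, not the connection. */
        String call(String request) throws Exception {
            long id = nextId.incrementAndGet();
            CompletableFuture<String> reply = new CompletableFuture<>();
            pending.put(id, reply);               // register before sending
            toServer.put(new Frame(id, request)); // connection is busy only for the send
            try {
                return reply.get(5, TimeUnit.SECONDS);
            } finally {
                pending.remove(id);               // clean up if the call timed out
            }
        }

        @Override public void close() {
            acceptor.interrupt();
            reader.interrupt();
            handlers.shutdownNow();
        }

        private static Thread startDaemon(Loop loop) {
            Thread t = new Thread(() -> {
                try { loop.run(); } catch (Exception e) { /* sketch: stop on any failure */ }
            });
            t.setDaemon(true);
            t.start();
            return t;
        }
    }

    /** Runs C1 and C2 concurrently over one connection; returns elapsed milliseconds. */
    static long timeTwoConcurrentCalls(long handleMillis) throws Exception {
        ExecutorService clients = Executors.newFixedThreadPool(2);
        try (Connection conn = new Connection(handleMillis)) {
            long start = System.nanoTime();
            Future<String> c1 = clients.submit(() -> conn.call("C1"));
            Future<String> c2 = clients.submit(() -> conn.call("C2"));
            c1.get();
            c2.get();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            clients.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("two calls took " + timeTwoConcurrentCalls(300) + " ms");
    }
}
```

With a 300 ms handling time, the two calls should finish in roughly
the time of one call rather than two, because C2 does not wait for
C1's response before sending. On a real socket the send would need a
short lock so that frames from different threads do not interleave.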
M.
On Wed, Dec 24, 2008 at 10:49 PM, Andrew Purtell <[email protected]> wrote:
> Hello Michael,
>
> What you are describing -- if accurate -- is a limitation of
> Hadoop RPC. You should take these issues to Hadoop Core,
> core-user@ and so on.
>
> Regarding HBase in particular, in 2009 as part of HBASE-1015
> (https://issues.apache.org/jira/browse/HBASE-1015), the HBase
> master, regionserver, and clients will need a high
> performance platform neutral RPC. Hadoop RPC is not suitable
> for that, so we/I will need to find another. Thrift is one
> possibility. AFAIK, Bryan Duxbury has been working to improve
> the concurrency of the server and minimize PDU sizes. There
> are other options also, such as Google protobufs.
>
> - Andy
>
>> From: Michael Dagaev <[email protected]>
>> Subject: Re: Question on Single Connection Limitation
>> To: [email protected]
>> Date: Wednesday, December 24, 2008, 12:41 PM
>> Ok. Let's consider a multithreaded client and server.
>>
>> Two client threads, C1 and C2, call the server over RPC.
>> What happens?
>>
>> 0:00 Client thread C1 grabs the connection and sends a request
>>      over it
>> 0:01 Client thread C2 arrives and waits for the connection
>> 0:02 Server thread S1 receives the request and handles it
>>
>> Client thread C1 is waiting for the response while server thread
>> S1 is handling the request.
>> Client thread C2 is waiting for the connection.
>>
>> 0:05 Server thread S1 finishes handling the request and sends the
>>      response to C1
>> 0:06 C1 receives the response and releases the connection
>> 0:06 Once C1 releases the connection, C2 grabs it and sends its
>>      request ...
>>
>> Does it make sense so far?
>>
>> Now the question is what the connection was doing from 0:02 to
>> 0:05. The answer is that the connection was probably idle.
>>
>> Is that good? No.
>>
>> How should it work?
>> 0:02 C1 should release the connection after sending the request.
>> 0:02 C2 should grab it and send its request while C1 is waiting
>>      and S1 is handling C1's request.
>> ...
>>
>> Now the connection does more work, and that is how good RPC
>> implementations work.
>>
>> Is it clearer now?
>>
>> M.
>>
>> On Wed, Dec 24, 2008 at 10:09 PM, Slava Gorelik
>> <[email protected]> wrote:
>> > Sorry, somehow I didn't get you.
>> >
>> > On Wed, Dec 24, 2008 at 9:17 PM, Michael Dagaev
>> > <[email protected]> wrote:
>> >
>> >> Hi, all
>> >>
>> >> On second thought, one connection per JVM between HBase
>> >> servers and clients should be enough.
>> >>
>> >> Suppose an HBase client calls an HBase server. The call consists
>> >> of sending/receiving data over the network and processing on the
>> >> HBase side. I guess that the network (LAN) is not the bottleneck
>> >> here.
>> >>
>> >> I believe that the raw throughput of one TCP connection over a LAN
>> >> is much better than the throughput of the RPC, i.e. it looks like
>> >> the RPC does not utilize the connection properly.
>> >>
>> >> For instance, when a client thread has sent a request to HBase
>> >> and HBase is processing it, the thread waits on the idle
>> >> connection, and no other thread can use it. However, the
>> >> connection should be available for sending/receiving the data of
>> >> other threads.
>> >>
>> >> Does it make sense?
>> >>
>> >> Thank you for your cooperation,
>> >> M.