Yes, that was closer to my expectations, too. I am scratching my head
as well, but I don't have time to dig into this any longer. In
reality I won't have a 500 QPS stream between a single client and a
single region, so I don't care much.

On Thu, Apr 21, 2011 at 11:08 PM, Ted Dunning <[email protected]> wrote:
> This actually sounds like there is a problem with concurrency either on the
> client or the server side.  TCP is plenty fast for this and having a
> dedicated TCP connection over which multiple requests can be multiplexed is
> probably much better than UDP because you would have to adapt your own
> window loss recovery anyway.   Having a long-lived TCP channel lets you
> benefit from the decades of research in how to make that work right.
>
> Hadoop rpc allows multiple outstanding requests at once so that isn't
> inherently the problem either.  I feel like I have a memory of null requests
> taking < 1 ms with Hadoop RPC, but I can't place where that memory might
> have come from.
>
> Also, I can push > 20,000 transactions per second through 20 threads in YCSB
> and average latencies on those threads are often < 5 ms and sometimes near
> 1ms.
>
> My first suspicion would be a concurrency limit somewhere that is
> artificially throttling things down.  Why it would be sooo extreme, I cannot
> imagine.
>
> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> So of course this test is stupid because in reality nobody would scan
>> a table with 40 rows. So all the traffic goes to a single region
>> server, so with a relatively low stress we could get an idea how the
>> rest of the cluster would behave with proportionally higher load.
>>
>> Anyway. For a million requests shot at a region server at various
>> speeds between 300 and 500 qps the picture is not pretty. RPC metrics
>> are actually good -- no more than 1ms average per next() and 0 per
>> get(). So region server is lightning fast.
>>
>> What doesn't seem so fast is RPC. As I reported before, I was getting
>> 25ms TTLB under these circumstances. In this case all the traffic to the
>> node goes through the same client (but in reality of course the node's
>> portion per client should be much less). All that traffic uses a
>> single regionserver RPC queue, as HConnection would not open more
>> than one socket to the same region server. And TCP doesn't seem to
>> perform very well for some reason in this scenario.
>>
>> So it seems to help to actually open multiple HBase connections and
>> round-robin them between scans. That way, even though we use more
>> ZooKeeper connections, we also have more than one RPC channel open to
>> the high-traffic region. A little coding brings us down
>> from 25ms to 18ms average at 500 QPS per region with 3 pooled HBase
>> connections. Perhaps normally this is not as much of a problem, as
>> traffic from a single client is more uniformly distributed among regions.
>>
>> The next thing i did was to enable tcp_nodelay on both client and
>> server. That got us down even more to 13ms average.
>>
>> However, it is still about two times slower than when I run all processes
>> on the same machine (I get around 6-7ms average TTLBs for the same type
>> of scan).
>>
>> Ping time for about the same packet size between the hosts involved seems
>> to hover around 1ms. Where the other 5ms of average time is getting lost
>> is still a mystery. But oh well, I guess it is as good as it gets.
>> In real life hbase applications traffic would be much more uniformly
>> distributed among regions and this would be much less of an issue
>> perhaps.
>>
>> I also suspect that using udp for short scans and gets might reduce
>> latency a bit as well.
>>
>> On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <[email protected]>
>> wrote:
>> > So i can't seem to be able to immediately find the explanation for those
>> > metrics
>> >
>> > - rpcQueueTime -- do I assume correctly that it's the time a request
>> > sits waiting in the incoming rpc queue before being picked up by a
>> > handler?
>> >
>> > - rpcProcessingTime -- do I assume correctly that it's the time a request
>> > spends being processed by the region server's handler?
>> >
>> > So inner time to last byte should be approximately sum of those, right?
>> >
>> > Thanks.
>> > -Dmitriy
>> >
>> > On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >> Yes that's what i said. there's metric for fs latency but we are not
>> >> hitting it so it's not useful.
>> >>
>> >> Question is which one might be useful to measure inner ttlb, and i
>> >> don't see it there.
>> >>
>> >> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <[email protected]> wrote:
>> >>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>> >>>
>> >>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>
>> >>>> Yes -- I already looked through the 'regionserver' metrics some time ago
>> >>>> in the HBase book. And I am not sure there's an 'inner ttlb' metric.
>> >>>>
>> >>>> There are fs latency metrics there but nothing for the response times.
>> >>>> fs latency is essentially hdfs latency AFAICT and that would not be
>> >>>> relevant to what i am asking for (for as long as we are hitting LRU
>> >>>> block cache anyway). we are not hitting fs.
>> >>>>
>> >>>> Unless there are more metrics than listed in the Hbase Book?
>> >>>>
>> >>>>
>> >>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <[email protected]> wrote:
>> >>>> > Enable rpc logging.  Will show in your ganglia.  See metrics article
>> >>>> > on hbase home page.
>> >>>> >
>> >>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >> Is there any way to log 'inner' TTLB times the region server incurs
>> >>>> >> for reads?
>> >>>> >>
>> >>>> >>
>> >>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>> >>>> >>> region server... so far not much except for LRUBlock cache spitting
>> >>>> >>> metrics ..
>> >>>> >>>
>> >>>> >>> 2011-04-20 12:28:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> >>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> >>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:33:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> >>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> >>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:38:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> >>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> >>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <[email protected]> wrote:
>> >>>> >>>> If one region only, then its located on a single regionserver.  Tail
>> >>>> >>>> that regionservers logs.  It might tell us something.
>> >>>> >>>> St.Ack
>> >>>> >>>>
>> >>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <[email protected]> wrote:
>> >>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <[email protected]> wrote:
>> >>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>> >>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>> >>>> >>>>>>> are almost empty and in-memory, so they surely should fit in those 40%
>> >>>> >>>>>>> heap dedicated to them.
>> >>>> >>>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> How many clients are going against the cluster?  If you use less, do
>> >>>> >>>>>> your numbers improve?
>> >>>> >>>>>>
>> >>>> >>>>>
>> >>>> >>>>> And all these clients are going against a single 40 row table?
>> >>>> >>>>> St.Ack
>> >>>> >>>>>
>> >>>> >>>>
>> >>>> >>>
>> >>>> >>
>> >>>> >
>> >>>>
>> >>>
>> >>
>> >
>>
>
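
Editorial sketch of the pooling idea from the quoted Apr 21 message (open
several HBase connections and hand them out round-robin, so that one busy
region server is reached over more than one RPC socket). This is not the
code used in the thread. The class name RoundRobinPool and its methods are
hypothetical and are not part of HBase; the factory is assumed to build
whatever per-connection handle the client uses. It is written for Java 8+
and shows only the rotation logic, not connection setup or cleanup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical helper: builds a fixed number of resources up front and
 * hands them out in round-robin order. Safe to share between threads.
 */
final class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinPool(int size, Supplier<T> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<T> built = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            // Assumption: each factory call yields an independent connection.
            built.add(factory.get());
        }
        this.members = Collections.unmodifiableList(built);
    }

    /** Returns the next member; wraps around after the last one. */
    T next() {
        // floorMod keeps the index non-negative even after int overflow.
        int index = Math.floorMod(cursor.getAndIncrement(), members.size());
        return members.get(index);
    }

    int size() {
        return members.size();
    }
}
```

With a pool of 3, as in the message, every fourth call to next() returns
the first member again. The counter is atomic so that concurrent scanner
threads can share one pool without extra locking.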
