Yes, that was closer to my expectations too. I am scratching my head as well, but I don't have time to dig into this any longer. In reality I won't have a 500 QPS stream between a single client and a single region, so I don't care much.

On Thu, Apr 21, 2011 at 11:08 PM, Ted Dunning <[email protected]> wrote:
> This actually sounds like there is a problem with concurrency either on the
> client or the server side. TCP is plenty fast for this and having a
> dedicated TCP connection over which multiple requests can be multiplexed is
> probably much better than UDP because you would have to adapt your own
> window loss recovery anyway. Having a long-lived TCP channel lets you
> benefit from the decades of research in how to make that work right.
>
> Hadoop rpc allows multiple outstanding requests at once so that isn't
> inherently the problem either. I feel like I have a memory of null requests
> taking < 1 ms with Hadoop RPC, but I can't place where that memory might
> have come from.
>
> Also, I can push > 20,000 transactions per second through 20 threads in YCSB
> and average latencies on those threads are often < 5 ms and sometimes near
> 1ms.
>
> My first suspicion would be a concurrency limit somewhere that is
> artificially throttling things down. Why it would be sooo extreme, I cannot
> imagine.
>
> On Thu, Apr 21, 2011 at 10:49 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
>> So of course this test is stupid because in reality nobody would scan
>> a table with 40 rows. So all the traffic goes to a single region
>> server, so with a relatively low stress we could get an idea how the
>> rest of the cluster would behave with proportionally higher load.
>>
>> Anyway. For a million requests shot at a region server at various
>> speeds between 300 and 500 qps the picture is not pretty. RPC metrics
>> are actually good -- no more than 1ms average per next() and 0 per
>> get(). So region server is lightning fast.
>>
>> What doesn't seem so fast is RPC. As i reported before, i was getting
>> 25ms TTLB under the circumstances. In this case all the traffic to the
>> node goes thru the same client (but in reality of course the node's
>> portion per client should be much less). All that traffic is using a
>> single regionserver node rpc queue as HConnection would not open more
>> than one socket to the same region. And tcp doesn't seem to perform very
>> well for some reason in this scenario.
>>
>> So, it seems to help to actually open multiple hbase connections and
>> round-robin them between scans. That way even though we waste more
>> zookeeper connections, we also have more than one rpc channel open for
>> the high-traffic region as well. A little coding and it brings us down
>> from 25ms to 18ms average at 500QPS per region and 3 pooled hbase
>> connections. Perhaps normally it is not as much a problem as traffic
>> is more uniformly distributed among regions from the same client.
>>
>> The next thing i did was to enable tcp_nodelay on both client and
>> server. That got us down even more to 13ms average.
>>
>> However, it is still about two times slower than when i run all
>> processes on the same machine (i get around 6-7ms average TTLBs for
>> the same type of scan).
>>
>> Ping time for about the same packet size between the hosts involved
>> seems to revolve around 1ms. Where the other 5ms of average time is
>> getting lost is still a mystery. But oh well i guess it is as good as
>> it gets.
>> In real life hbase applications traffic would be much more uniformly
>> distributed among regions and this would be much less of an issue
>> perhaps.
>>
>> I also suspect that using udp for short scans and gets might reduce
>> latency a bit as well.
>>
>> On Wed, Apr 20, 2011 at 3:05 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> > So i can't seem to be able to immediately find the explanation for those metrics
>> >
>> > - rpcQueueTime -- do I assume correctly it's the time a request
>> > sits waiting in the incoming rpc queue before being picked up by
>> > a handler?
>> >
>> > - rpcProcessingTime -- do i assume correctly it's the time of a request
>> > being processed by the region server's handler?
>> >
>> > So inner time to last byte should be approximately the sum of those, right?
>> >
>> > Thanks.
>> > -Dmitriy
>> >
>> > On Wed, Apr 20, 2011 at 1:17 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >> Yes that's what i said. there's a metric for fs latency but we are not
>> >> hitting it so it's not useful.
>> >>
>> >> Question is which one might be useful to measure inner ttlb, and i
>> >> don't see it there.
>> >>
>> >> On Wed, Apr 20, 2011 at 1:14 PM, Ted Dunning <[email protected]> wrote:
>> >>> FS latency shouldn't matter with your 99.9% cache hit rate as reported.
>> >>>
>> >>> On Wed, Apr 20, 2011 at 12:55 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>
>> >>>> Yes -- I already looked thru 'regionserver' metrics some time ago in
>> >>>> hbase book. And i am not sure there's an 'inner ttlb' metric.
>> >>>>
>> >>>> There are fs latency metrics there but nothing for the response times.
>> >>>> fs latency is essentially hdfs latency AFAICT and that would not be
>> >>>> relevant to what i am asking for (for as long as we are hitting LRU
>> >>>> block cache anyway). we are not hitting fs.
>> >>>>
>> >>>> Unless there are more metrics than listed in the Hbase Book?
>> >>>>
>> >>>>
>> >>>> On Wed, Apr 20, 2011 at 12:46 PM, Stack <[email protected]> wrote:
>> >>>> > Enable rpc logging. Will show in your ganglia. See metrics article
>> >>>> > on hbase home page.
>> >>>> >
>> >>>> > On Wed, Apr 20, 2011 at 12:44 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >> Is there any way to log 'inner' TTLB times the region server incurs for reads?
>> >>>> >>
>> >>>> >>
>> >>>> >> On Wed, Apr 20, 2011 at 12:43 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >>> i just enabled debug logging for o.a.h.hbase logger in that particular
>> >>>> >>> region server... so far not much except for LRUBlock cache spitting
>> >>>> >>> metrics ..
>> >>>> >>>
>> >>>> >>> 2011-04-20 12:28:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=55732209,
>> >>>> >>> hits=55732083, hitRatio=99.99%%, cachingAccesses=55732195,
>> >>>> >>> cachingHits=55732083, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:33:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=56703200,
>> >>>> >>> hits=56703074, hitRatio=99.99%%, cachingAccesses=56703186,
>> >>>> >>> cachingHits=56703074, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>> 2011-04-20 12:38:48,375 DEBUG
>> >>>> >>> org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=8.26
>> >>>> >>> MB, free=190.08 MB, max=198.34 MB, blocks=112, accesses=57708231,
>> >>>> >>> hits=57708105, hitRatio=99.99%%, cachingAccesses=57708217,
>> >>>> >>> cachingHits=57708105, cachingHitsRatio=99.99%%, evictions=0,
>> >>>> >>> evicted=0, evictedPerRun=NaN
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On Wed, Apr 20, 2011 at 12:35 PM, Stack <[email protected]> wrote:
>> >>>> >>>> If one region only, then it's located on a single regionserver. Tail
>> >>>> >>>> that regionserver's logs. It might tell us something.
>> >>>> >>>> St.Ack
>> >>>> >>>>
>> >>>> >>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <[email protected]> wrote:
>> >>>> >>>>> On Wed, Apr 20, 2011 at 12:25 PM, Stack <[email protected]> wrote:
>> >>>> >>>>>> On Tue, Apr 19, 2011 at 4:46 PM, Dmitriy Lyubimov <[email protected]> wrote:
>> >>>> >>>>>>> Right now i am shooting scans returning between 3 and 40 rows and
>> >>>> >>>>>>> regardless of data size, approximately 500-400 QPS. The data tables
>> >>>> >>>>>>> are almost empty and in-memory, so they surely should fit in those 40%
>> >>>> >>>>>>> heap dedicated to them.
>> >>>> >>>>>>>
>> >>>> >>>>>>
>> >>>> >>>>>> How many clients are going against the cluster? If you use less, do
>> >>>> >>>>>> your numbers improve?
>> >>>> >>>>>>
>> >>>> >>>>>
>> >>>> >>>>> And all these clients are going against a single 40 row table?
>> >>>> >>>>> St.Ack
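
The connection pooling described in the quoted Apr 21 message (open several hbase connections and round-robin them between scans) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code used in the thread: `RoundRobinPool` and `factory` are hypothetical names, and `factory` stands in for whatever call actually opens a client connection. Strings are used in place of real connections so the sketch runs on its own.

```python
from itertools import count
from threading import Lock


class RoundRobinPool:
    """Hand out a fixed set of connections in rotation (thread-safe)."""

    def __init__(self, factory, size=3):
        # `factory` is a hypothetical zero-argument callable that opens one
        # client connection; it is an assumption, not an HBase API.
        if size < 1:
            raise ValueError("size must be at least 1")
        self._conns = [factory() for _ in range(size)]
        self._next = 0
        self._lock = Lock()

    def acquire(self):
        # Return the next connection in rotation, so consecutive scans are
        # spread over several sockets instead of sharing one rpc channel.
        with self._lock:
            conn = self._conns[self._next]
            self._next = (self._next + 1) % len(self._conns)
        return conn


# Usage with stand-in "connections" (plain strings) and 3 pooled entries,
# matching the pool size mentioned in the thread.
ids = count(1)
pool = RoundRobinPool(lambda: f"conn-{next(ids)}", size=3)
picked = [pool.acquire() for _ in range(7)]
```

Each scan would call `acquire()` and issue its request on the returned connection. The cost noted in the thread still applies: every pooled connection also holds its own zookeeper connection.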
