I dug out these two issues: https://issues.apache.org/jira/browse/HDFS-918
https://issues.apache.org/jira/browse/HDFS-1323

There was also something about speeding up random reads in HDFS, but as
is typical, these kinds of issues go to JIRA to die.

-ryan

On Thu, Aug 19, 2010 at 11:51 PM, Jeff Hammerbacher <[email protected]> wrote:
> Hey Ryan,
>
> Could you point to the particular JIRA issues for the DFS client that are
> causing these performance issues for HBase? Knowing is half the battle.
>
> Thanks,
> Jeff
>
> On Thu, Aug 19, 2010 at 9:20 PM, Ryan Rawson <[email protected]> wrote:
>
>> Due to DFS client issues, things are not quite as good as they should be...
>> They are being worked on, so it will get resolved in time.
>>
>> In the meantime, the key to fast access is caching... ram ram ram.
>>
>> -ryan
>>
>> On Thu, Aug 19, 2010 at 10:15 AM, Abhijit Pol <[email protected]>
>> wrote:
>> > We are using the HBase 0.20.5 drop with the latest Cloudera Hadoop
>> > distribution.
>> >
>> > - We are hitting a 3-node HBase cluster from a client which has 10
>> > threads, each with a thread-local copy of the HTable client object and
>> > an established connection to the server.
>> > - Each of the 10 threads issues 10,000 read requests for keys randomly
>> > selected from a pool of 1000 keys. All keys are present in HBase and
>> > the table is pinned in memory (to make sure we don't have any disk seeks).
>> > - If we run this test with 10 threads we get avg latency as seen by
>> > the client = 8ms (excluding the initial 10 connection setup time). But if
>> > we increase # threads to 100, 250 and 500 we get increasing latency
>> > numbers like 26ms, 51ms, 90ms.
>> > - We have enabled HBase metrics on the RS and we see "get_avg_time" on
>> > all RS between 5-15ms in all tests, consistently.
>> >
>> > Is this expected? Any tips to get consistent performance below 20ms?
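
For what it's worth, the client-side latencies quoted above can be turned into an implied throughput. This is only a rough sketch: it assumes each client thread issues its next get as soon as the previous one returns (closed loop, no think time; the test description does not say this), and it pairs 100/250/500 threads with 26/51/90 ms in the order given. The function name is made up for illustration.

```python
# Implied throughput from the closed-loop numbers quoted in the thread.
# Assumption (not stated in the original mail): every client thread sends
# its next get immediately after the previous one returns.

def implied_throughput(threads, avg_latency_ms):
    """Little's law for a closed loop: gets/sec = threads / latency in seconds."""
    return threads * 1000.0 / avg_latency_ms

# (threads, avg latency in ms as seen by the client), as reported in the thread
reported = [(10, 8), (100, 26), (250, 51), (500, 90)]

results = [(t, ms, implied_throughput(t, ms)) for t, ms in reported]

for t, ms, rps in results:
    print(f"{t:>4} threads @ {ms:>3} ms -> ~{rps:,.0f} gets/sec")
```

Under those assumptions, going from 10 to 500 threads (50x) raises throughput only from roughly 1,250 to roughly 5,600 gets/sec. Together with "get_avg_time" staying at 5-15ms on the RS, that would be consistent with requests waiting in a queue somewhere outside the get itself, though the numbers alone do not say where.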
