On Fri, Mar 31, 2017 at 7:29 PM, 杨苏立 Yang Su Li <[email protected]> wrote:
> Hi,
>
> We found that when there is a mix of CPU-intensive and I/O-intensive
> workloads, HBase seems to slow everything down to the disk throughput
> level.
>
> This is shown in the performance graph at
> http://pages.cs.wisc.edu/~suli/blocking-orig.pdf : both client-1 and
> client-2 are issuing 1KB Gets. From second 0, both repeatedly access a
> small set of data that is cacheable, and both get high throughput (~45K
> ops/s). At second 60, client-1 switches to an I/O-intensive workload and
> begins to randomly access a large set of data (which does not fit in the
> cache). *Both* client-1's and client-2's throughput drops to ~0.5K ops/s.
>
> Is this acceptable behavior for HBase, or is it considered a bug or a
> performance drawback? I can find an old JIRA entry about a similar
> problem (https://issues.apache.org/jira/browse/HBASE-8836), but it was
> never resolved.

Fairness is an old, hard, full-stack problem [1]. Do you want the HBase
client to characterize its read pattern and pass it down through HDFS to
the OS so it might influence the disk scheduler? We do little in this
regard.

Out of interest, what is client-1 doing when it switches to the "I/O
intensive workload"? It seems to be soaking up all the I/Os. Is it blowing
the cache too?

(On HBASE-8836: at the end it refers to the scheduler, which allows you to
divide the requests at the front door by read/write/scan.)

Thanks,
St.Ack

1. https://www.slideshare.net/cloudera/ecosystem-session-7b

> Thanks.
>
> Suli
>
> --
> Suli Yang
>
> Department of Physics
> University of Wisconsin Madison
>
> 4257 Chamberlin Hall
> Madison WI 53703
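For anyone following along: the "divide requests at the front door" scheduler St.Ack mentions is configured through hbase-site.xml on the RegionServer. A rough sketch of splitting the call queues by read/write/scan (property names taken from recent HBase reference-guide versions, so verify against the release you are running; the ratios here are illustrative, not recommendations):

```xml
<!-- hbase-site.xml (RegionServer) - sketch only, tune for your cluster -->
<configuration>
  <!-- Use more than one call queue: number of queues =
       handler.factor * hbase.regionserver.handler.count -->
  <property>
    <name>hbase.ipc.server.callqueue.handler.factor</name>
    <value>0.5</value>
  </property>
  <!-- Fraction of the call queues reserved for reads; the rest serve
       writes. A non-zero value activates the read/write split. -->
  <property>
    <name>hbase.ipc.server.callqueue.read.ratio</name>
    <value>0.6</value>
  </property>
  <!-- Of the read queues, the fraction reserved for long-running scans,
       so scans cannot starve short Gets. -->
  <property>
    <name>hbase.ipc.server.callqueue.scan.ratio</name>
    <value>0.3</value>
  </property>
</configuration>
```

Note this only partitions handlers/queues at the RPC layer; it does not distinguish cache-hit reads from disk-bound reads, so it would separate client-1's workload from client-2's only if they arrive on different request types.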
