There is a performance evaluation result:
http://cloudepr.blogspot.com/2009/08/hbase-0200-performance-evaluation.html

That benchmark does not use LZO; we will rerun it with LZO enabled.
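A minimal sketch of the LZO setup we have in mind, assuming the native LZO libraries are already installed on every node (the table and family names here are placeholders, not the names from the benchmark):

  create 'TestTable', {NAME => 'info', COMPRESSION => 'LZO'}

An existing table should be able to pick up the same setting through disable/alter/enable in the shell.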
On Sat, Oct 10, 2009 at 11:21 AM, stack <[email protected]> wrote:

> I should have said, to figure the count of regions, see the UI (HBase puts
> up UIs on port 60010 for the master by default... regionservers on 60030).
> St.Ack
>
> On Wed, Oct 7, 2009 at 10:14 PM, stack <[email protected]> wrote:
>
> > On Tue, Oct 6, 2009 at 10:52 PM, Adam Silberstein
> > <[email protected]> wrote:
> >
> >> Hey,
> >> Thanks for all the info...
> >>
> >> First, a few details to clarify my use case:
> >> -I have 6 region servers.
> >> -I loaded a total of 120GB in 1K records into my table, so 20GB per
> >> server. I'm not sure how many regions that has created.
> >>
> >
> > You could run the rowcounter mapreduce job to see:
> >
> > ./bin/hadoop jar hbase.jar rowcounter
> >
> > That'll dump usage. You pass a tablename, column and a tmpdir IIRC.
> >
> >
> >> -My reported numbers are on workloads taking place once the 120GB is in
> >> place, rather than while loading the 120GB.
> >> -I've run with combinations of 50, 100, 200 clients hitting the REST
> >> server. So that's e.g. 200 clients across all region servers, not per
> >> region server. Each client just repeatedly a) generates a random record
> >> known to exist, and b) reads or updates it.
> >>
> >
> > Our client can be a bottleneck. At its core is hadoop RPC with its single
> > connection to each server over which request/response are multiplexed.
> > As per J-D's suggestion, you might be able to get more throughput by
> > upping the REST server count (or it should be a non-issue when you move
> > to the java api).
> >
> > The REST server base64's everything too, so this'll add a bit of friction.
> >
> >
> >> -I'm interested in both throughput and latency. First, at medium
> >> throughputs (i.e. not at maximum capacity), what are average read/write
> >> latencies? And then, what is the maximum possible throughput, even as
> >> that causes latencies to be very high -- what is the throughput wall?
> >> Plotting throughput vs. latency for different target throughputs reveals
> >> both of these.
> >>
> >
> > Good stuff. Let us know how else we can help out.
> >
> >
> >> When I have 50 clients across 6 region servers, this is fairly close to
> >> your read throughput experiment with 8 clients on 1 region server. Your
> >> 2.4k/sec throughput is obviously a lot better than what I'm seeing at
> >> 300/sec. Since you had 10GB loaded, is it reasonable to assume that
> >> ~50% of the reads were from memory?
> >
> >
> > I think I had 3G per RS with 40% given over to cache. I had 1 RS, so not
> > too much coming from hbase cache (OS cache probably played a big factor).
> >
> >
> >> In my case, with 20GB loaded and
> >> 6GB heapspace, I assume ~30% was served from memory. I haven't run
> >> enough tests on different size tables to estimate the impact of having
> >> data in memory, though intuitively, in the time it takes to read a
> >> record from disk, you could read several from memory. And the more the
> >> data is disk resident, the more the disk contention.
> >>
> >
> > Yes.
> >
> >
> >> Finally, I haven't tried LZO or increasing the logroll multiplier yet,
> >>
> >
> > LZO would be good. The logroll multiplier is more about writing, which
> > you are doing little of, so maybe it's ok at default?
> >
> >
> >> and I'm hoping to move to the java client soon. As you might recall,
> >> we're working toward a benchmark for cloud serving stores. We're
> >> testing the newest version of our tool now. Since it's in java, we'll
> >> be able to use it with HBase.
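Since the benchmark tool is moving to the java client, here is a rough sketch of what each client's read/update loop might look like against the 0.20 java API. The table name, column family/qualifier, and row-key format below are placeholders invented for illustration, not anything from this thread:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  import java.util.Random;

  public class RandomReadWriteClient {
    public static void main(String[] args) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();  // picks up hbase-site.xml from the classpath
      HTable table = new HTable(conf, "TestTable");         // placeholder table name

      byte[] family = Bytes.toBytes("info");                // placeholder family and qualifier
      byte[] qualifier = Bytes.toBytes("data");
      byte[] value = new byte[1000];                        // ~1K records, as in the workload above

      Random rng = new Random();
      for (int i = 0; i < 100000; i++) {
        // pick a row known to exist; 120GB of 1K records is ~120M rows
        byte[] row = Bytes.toBytes(String.format("row%010d", rng.nextInt(120000000)));
        if (rng.nextInt(100) < 95) {
          // ~95% reads
          Result result = table.get(new Get(row));
          byte[] cell = result.getValue(family, qualifier);
        } else {
          // ~5% writes
          Put put = new Put(row);
          put.add(family, qualifier, value);
          table.put(put);
        }
      }
      table.flushCommits();  // push any buffered writes before exiting
    }
  }

If raw write throughput matters more than per-op latency, table.setAutoFlush(false) plus a periodic flushCommits() should let the client batch puts instead of sending them one RPC at a time.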
> >
> > Tell us more? You are comparing HBase to others with a tool of your
> > own writing?
>
> >> I'll report back when I find out how much these changes close the
> >> performance gap, and how much seems inherent when much of the data is
> >> disk resident.
> >>
> >
> > Thanks Adam.
> > St.Ack
> >
> >
> >> -Adam
> >>
> >> -----Original Message-----
> >> From: [email protected] [mailto:[email protected]] On Behalf Of
> >> stack
> >> Sent: Tuesday, October 06, 2009 1:08 PM
> >> To: [email protected]
> >> Subject: Re: random read/write performance
> >>
> >> Hey Adam:
> >>
> >> Thanks for checking in.
> >>
> >> I just did some rough loadings on a small (old hardware) cluster using
> >> less memory per regionserver than you. It's described on this page:
> >> http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation. Randomly
> >> writing 1k records with the PerformanceEvaluation script to a single
> >> regionserver, I can do about 8-10k writes/second on average using the
> >> 0.20.1 release candidate 1 with a single client. Sequential writes are
> >> about the same speed usually. Random reads are about 650/second on
> >> average with a single client and about 2.4k/second on average with 8
> >> concurrent clients.
> >>
> >> So it seems like you should be able to do better than
> >> 300 ops/second/machine -- especially if you can use the java api.
> >>
> >> This single regionserver was carrying about 50 regions. That's about
> >> 10GB. How many regions are loaded in your case?
> >>
> >> If throughput is important to you, lzo should help (as per J-D).
> >> Turning off WAL will also help with write throughput, but that might
> >> not be what you want. Random-read-wise, the best thing you can do is
> >> give it RAM (6G should be good).
> >>
> >> Is that 50-200 clients per regionserver or for the overall cluster? If
> >> per regionserver, I can try that over here. I can try with bigger
> >> regions if you'd like -- 1G regions -- to see if that'd help your use
> >> case (if you enable lzo, this should up your throughput and shrink the
> >> number of regions any one server is hosting).
> >>
> >> St.Ack
> >>
> >>
> >> On Tue, Oct 6, 2009 at 8:59 AM, Adam Silberstein
> >> <[email protected]> wrote:
> >>
> >> > Hi,
> >> >
> >> > Just wanted to give a quick update on our HBase benchmarking efforts
> >> > at Yahoo. The basic use case we're looking at is:
> >> >
> >> > 1K records
> >> >
> >> > 20GB of records per node (and 6GB of memory per node, so data is not
> >> > memory resident)
> >> >
> >> > Workloads that do random reads/writes (e.g. 95% reads, 5% writes).
> >> >
> >> > Multiple clients doing the reads/writes (i.e. 50-200)
> >> >
> >> > Measure throughput vs. latency, and see how high we can push the
> >> > throughput.
> >> >
> >> > Note that although we want to see where throughput maxes out, the
> >> > workload is random, rather than scan-oriented.
> >> >
> >> > I've been tweaking our HBase installation based on advice I've
> >> > read/gotten from a few people. Currently, I'm running 0.20.0, have
> >> > heap size set to 6GB per server, and have iCMS off. I'm still using
> >> > the REST server instead of the java client. We're about to move our
> >> > benchmarking tool to java, so at that point we can use the java API.
> >> > At that point, I want to turn off WAL as well.
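For reference, turning the WAL off with the 0.20 java client is, as far as I recall, a per-Put setting rather than a cluster-wide one -- roughly (same placeholder names as in the sketch earlier in this thread):

  Put put = new Put(row);
  put.setWriteToWAL(false);  // skip the write-ahead log for this Put: faster, but edits can be lost on a regionserver crash
  put.add(family, qualifier, value);
  table.put(put);

This only becomes relevant once the java client is in place, and the durability trade-off is worth keeping in mind for the benchmark writeup.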
> >> > If anyone has more suggestions for this workload (either things to
> >> > try while still using REST, or things to try once I have a java
> >> > client), please let me know.
> >> >
> >> >
> >> > Given all that, I'm currently seeing maximal throughput of about 300
> >> > ops/sec/server. Has anyone with a similar disk-resident and random
> >> > workload seen drastically different numbers, or guesses for what I
> >> > can expect with the java client?
> >> >
> >> >
> >> > Thanks!
> >> >
> >> > Adam
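To compare against the numbers stack quotes above, the PerformanceEvaluation runs described on the wiki page are driven with something like the following (argument details from memory; running the class with no arguments dumps the exact usage, and the trailing number is the client count):

  ./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1
  ./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomRead 8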
