Hi Tao,

Your results match up with what I see running YCSB as well.

The lapse in write throughput happens when one of the regions is splitting or has reached its "blocking" threshold. Essentially there's a short period of time when it can't take writes, and thus clients will back up waiting on it. Because YCSB is doing entirely random writes, after a couple of seconds all of the threads will have hit the "stalled" region and throughput will drop to 0 until that region becomes available again.

We've been brainstorming some ideas to "smooth out" these performance lapses, so instead of getting a 10-second period of unavailability, you get a 30-second period of slower performance, which is usually preferable.

Thanks
-Todd

On Thu, Sep 9, 2010 at 10:04 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Is there anything specific you'd like to work on or improve? I see
> numbers, but that doesn't really mean anything. For example, do you
> have more heap than the size of those regions? If not, it may be
> garbage collecting like mad? Are the machines swapping? Is the CPU
> even contended? How many machines do you have? (please don't answer,
> those are just examples of reasons why a back and forth on the mailing
> list is probably going to take a long time and get us nowhere)
>
> If you are just trying to get better throughput for the sake of it,
> then you should probably look around the documentation (like the FAQ)
> and search the mailing list for advice. Look at what's happening in the
> region server logs and try to see where it's blocking, it's usually
> pretty obvious. Also try using the most recent HBase version.
>
> If you're trying to see how fast you can import your data set
> (providing that you have one), then instead of a 100% writes YCSB I
> would recommend using HFOF:
> http://hbase.apache.org/docs/r0.89.20100726/bulk-loads.html
>
> J-D
>
> On Thu, Sep 9, 2010 at 9:47 PM, Tao Xie <[email protected]> wrote:
> > You mean "hbase.hregion.max.filesize", I set it to 2G. But I see more
> > waiting time and lower ops.
> > I have no real use case for intensive write. I
> > just use YCSB to do a performance test.
> >
> > 2010-09-10 11:20:42,796 332 sec: 2437024 operations; 7509.85 current ops/sec; [INSERT AverageLatency(ms)=35.09]
> > 2010-09-10 11:20:52,801 342 sec: 2437024 operations; 0 current ops/sec;
> > 2010-09-10 11:21:02,806 352 sec: 2441720 operations; 469.37 current ops/sec; [INSERT AverageLatency(ms)=21.37]
> > 2010-09-10 11:21:12,807 362 sec: 2451112 operations; 939.11 current ops/sec; [INSERT AverageLatency(ms)=40.12]
> > 2010-09-10 11:21:22,809 372 sec: 2569586 operations; 11845.03 current ops/sec; [INSERT AverageLatency(ms)=31.16]
> > 2010-09-10 11:21:32,811 382 sec: 2714088 operations; 14447.31 current ops/sec; [INSERT AverageLatency(ms)=33.44]
> > 2010-09-10 11:21:42,814 392 sec: 2718784 operations; 469.46 current ops/sec; [INSERT AverageLatency(ms)=34.64]
> > 2010-09-10 11:21:52,815 402 sec: 2779832 operations; 6104.19 current ops/sec; [INSERT AverageLatency(ms)=36.46]
> > 2010-09-10 11:22:02,817 412 sec: 2930104 operations; 15024.2 current ops/sec; [INSERT AverageLatency(ms)=38.77]
> > 2010-09-10 11:22:12,819 422 sec: 3009936 operations; 7981.6 current ops/sec; [INSERT AverageLatency(ms)=43.41]
> > 2010-09-10 11:22:22,821 432 sec: 3009936 operations; 0 current ops/sec;
> > 2010-09-10 11:22:32,823 442 sec: 3009936 operations; 0 current ops/sec;
> > 2010-09-10 11:22:42,825 452 sec: 3144094 operations; 13413.12 current ops/sec; [INSERT AverageLatency(ms)=56.82]
> > 2010-09-10 11:22:52,827 462 sec: 3310480 operations; 16635.27 current ops/sec; [INSERT AverageLatency(ms)=34.46]
> > 2010-09-10 11:23:02,829 472 sec: 3338656 operations; 2817.04 current ops/sec; [INSERT AverageLatency(ms)=20.91]
> > 2010-09-10 11:23:12,831 482 sec: 3338656 operations; 0 current ops/sec;
> > 2010-09-10 11:23:22,832 492 sec: 3438535 operations; 9986.9 current ops/sec; [INSERT AverageLatency(ms)=26.74]
> > 2010-09-10 11:23:35,600 505 sec: 3566729 operations; 10040.26 current ops/sec; [INSERT AverageLatency(ms)=27.53]
> > 2010-09-10 11:23:45,601 515 sec: 3620416 operations; 5368.16 current ops/sec; [INSERT AverageLatency(ms)=48.66]
> > 2010-09-10 11:23:55,603 525 sec: 3620416 operations; 0 current ops/sec;
> > 2010-09-10 11:24:05,605 535 sec: 3620416 operations; 0 current ops/sec;
> > 2010-09-10 11:24:15,607 545 sec: 3648592 operations; 2817.04 current ops/sec; [INSERT AverageLatency(ms)=52.15]
> >
> > 2010/9/10 Jean-Daniel Cryans <[email protected]>
> >
> >> If you have a very heavy write load (like YCSB when only inserting),
> >> then you really have to tune HBase for that kind of workload since
> >> it's not the "normal" use case. Setting MAX_FILESIZE really high
> >> (1-2GB) and even pre-splitting the table when creating it (available
> >> in 0.89) will help.
> >>
> >> Most of the time spent waiting is due to splitting and blocking due to
> >> either MemStores growing over their max size or the global MemStore
> >> size limit being reached. It's kinda rough and could probably be
> >> "smoother", but do you really have a use case that requires it or are
> >> you just poking?
> >>
> >> J-D
> >>
> >> On Thu, Sep 9, 2010 at 7:32 PM, Tao Xie <[email protected]> wrote:
> >> > Hi all,
> >> > I use YCSB to measure the insert/read latency of HBase.
> >> > I found that for periods of up to 10 seconds, 0 records are inserted
> >> > during the insertion procedure.
> >> > See the following result at second 1514. I want to know why this occurs.
> >> > Is this due to compaction?
> >> > I also want to know why the ops/sec varies all the time. There seems to
> >> > be no stable period.
> >> > Thanks.
> >> >
> >> > 2010-09-10 00:07:29,608 1484 sec: 28786280 operations; 23475.3 current ops/sec; [INSERT AverageLatency(ms)=8.81]
> >> > 2010-09-10 00:07:39,610 1494 sec: 28842632 operations; 5634.07 current ops/sec; [INSERT AverageLatency(ms)=6.68]
> >> > 2010-09-10 00:07:49,612 1504 sec: 28964728 operations; 12207.16 current ops/sec; [INSERT AverageLatency(ms)=7.68]
> >> > 2010-09-10 00:07:59,614 1514 sec: 28964728 operations; 0 current ops/sec;
> >> > 2010-09-10 00:08:10,778 1525 sec: 29130475 operations; 14846.56 current ops/sec; [INSERT AverageLatency(ms)=24.45]
> >> > 2010-09-10 00:08:20,782 1535 sec: 29606967 operations; 47630.15 current ops/sec; [INSERT AverageLatency(ms)=12.64]
> >> > 2010-09-10 00:08:30,784 1545 sec: 29908624 operations; 30159.67 current ops/sec; [INSERT AverageLatency(ms)=0.12]
> >> > 2010-09-10 00:08:40,786 1555 sec: 30016632 operations; 10798.64 current ops/sec; [INSERT AverageLatency(ms)=5.66]
> >> >
> >

--
Todd Lipcon
Software Engineer, Cloudera
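
P.S. The stall described at the top of this message (random writes, one unavailable region, every client thread eventually parked on it) can be illustrated with a toy simulation. This is only a sketch, not HBase or YCSB code: the function name `simulate`, the tick-based clock, and every parameter value are made up for illustration, and it assumes each thread blocks on the first write that lands on the stalled region.

```python
import random


def simulate(num_threads=10, num_regions=20, ops_per_tick=1000,
             stall_start=5, stall_ticks=3, total_ticks=12, seed=0):
    """Return completed writes per tick for a toy random-write workload.

    Hypothetical model: region 0 cannot take writes during the stall window,
    and a thread that picks it blocks until the window ends.
    """
    rng = random.Random(seed)
    stalled_region = 0
    blocked = [False] * num_threads
    throughput = []
    for tick in range(total_ticks):
        stalled = stall_start <= tick < stall_start + stall_ticks
        if not stalled:
            # Region is available again, so every waiting thread resumes.
            blocked = [False] * num_threads
        done = 0
        for t in range(num_threads):
            if blocked[t]:
                continue  # still waiting on the stalled region
            for _ in range(ops_per_tick):
                region = rng.randrange(num_regions)  # uniformly random key
                if stalled and region == stalled_region:
                    blocked[t] = True  # this write hangs; thread stops here
                    break
                done += 1
        throughput.append(done)
    return throughput


if __name__ == "__main__":
    for tick, ops in enumerate(simulate()):
        print(f"tick {tick:2d}: {ops} ops")
```

With these default values, one would expect full throughput before the stall, a sharply reduced figure in the tick where the stall begins (each thread completes only a handful of writes before it picks the stalled region), zero for the rest of the stall, and full throughput once the region returns. That has the same shape as the "0 current ops/sec" lines in the YCSB output quoted above, though the real timings depend on split and flush behaviour that this model ignores.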
