Hi Tao,

Your results match up with what I see running YCSB as well.

The lapse in write throughput happens when one of the regions is splitting
or has reached its "blocking" threshold. Essentially there's a short period
of time when it can't take writes, and thus clients will back up waiting on
it. Because YCSB is doing entirely random writes, after a couple of seconds
all of the threads will have hit the "stalled" region and throughput will
drop to 0 until that region becomes available again.
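
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes writes are spread uniformly over the regions, exactly one region is
stalled, and each client thread is synchronous (it blocks as soon as one of
its writes lands on the stalled region). The region count and per-thread rate
are made-up illustrative values, not measurements from your cluster.

```python
def fraction_blocked(num_regions: int, ops_per_thread: int) -> float:
    """Probability that a synchronous client thread has hit the single
    stalled region after issuing ops_per_thread uniformly random writes."""
    return 1.0 - (1.0 - 1.0 / num_regions) ** ops_per_thread


# Hypothetical numbers: 100 regions, each thread issuing ~100 writes/sec.
NUM_REGIONS = 100
OPS_PER_SEC_PER_THREAD = 100

for seconds in (0.1, 0.5, 1, 2, 5):
    ops = int(seconds * OPS_PER_SEC_PER_THREAD)
    blocked = fraction_blocked(NUM_REGIONS, ops)
    print(f"after {seconds:>3}s: ~{blocked:.1%} of threads stuck on the stalled region")
```

Under those assumptions most threads are stuck within a second or two, which
is consistent with the 10-second reporting windows showing 0 ops/sec.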

We've been brainstorming some ideas to "smooth out" these performance
lapses, so that instead of a 10-second period of unavailability you get
a 30-second period of slower performance, which is usually preferable.

Thanks
-Todd

On Thu, Sep 9, 2010 at 10:04 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Is there anything specific you'd like to work on or improve? I see
> numbers, but that doesn't really mean anything. For example, do you
> have more heap than the size of those regions? If not, it may be
> garbage collecting like mad? Are the machines swapping? Is the CPU
> even contended? How many machines do you have? (please don't answer,
> those are just examples of reasons why a back and forth on the mailing
> list is probably going to take a long time and get us nowhere)
>
> If you are just trying to get better throughput for the sake of it,
> then you should probably look around the documentation (like the FAQ)
> and search the mailing list for advice. Look at what's happening in the
> region server logs and try to see where it's blocking, it's usually
> pretty obvious. Also try using the most recent HBase version.
>
> If you're trying to see how fast you can import your data set
> (providing that you have one), then instead of a 100% writes YCSB I
> would recommend instead using HFOF:
> http://hbase.apache.org/docs/r0.89.20100726/bulk-loads.html
>
> J-D
>
> On Thu, Sep 9, 2010 at 9:47 PM, Tao Xie <[email protected]> wrote:
> > You mean "hbase.hregion.max.filesize"? I set it to 2G, but I see more
> > waiting time and lower ops. I have no real use case for intensive writes;
> > I just use YCSB to do a performance test.
> >
> > 2010-09-10 11:20:42,796   332 sec: 2437024 operations; 7509.85 current
> > ops/sec; [INSERT AverageLatency(ms)=35.09]
> > 2010-09-10 11:20:52,801   342 sec: 2437024 operations; 0 current ops/sec;
> > 2010-09-10 11:21:02,806   352 sec: 2441720 operations; 469.37 current
> > ops/sec; [INSERT AverageLatency(ms)=21.37]
> > 2010-09-10 11:21:12,807   362 sec: 2451112 operations; 939.11 current
> > ops/sec; [INSERT AverageLatency(ms)=40.12]
> > 2010-09-10 11:21:22,809   372 sec: 2569586 operations; 11845.03 current
> > ops/sec; [INSERT AverageLatency(ms)=31.16]
> > 2010-09-10 11:21:32,811   382 sec: 2714088 operations; 14447.31 current
> > ops/sec; [INSERT AverageLatency(ms)=33.44]
> > 2010-09-10 11:21:42,814   392 sec: 2718784 operations; 469.46 current
> > ops/sec; [INSERT AverageLatency(ms)=34.64]
> > 2010-09-10 11:21:52,815   402 sec: 2779832 operations; 6104.19 current
> > ops/sec; [INSERT AverageLatency(ms)=36.46]
> > 2010-09-10 11:22:02,817   412 sec: 2930104 operations; 15024.2 current
> > ops/sec; [INSERT AverageLatency(ms)=38.77]
> > 2010-09-10 11:22:12,819   422 sec: 3009936 operations; 7981.6 current
> > ops/sec; [INSERT AverageLatency(ms)=43.41]
> > 2010-09-10 11:22:22,821   432 sec: 3009936 operations; 0 current ops/sec;
> > 2010-09-10 11:22:32,823   442 sec: 3009936 operations; 0 current ops/sec;
> > 2010-09-10 11:22:42,825   452 sec: 3144094 operations; 13413.12 current
> > ops/sec; [INSERT AverageLatency(ms)=56.82]
> > 2010-09-10 11:22:52,827   462 sec: 3310480 operations; 16635.27 current
> > ops/sec; [INSERT AverageLatency(ms)=34.46]
> > 2010-09-10 11:23:02,829   472 sec: 3338656 operations; 2817.04 current
> > ops/sec; [INSERT AverageLatency(ms)=20.91]
> > 2010-09-10 11:23:12,831   482 sec: 3338656 operations; 0 current ops/sec;
> > 2010-09-10 11:23:22,832   492 sec: 3438535 operations; 9986.9 current
> > ops/sec; [INSERT AverageLatency(ms)=26.74]
> > 2010-09-10 11:23:35,600   505 sec: 3566729 operations; 10040.26 current
> > ops/sec; [INSERT AverageLatency(ms)=27.53]
> > 2010-09-10 11:23:45,601   515 sec: 3620416 operations; 5368.16 current
> > ops/sec; [INSERT AverageLatency(ms)=48.66]
> > 2010-09-10 11:23:55,603   525 sec: 3620416 operations; 0 current ops/sec;
> > 2010-09-10 11:24:05,605   535 sec: 3620416 operations; 0 current ops/sec;
> > 2010-09-10 11:24:15,607   545 sec: 3648592 operations; 2817.04 current
> > ops/sec; [INSERT AverageLatency(ms)=52.15]
> >
> > 2010/9/10 Jean-Daniel Cryans <[email protected]>
> >
> >> If you have a very heavy write load (like YCSB when only inserting),
> >> then you really have to tune HBase for that kind of workload since
> >> it's not the "normal" use case. Setting MAX_FILESIZE really high
> >> (1-2GB) and even pre-splitting the table when creating it (available
> >> in 0.89) will help.
> >>
> >> Most of the time spent waiting is due to splitting and blocking due to
> >> either MemStores growing over their max size or the global MemStore
> >> size limit being reached. It's kinda rough and could probably be
> >> "smoother", but do you really have a use case that requires it or just
> >> poking?
> >>
> >> J-D
> >>
> >> On Thu, Sep 9, 2010 at 7:32 PM, Tao Xie <[email protected]> wrote:
> >> > Hi all,
> >> > I use YCSB to measure the insert/read latency of HBase.
> >> > I found that during the insertion procedure there are periods of up
> >> > to 10 seconds in which 0 records are inserted.
> >> > See the following result at 1514 seconds. I want to know why this
> >> > occurs. Is this due to compaction?
> >> > I also want to know why the ops/sec varies all the time. There seems
> >> > to be no stable period.
> >> > Thanks.
> >> >
> >> > 2010-09-10 00:07:29,608   1484 sec: 28786280 operations; 23475.3 current
> >> > ops/sec; [INSERT AverageLatency(ms)=8.81]
> >> > 2010-09-10 00:07:39,610   1494 sec: 28842632 operations; 5634.07 current
> >> > ops/sec; [INSERT AverageLatency(ms)=6.68]
> >> > 2010-09-10 00:07:49,612   1504 sec: 28964728 operations; 12207.16 current
> >> > ops/sec; [INSERT AverageLatency(ms)=7.68]
> >> > 2010-09-10 00:07:59,614   1514 sec: 28964728 operations; 0 current ops/sec;
> >> > 2010-09-10 00:08:10,778   1525 sec: 29130475 operations; 14846.56 current
> >> > ops/sec; [INSERT AverageLatency(ms)=24.45]
> >> > 2010-09-10 00:08:20,782   1535 sec: 29606967 operations; 47630.15 current
> >> > ops/sec; [INSERT AverageLatency(ms)=12.64]
> >> > 2010-09-10 00:08:30,784   1545 sec: 29908624 operations; 30159.67 current
> >> > ops/sec; [INSERT AverageLatency(ms)=0.12]
> >> > 2010-09-10 00:08:40,786   1555 sec: 30016632 operations; 10798.64 current
> >> > ops/sec; [INSERT AverageLatency(ms)=5.66]
> >> >
> >>
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
