Re: Multiwal performance with HBase 1.x

Yu Li Sun, 20 Sep 2015 23:10:06 -0700

Hi Vlad,

>> the existing write performance is more than adequate (avg load per RS
usually less than 1MB/sec)
We have some different user scenarios that I'd like to share with you. We
are using HBase to store data for building a search index, and features like
the PV/UV of each online item are recorded, so the write load can reach
as high as 10MB/s per RS (below is a screenshot of the Ganglia metrics
data). OTOH, since HBase is a database, I think its online write performance
is as important as its read performance; bulk load is for offline use and
cannot solve every problem.
[image: Inline images 2][image: Inline images 3]


Another advantage of using multiple WALs is that we could do user/business
level isolation on the WAL. For example, you could use one namespace per
business and one WAL group per namespace, and then replicate only the
data for the business that needs it.

Regarding compaction IO, as I mentioned before, we could use tiered storage
to keep compaction from affecting WAL sync. With this approach we've observed
a clear improvement in the avg mutate RT, from 0.5ms to 0.3ms on our online
cluster, FYI.

Best Regards,
Yu

On 19 September 2015 at 00:55, Vladimir Rodionov <[email protected]>
wrote:

> Hi, Jingcheng
>
> You postpone compaction until your test completes by setting the number of
> blocking stores to 120. That is kind of cheating :)
> As I said previously, in the long run, compaction rules the world - not
> the number of wal files. In a real production setting, the existing write
> performance is more than adequate (avg load per RS usually less than
> 1MB/sec). Multiwal probably has its value if someone needs to quickly load
> a large volume of data, but ... why not use bulk load instead?
>
> Thanks for letting us know that beefy servers with 8 SSDs can sustain such a
> huge load.
>
> -Vlad
>
>
>
> On Thu, Sep 17, 2015 at 10:30 PM, Jingcheng Du <[email protected]>
> wrote:
>
> > More information for the test.
> > I use ycsb 0.3.0 for the test.
> > The command line is:
> > ./ycsb load hbase-10 -P ../workloads/workload -threads 200 -p columnfamily=family -p clientbuffering=true -s > workload.dat
> > The workload is as follows (the data size is slightly less than 1TB):
> > fieldcount=5
> > fieldlength=200
> > recordcount=1000000000
> > maxexecutiontime=86400
> >
> >
> >
>
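
As a rough sanity check on the YCSB settings quoted above, the sketch below
works out the raw payload size they imply and the minimum insert rate needed
to finish within maxexecutiontime. It counts field values only; row keys and
HBase's per-cell storage overhead are ignored, which is a simplifying
assumption, and the helper names are made up for illustration.

```python
# Workload parameters copied from the quoted YCSB workload file.
FIELD_COUNT = 5
FIELD_LENGTH = 200            # bytes per field value
RECORD_COUNT = 1_000_000_000
MAX_EXECUTION_TIME = 86_400   # seconds (24 hours)


def payload_bytes(field_count: int, field_length: int, record_count: int) -> int:
    """Raw field-value bytes only (assumption: keys and cell overhead ignored)."""
    return field_count * field_length * record_count


def min_ops_per_sec(record_count: int, max_seconds: int) -> float:
    """Lowest cluster-wide insert rate that loads every record in time."""
    return record_count / max_seconds


if __name__ == "__main__":
    total = payload_bytes(FIELD_COUNT, FIELD_LENGTH, RECORD_COUNT)
    print(f"payload: {total / 10**12:.2f} TB = {total / 2**40:.2f} TiB")
    rate = min_ops_per_sec(RECORD_COUNT, MAX_EXECUTION_TIME)
    print(f"min insert rate: {rate:.0f} ops/s cluster-wide")
    print(f"min payload rate: {total / MAX_EXECUTION_TIME / 10**6:.2f} MB/s cluster-wide")
```

Under these assumptions the payload is 10^12 bytes, which matches "slightly
less than 1TB" if TB is read in binary units. The rates are cluster-wide
lower bounds, not per-RS figures, since the number of region servers in the
test is not stated here.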
