@Ted, Sure, I've opened HBASE-14457 <https://issues.apache.org/jira/browse/HBASE-14457> as an umbrella for all work done on multiwal; please allow me to share the details there, in the form of documents rather than emails. :-)
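For anyone who'd like to try multiwal while the JIRA work is ongoing, here is a minimal hbase-site.xml sketch; the property names are the ones documented in the HBase reference guide for the 1.x line, and the group count is only an illustrative value:

```xml
<!-- Switch the WAL provider to multiwal (multiple WALs per region server). -->
<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<!-- Group regions into a bounded number of WAL groups. -->
<property>
  <name>hbase.wal.regiongrouping.strategy</name>
  <value>bounded</value>
</property>
<!-- Number of WAL groups per region server (illustrative value, default is 2). -->
<property>
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>4</value>
</property>
```

Restart the region servers after changing these, as WAL provider settings are read at startup.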
@Jingcheng and @Vlad, any comments/sharing from you will be warmly welcomed in the JIRA. :-)

Best Regards,
Yu

On 21 September 2015 at 19:35, Ted Yu <[email protected]> wrote:
> Thanks for sharing, Yu.
>
> Images didn't go through. Can you use a third party site for sharing?
>
> Cheers
>
> On Sep 20, 2015, at 11:09 PM, Yu Li <[email protected]> wrote:
> >
> > Hi Vlad,
> >
> > >> the existing write performance is more than adequate (avg load per RS usually less than 1MB/sec)
> > We have some different user scenarios and I'd like to share them with you. We are using hbase to store data for building a search index, and features like pv/uv of each online item will be recorded, so the write load can reach as high as 10MB/s per RS (below is a screenshot of the ganglia metrics data). OTOH, as a database I think the online write performance of HBase is as important as read; bulkload is for offline use and cannot resolve all problems.
> >
> > Another advantage of using multiple wals is that we could do user/business level isolation on the wal. For example you could use one namespace per business and one wal group per namespace, and then replicate only the data for the business in need.
> >
> > Regarding compaction IO, as I mentioned before, we could use tiered storage to prevent compaction from affecting wal sync. This way we've observed an obvious improvement in the avg mutate RT, from 0.5ms to 0.3ms on our online cluster, FYI.
> >
> > Best Regards,
> > Yu
> >
> > On 19 September 2015 at 00:55, Vladimir Rodionov <[email protected]> wrote:
> >> Hi, Jingcheng
> >>
> >> You postpone compaction until your test completes by setting the number of blocking stores to 120. That is kind of cheating :)
> >> As I said previously, in the long run, compaction rules the world - not the number of wal files. In a real production setting, the existing write performance is more than adequate (avg load per RS usually less than 1MB/sec). Multiwal probably has its value if someone needs to load a large volume of data quickly, but ... why not use bulk load instead?
> >>
> >> Thanks for letting us know that beefy servers with 8 SSDs can sustain such a huge load.
> >>
> >> -Vlad
> >>
> >> On Thu, Sep 17, 2015 at 10:30 PM, Jingcheng Du <[email protected]> wrote:
> >> > More information for the test.
> >> > I use ycsb 0.3.0 for the test.
> >> > The command line is "./ycsb load hbase-10 -P ../workloads/workload -threads 200 -p columnfamily=family -p clientbuffering=true -s > workload.dat"
> >> > The workload is as follows (the data size is slightly less than 1TB):
> >> > fieldcount=5
> >> > fieldlength=200
> >> > recordcount=1000000000
> >> > maxexecutiontime=86400
> >> >
> >> > --
> >> > View this message in context:
> >> > http://apache-hbase.679495.n3.nabble.com/Multiwal-performance-with-HBase-1-x-tp4074403p4074731.html
> >> > Sent from the HBase User mailing list archive at Nabble.com.
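As a quick sanity check on the workload numbers quoted above: the raw value bytes come to 5 fields × 200 bytes × 1e9 records = 1e12 bytes, about 0.91 TiB, which matches the "slightly less than 1TB" figure (ignoring row keys and HBase storage overhead). A one-off sketch:

```python
# Rough size of the YCSB workload quoted above: value bytes only,
# ignoring row keys and HBase storage overhead.
fieldcount = 5
fieldlength = 200            # bytes per field
recordcount = 1_000_000_000

total_bytes = fieldcount * fieldlength * recordcount
print(f"{total_bytes:,} bytes = {total_bytes / 2**40:.2f} TiB")
# → 1,000,000,000,000 bytes = 0.91 TiB
```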
