Thanks for sharing, Yu. The images didn't go through. Can you use a third-party site for sharing?
Cheers

> On Sep 20, 2015, at 11:09 PM, Yu Li <[email protected]> wrote:
>
> Hi Vlad,
>
>>> the existing write performance is more than adequate (avg load per RS
>>> usually less than 1MB/sec)
> We have some different user scenarios I'd like to share with you. We are
> using HBase to store data for building search indexes, and features like
> the pv/uv of each online item are recorded, so the write load can reach as
> high as 10MB/s per RS (a screenshot of the Ganglia metrics was attached).
> OTOH, as a database I think the online write performance of HBase is as
> important as read; bulk load is for offline use and cannot solve every
> problem.
>
> Another advantage of using multiple WALs is that we could do user/business
> level isolation on the WAL. For example, you could use one namespace per
> business and one WAL group per namespace, and then replicate only the data
> for the business that needs it.
>
> Regarding compaction IO, as I mentioned before, we could use tiered storage
> to prevent compaction from affecting WAL sync. With this we have observed
> an obvious improvement in the average mutate RT, from 0.5ms to 0.3ms, on
> our online cluster, FYI.
>
> Best Regards,
> Yu
>
>> On 19 September 2015 at 00:55, Vladimir Rodionov <[email protected]>
>> wrote:
>> Hi, Jingcheng
>>
>> You postponed compaction until your test completed by setting the number
>> of blocking store files to 120. That is kind of cheating :)
>> As I said previously, in the long run compaction rules the world, not the
>> number of WAL files. In a real production setting, the existing write
>> performance is more than adequate (avg load per RS is usually less than
>> 1MB/sec). Multiwal probably has value if someone needs to load a large
>> volume of data quickly, but... why not use bulk load instead?
>>
>> Thanks for letting us know that beefy servers with 8 SSDs can sustain such
>> a huge load.
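[For reference, the WAL-grouping setup Yu describes is driven by a few hbase-site.xml properties. The sketch below is a hedged example for HBase 1.x: the property names match the multiwal provider as documented there, but the strategy value and group count are illustrative choices, not Yu's actual configuration, and should be verified against the HBase version in use.]

```xml
<!-- hbase-site.xml sketch: enable multiple WALs per RegionServer (HBase 1.x). -->
<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<property>
  <!-- How regions map to WAL groups; "bounded" keeps a fixed number of groups.
       A namespace-per-group scheme like Yu's would use a grouping strategy
       keyed on namespace instead. -->
  <name>hbase.wal.regiongrouping.strategy</name>
  <value>bounded</value>
</property>
<property>
  <!-- Number of WAL groups when using the bounded strategy (illustrative). -->
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>2</value>
</property>
```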
>>
>> -Vlad
>>
>> On Thu, Sep 17, 2015 at 10:30 PM, Jingcheng Du <[email protected]>
>> wrote:
>>
>> > More information for the test.
>> > I used YCSB 0.3.0 for the test.
>> > The command line is:
>> > ./ycsb load hbase-10 -P ../workloads/workload -threads 200 -p columnfamily=family -p clientbuffering=true -s > workload.dat
>> > The workload settings are as follows (the data size is slightly less
>> > than 1TB):
>> > fieldcount=5
>> > fieldlength=200
>> > recordcount=1000000000
>> > maxexecutiontime=86400
>> >
>> > --
>> > View this message in context:
>> > http://apache-hbase.679495.n3.nabble.com/Multiwal-performance-with-HBase-1-x-tp4074403p4074731.html
>> > Sent from the HBase User mailing list archive at Nabble.com.
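[As a quick sanity check on the "slightly less than 1TB" figure, the raw payload implied by those workload settings works out as below. This is a back-of-the-envelope sketch only; it counts field bytes and ignores keys and HBase storage overhead.]

```python
# Back-of-the-envelope payload size for the YCSB workload described above:
# recordcount records, each with fieldcount fields of fieldlength bytes.
recordcount = 1_000_000_000
fieldcount = 5
fieldlength = 200  # bytes per field

payload_bytes = recordcount * fieldcount * fieldlength
print(payload_bytes)          # 1000000000000 bytes = 1.0 TB (decimal)
print(payload_bytes / 2**40)  # ~0.91 TiB, i.e. "slightly less than 1TB" in binary units
```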
