Hi Vlad,

>> the existing write performance is more than adequate (avg load per RS
>> usually less than 1MB/sec)

We have some different usage scenarios that I'd like to share with you. We
use HBase to store the data for building a search index, and features such as
the pv/uv of each online item are recorded, so the write load per RS can
reach as high as 10MB/s (see the Ganglia metrics screenshots below). OTOH,
since HBase is a database, I think its online write performance is as
important as its read performance; bulk load is for offline use and cannot
solve every problem.

[image: Inline images 2]
[image: Inline images 3]

Another advantage of using multiple WALs is that we can do user/business-level
isolation on the WAL. For example, you could use one namespace per business
and one WAL group per namespace, and then replicate only the data for the
businesses that need it.

Regarding compaction IO, as I mentioned before, we can use tiered storage to
prevent compaction from affecting WAL sync. With this approach we've observed
an obvious improvement in the average mutate RT on our online cluster, from
0.5ms to 0.3ms, FYI.

Best Regards,
Yu

On 19 September 2015 at 00:55, Vladimir Rodionov wrote:

> Hi, Jingcheng
>
> You postpone compaction until your test completes by setting number of
> blocking stores to 120. That is kind of cheating :)
> As I said previously, in a long run, compaction rules the world - not
> number of wal files. In a real production setting, the existing write
> performance is more than adequate (avg load per RS usually less than
> 1MB/sec). Multiwal has probably its value if someone need to load quick
> large volume of data, but ... why do not use bulk load instead?
>
> Thank for letting us know that beefy servers with 8 SSDs can sustain such a
> huge load.
>
> -Vlad
>
> On Thu, Sep 17, 2015 at 10:30 PM, Jingcheng Du wrote:
>
> > More information for the test.
> > I use ycsb 0.3.0 for the test.
> > The command line is "./ycsb load hbase-10 -P ../workloads/workload -threads 200 -p columnfamily=family -p clientbuffering=true -s > workload.dat"
> > The workload is, the data size is slightly less than 1TB:
> > fieldcount=5
> > fieldlength=200
> > recordcount=1000000000
> > maxexecutiontime=86400
> >
> > --
> > View this message in context:
> > http://apache-hbase.679495.n3.nabble.com/Multiwal-performance-with-HBase-1-x-tp4074403p4074731.html
> > Sent from the HBase User mailing list archive at Nabble.com.
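
As a rough sanity check on the quoted YCSB workload, the field payload it implies can be computed from the three parameters given (fieldcount, fieldlength, recordcount). This is only a sketch: it assumes every field is exactly fieldlength bytes and ignores row keys, column qualifiers, timestamps, and any HBase storage overhead. The function name is made up for illustration.

```python
# Parameters copied from the quoted YCSB workload file.
FIELD_COUNT = 5
FIELD_LENGTH = 200             # bytes per field (assumed exact)
RECORD_COUNT = 1_000_000_000
MAX_EXECUTION_TIME = 86_400    # seconds, i.e. one day


def payload_bytes(record_count: int, field_count: int, field_length: int) -> int:
    """Raw field payload only; excludes keys and storage overhead."""
    return record_count * field_count * field_length


total = payload_bytes(RECORD_COUNT, FIELD_COUNT, FIELD_LENGTH)

print(f"payload: {total / 10**12:.2f} TB (decimal)")
print(f"payload: {total / 2**40:.2f} TiB (binary)")

# Average aggregate rate needed to finish the load within maxexecutiontime.
# This is a whole-cluster figure, not a per-RS one.
print(f"min avg rate: {total / MAX_EXECUTION_TIME / 10**6:.1f} MB/s")
print(f"min avg rate: {RECORD_COUNT / MAX_EXECUTION_TIME:.0f} ops/s")
```

The payload comes to 10^12 bytes, which is about 0.91 TiB in binary units, consistent with the "slightly less than 1TB" figure in the quoted message.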
