Hi Stack,

Thanks for the feedback. Comments inline ...

On Wed, Nov 16, 2011 at 3:35 PM, Stack <[email protected]> wrote:

> On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain <[email protected]> wrote:
> > Hi Lars,
> >
> > The keys are arriving in random order. The HBase monitoring page shows
> > evenly distributed load across all of the region servers.
>
> What kind of ops rates are you seeing? They are running nice and
> smooth across all servers? No stuttering? Whats your regionserver
> logs look like?
>
> Are you presplitting your table or just letting hbase run and do up the
> splits?
>

As far as I can tell, the operations look smooth across all servers.
We're not doing any pre-splitting, just letting HBase do the splits.

> > I didn't see
> > anything weird in the gc logs, no mention of any failures. I'm a little
> > unclear about what the optimal values for the following properties should
> > be:
> >
> > hbase.hstore.compactionThreshold
>
> Default is 3. Look in regionserver logs. See how many files you have
> on average by region columnfamily (you could also look in filesystem).
> Are we constantly rewriting them? If write only load mostly, you
> might up this putting off compactions till more files around (but
> looking in regionserver logs, if high write rate, we might be having
> trouble keeping up with this default threshold anyways?).
>

Well, it looks like half of the regions are in the 25-32 file range and
the other half just have 1 or 2 files. This was when we ran it with a
compactionThreshold of 15. How can I tell by looking at the region
server logs if we're seeing a "high write rate"? We've got 48 clients
sending load, 12 region servers total. We're pushing the system pretty
hard.

> > hbase.hstore.blockingStoreFiles
> >
>
> The higher this is, the bigger the price you'll pay if a server
> crashes because this will be the upper bound on how many WAL logs we
> need to split for the server before its regions come back on line
> again. Leave it default I'd say for now.
>

Ok, we'll leave it default for now.

> > Is there some rule of thumb that I can use to determine good values for
> > these properties?
> >
>
> You've checked out this section of the book:
> http://hbase.apache.org/book.html#performance
>
> Are you filling the machines? Are they burning cpu? Or io-bound?
> If not, perhaps open the front gate wider by upping the number of
> concurrent handlers.
>

I have read through that section of the HBase book. There is plenty of
CPU available. How do I up the number of concurrent handlers? Increase
hbase.regionserver.handler.count?

- Amit
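
For reference, a minimal sketch of how the two settings discussed above
would be overridden in hbase-site.xml on each region server. The
compactionThreshold value of 15 is the one from the run described
above; the handler count of 50 is only an illustrative placeholder, not
a recommendation, and a region server restart is assumed to be needed
before the new values take effect.

```xml
<!-- hbase-site.xml fragment (sketch; values are illustrative assumptions) -->
<configuration>
  <!-- Number of store files in a store before a compaction is considered.
       15 is the value used in the test run described in this thread. -->
  <property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>15</value>
  </property>
  <!-- Number of RPC handler threads per region server.
       50 is a hypothetical value chosen only to show the syntax. -->
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>50</value>
  </property>
</configuration>
```

hbase.hstore.blockingStoreFiles is deliberately left out of the
fragment so that it stays at its default, as suggested above.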
