OK, from the beginning.

1. RegionTooBusy is thrown when the memstore size exceeds region flush size x flush multiplier. This is a sign of a great imbalance on the write path: either some regions are much hotter than others, or compaction cannot keep up with the load, you hit the blocking store file count, and flushes get blocked (and writes along with them) for up to 90 sec by default. Choose one: which is your case?
2. Your region load is unbalanced because the default region split algorithm does not do its job well. Try to presplit (salt) into more than 40 buckets. Can you do 256?

-Vlad

On Tue, Sep 1, 2015 at 3:29 PM, Samarth Jain <[email protected]> wrote:

> Ralph,
>
> Couple of questions.
>
> Do you have phoenix stats enabled?
>
> Can you send us a stacktrace of RegionTooBusy exception? Looking at HBase
> code it is thrown in a few places. Would be good to check where the
> resource crunch is occurring at.
>
> On Tue, Sep 1, 2015 at 2:26 PM, Perko, Ralph J <[email protected]>
> wrote:
>
>> Hi I have run into an issue several times now and could really use some
>> help diagnosing the problem.
>>
>> Environment:
>> phoenix 4.4
>> hbase 0.98
>> 34 node cluster
>> Tables are defined with 40 salt buckets
>> We are continuously loading large, bz2, csv files into Phoenix via Pig.
>> The data is in the hundred of TB’s per month
>>
>> The process runs well for a few weeks but as the regions split and the
>> number of regions gets into the hundreds per table we begin to get
>> “RegionTooBusy” exceptions around Phoenix write code when the Pig jobs run.
>>
>> Something else I have noticed is the number of requests on the regions
>> becomes really unbalanced. While the number of regions is around 40, 80,
>> 120 the number of requests per region (via the hbase master site) is pretty
>> well balanced. But as the number gets into the 200’s many of the regions
>> have 0 requests while the other regions have hundreds of millions of
>> requests.
>>
>> If I drop the tables and start over the issue goes away. But we are
>> approaching a production deadline and this is no longer an option.
>>
>> The cluster is on a closed network so sending log files is not possible
>> although I can send scanned images of logs and answer specific questions.
>>
>> Can you please help me diagnose this issue.
>>
>> Thanks!
>> Ralph
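
P.S. To make the "choose one" question from point 1 concrete, here is a rough Python sketch of the two blocking conditions. It is only an illustration, not HBase code: the `RegionSnapshot` class and the `diagnose` helper are hypothetical names, and the numbers in the usage lines (128 MB flush size, multiplier 2, 10 blocking store files) are assumed values, so check `hbase.hregion.memstore.flush.size`, `hbase.hregion.memstore.block.multiplier` and `hbase.hstore.blockingStoreFiles` in your own hbase-site.xml. The per-region inputs would have to be read off the region server UI or logs by hand.

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024


@dataclass
class RegionSnapshot:
    """Hypothetical per-region numbers copied from the region server UI or logs."""
    memstore_bytes: int
    max_store_files: int  # largest store file count among the region's stores


def blocking_memstore_bytes(flush_size_bytes: int, block_multiplier: int) -> int:
    """Memstore size above which writes to a region are rejected:
    region flush size x flush multiplier."""
    return flush_size_bytes * block_multiplier


def diagnose(region: RegionSnapshot,
             flush_size_bytes: int,
             block_multiplier: int,
             blocking_store_files: int) -> List[str]:
    """Return which of the two cases from point 1 the snapshot matches."""
    reasons = []
    if region.memstore_bytes > blocking_memstore_bytes(flush_size_bytes, block_multiplier):
        reasons.append("memstore above blocking size (hot region or stalled flush)")
    # Assumption: flushes are held back once a store has more files than the limit.
    if region.max_store_files > blocking_store_files:
        reasons.append("too many store files (compaction not keeping up)")
    return reasons


# Usage with assumed settings: 128 MB flush size, multiplier 2, 10 blocking store files.
hot_region = RegionSnapshot(memstore_bytes=300 * MB, max_store_files=12)
for reason in diagnose(hot_region, 128 * MB, 2, 10):
    print(reason)
```

If only the first reason shows up on a handful of regions while the others sit idle, that points at hot regions (point 2). If the second one shows up widely, compaction is the more likely bottleneck.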
