Thank you for the response. Here is an example of how our tables are generally defined:
CONSTRAINT pkey PRIMARY KEY (fsd, s, sa, da, dp, cg, p)
)
TTL='5616000', KEEP_DELETED_CELLS='false', IMMUTABLE_ROWS=true,
COMPRESSION='SNAPPY', SALT_BUCKETS=40, MAX_FILESIZE='10000000000',
SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX IF NOT EXISTS table1_site_idx ON table1(s)
TTL='5616000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY',
MAX_FILESIZE='10000000000',
SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

To your questions:

1) I believe it is that compactions cannot keep up with the load. We've had
compactions go wrong and had to delete tables. There are lots of compactions
going on.

2) I can split to 256, but we have 5 tables (with similar data volumes) with
5 global indexes each. That's 25 * 256 = 6400 regions across 34 nodes by
default - is that ok? Should I create the indexes with fewer buckets?

Regarding stats - they are not disabled. I thought those were automatically
collected on compactions?

I will get the stack trace. (I've also put a few sketches of what I'm
planning to try at the bottom of this message, below the quoted thread.)

On 9/1/15, 3:47 PM, "Vladimir Rodionov" <[email protected]> wrote:

>OK, from the beginning.
>
>1. RegionTooBusy is thrown when the memstore size exceeds the region flush
>size X the flush multiplier. THIS is a sign of a great imbalance on the
>write path - either some regions are much hotter than others, or compaction
>cannot keep up with the load, you hit the blocking store file count, and
>flushes (as well as writes) get blocked for 90 sec by default. Choose one -
>which is your case?
>
>2. Your region load is unbalanced because the default region split
>algorithm does not do its job well - try to presplit (salt) to more than 40
>buckets. Can you do 256?
>
>-Vlad
>
>On Tue, Sep 1, 2015 at 3:29 PM, Samarth Jain <[email protected]> wrote:
>
>> Ralph,
>>
>> A couple of questions:
>>
>> Do you have Phoenix stats enabled?
>>
>> Can you send us a stack trace of the RegionTooBusy exception? Looking at
>> the HBase code, it is thrown in a few places. It would be good to check
>> where the resource crunch is occurring.
>>
>>
>>
>> On Tue, Sep 1, 2015 at 2:26 PM, Perko, Ralph J <[email protected]>
>> wrote:
>>
>>> Hi, I have run into an issue several times now and could really use
>>> some help diagnosing the problem.
>>>
>>> Environment:
>>> Phoenix 4.4
>>> HBase 0.98
>>> 34-node cluster
>>> Tables are defined with 40 salt buckets
>>>
>>> We are continuously loading large bz2 csv files into Phoenix via Pig.
>>> The data is in the hundreds of TBs per month.
>>>
>>> The process runs well for a few weeks, but as the regions split and the
>>> number of regions gets into the hundreds per table, we begin to get
>>> "RegionTooBusy" exceptions from the Phoenix write code when the Pig
>>> jobs run.
>>>
>>> Something else I have noticed is that the number of requests on the
>>> regions becomes really unbalanced. While the number of regions is
>>> around 40, 80, or 120, the number of requests per region (via the HBase
>>> master site) is pretty well balanced. But as the number gets into the
>>> 200s, many of the regions have 0 requests while the other regions have
>>> hundreds of millions of requests.
>>>
>>> If I drop the tables and start over, the issue goes away, but we are
>>> approaching a production deadline and this is no longer an option.
>>>
>>> The cluster is on a closed network, so sending log files is not
>>> possible, although I can send scanned images of logs and answer
>>> specific questions.
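PS - a few sketches of what I'm planning to try, in case anyone spots a
problem. First, for point 1, these are the settings I understand to control
that blocking behavior, so we can check what our cluster is actually running
with. The property names are the stock HBase ones; the values below are only
what I believe the 0.98 defaults to be, not a recommendation:

    <!-- hbase-site.xml -->
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <!-- flush a region's memstore once it reaches this size (128 MB) -->
      <value>134217728</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <!-- writes blocked (RegionTooBusy) once memstore usage exceeds
           flush.size * multiplier -->
      <value>2</value>
    </property>
    <property>
      <name>hbase.hstore.blockingStoreFiles</name>
      <!-- flushes (and writes) blocked when a store has more than this
           many HFiles - check your version's actual default -->
      <value>10</value>
    </property>
    <property>
      <name>hbase.hstore.blockingWaitTime</name>
      <!-- the 90-sec default blocking window mentioned above -->
      <value>90000</value>
    </property>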
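Second, for the presplit question in point 2, this is roughly what I have in
mind. The schema is the same as above (column definitions elided), and the
bucket counts are just what I'm considering, not settled - I believe CREATE
INDEX accepts its own SALT_BUCKETS in 4.4, so the global indexes could use
fewer buckets than the base table to keep the total region count down:

    -- Base table presplit into 256 salted regions:
    -- CREATE TABLE table1 ( ...
    --     CONSTRAINT pkey PRIMARY KEY (fsd, s, sa, da, dp, cg, p) )
    --     SALT_BUCKETS=256, TTL='5616000', ... ;

    -- Global index with a smaller bucket count than its base table:
    CREATE INDEX IF NOT EXISTS table1_site_idx ON table1(s)
        SALT_BUCKETS=64,
        TTL='5616000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY',
        MAX_FILESIZE='10000000000',
        SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';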
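And on stats - since they are enabled, I'll also try refreshing them
manually to rule out stale guideposts. My understanding is that Phoenix
collects them during major compactions, but they can also be forced per
table:

    -- Recollect statistics for the table and all of its indexes:
    UPDATE STATISTICS table1 ALL;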
