For stats, what do you have your guidepost width set to? Do you have only a
single physical table? We've found that a value of 300MB still provides
enough granularity to get good bytes-scanned estimates. We're
currently using an HBase API to update stats atomically. We could easily
make it splittable per column family per table if that's helpful. We could
also use the regular batch API and not have it be atomic (since stats are
just an estimate anyway). Please file a JIRA.
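
To give a rough feel for how the guidepost width drives the amount of stats
data, here is a minimal sketch. It assumes roughly one guidepost per "width"
bytes of table data, which is only an approximation of what the stats
collector actually writes. The function name and the 1 TB table size are
hypothetical, chosen purely for illustration. The width itself is normally
controlled through phoenix.stats.guidepost.width or the per-table
GUIDE_POSTS_WIDTH property.

```python
import math

MB = 1024 * 1024


def estimate_guideposts(data_bytes: int, width_bytes: int) -> int:
    """Approximate guidepost count, assuming one guidepost per width_bytes.

    This is a back-of-the-envelope model, not Phoenix's actual algorithm.
    """
    if width_bytes <= 0:
        raise ValueError("width_bytes must be positive")
    return math.ceil(data_bytes / width_bytes)


if __name__ == "__main__":
    table_bytes = 1024 * 1024 * MB  # hypothetical 1 TB physical table
    for width_mb in (100, 300):
        count = estimate_guideposts(table_bytes, width_mb * MB)
        print(f"width={width_mb}MB -> ~{count} guideposts")
```

Under this model, tripling the width cuts the number of guideposts (and so
the rows in SYSTEM.STATS) to about a third, at the cost of coarser
estimates.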

In theory, it would be fine to set IN_MEMORY to true, but there are a
couple of practical issues:
- You'll need the fix for PHOENIX-4579 or the property would be set back to
false when another client connects
- We already have client-side caching for stats and catalog and server-side
caching for catalog, so not sure how beneficial this would be.

Thanks,
James

On Sun, Apr 22, 2018 at 3:56 AM, Batyrshin Alexander <0x62...@gmail.com>
wrote:

> If all stats for a given table should be on the same region, there is no
> benefit in splitting.
>
> Another question: is it ok to set 'IN_MEMORY' => 'true' for CF of SYSTEM.*
> tables?
>
>
> On 20 Apr 2018, at 23:39, James Taylor <jamestay...@apache.org> wrote:
>
> Thanks for bringing this to our attention. There's a bug here in that the
> SYSTEM.STATS table has a custom split policy that prevents splitting from
> occurring (PHOENIX-4700). We'll get a fix out in 4.14, but in the meantime
> it's safe to split the table, as long as all stats for a given table are on
> the same region.
>
>     James
>
> On Fri, Apr 20, 2018 at 1:37 PM, James Taylor <jamestay...@apache.org>
> wrote:
>
>> Thanks for bringing this to our attention. There's a bug here in that the
>> SYSTEM.STATS
>>
>> On Wed, Apr 18, 2018 at 9:59 AM, Batyrshin Alexander <0x62...@gmail.com>
>> wrote:
>>
>>>  Hello,
>>> I've discovered that SYSTEM.STATS has only 1 region with size 3.25 GB.
>>> Is it ok to split it and distribute over different region servers?
>>
>>
>>
>
>
