I don’t think the HBase row_counter job is going to be faster than a
count(*) query. Both require a full table scan, so neither will be
particularly fast.

A couple of alternatives if you’re OK with an approximate count: 1) enable
stats collection (you can leave off the use of stats to parallelize queries)
and then do a SUM over the size column for the table using the stats table
directly, or 2) do a count(*) using the TABLESAMPLE clause (again enabling
stats as described above) to avoid a full scan.
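
To make the two options a bit more concrete, here is a rough sketch (not
tested against a real cluster) that builds the two queries and scales a
sampled count back up to a table-level estimate. MY_TABLE and the helper
names are hypothetical. The SYSTEM.STATS column names (PHYSICAL_NAME,
GUIDE_POSTS_ROW_COUNT) are my assumption about the stats table layout, so
check them against your Phoenix version. Scaling the sampled count by
100 / percentage is also an assumption about how to read the result.

```python
def stats_estimate_sql(table: str) -> str:
    """Option 1: approximate row count read from the stats table.

    Assumption: SYSTEM.STATS has PHYSICAL_NAME and GUIDE_POSTS_ROW_COUNT
    columns and stats have been collected (e.g. via UPDATE STATISTICS).
    """
    return (
        "SELECT SUM(GUIDE_POSTS_ROW_COUNT) FROM SYSTEM.STATS "
        f"WHERE PHYSICAL_NAME = '{table}'"
    )


def sampled_count_sql(table: str, sample_pct: float) -> str:
    """Option 2: COUNT(*) over roughly sample_pct percent of the table."""
    if not 0 < sample_pct <= 100:
        raise ValueError("sample_pct must be in (0, 100]")
    return f"SELECT COUNT(*) FROM {table} TABLESAMPLE ({sample_pct})"


def scale_sample(sample_count: int, sample_pct: float) -> int:
    """Scale a sampled count up to an estimate for the whole table."""
    if not 0 < sample_pct <= 100:
        raise ValueError("sample_pct must be in (0, 100]")
    return round(sample_count * 100 / sample_pct)


if __name__ == "__main__":
    table = "MY_TABLE"  # hypothetical table name
    print(stats_estimate_sql(table))
    print(sampled_count_sql(table, 1.0))
    # If the 1% sample returned 1,700,000 rows, the estimate is:
    print(scale_sample(1_700_000, 1.0))
```

Either query would be run through sqlline or JDBC as usual. The estimate is
only as good as the stats (guidepost width, last update time), so treat it
as a ballpark figure rather than an exact count.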

On Thu, Feb 1, 2018 at 8:11 AM Flavio Pompermaier <pomperma...@okkam.it>
wrote:

> Hi Anil,
> Obviously I'm not using HBase just for the count query. Most of the time I
> do INSERTs and selective queries; I was just trying to figure out whether my
> HBase + Phoenix installation is robust enough to deal with a huge amount of
> data.
>
> On Thu, Feb 1, 2018 at 5:07 PM, anil gupta <anilgupt...@gmail.com> wrote:
>
>> Hey Flavio,
>>
>> IMHO, if most of your app is just doing full table scans, then I am not
>> really sure HBase (or any other NoSQL store) will be a good fit for your
>> solution (are you building an OLAP system?). If you have point lookups and
>> short range scans, then HBase/Phoenix will work well.
>> Also, if you want to do select count(*), the HBase row_counter job will be
>> much faster than Phoenix queries.
>>
>> Thanks,
>> Anil Gupta
>>
>> On Thu, Feb 1, 2018 at 7:35 AM, Flavio Pompermaier <pomperma...@okkam.it>
>> wrote:
>>
>>> I was able to make it work changing the following params (both on server
>>> and client side and restarting hbase) and now the query answers in about 6
>>> minutes:
>>>
>>> hbase.rpc.timeout (to 600000)
>>> phoenix.query.timeoutMs (to 600000)
>>> hbase.client.scanner.timeout.period (from 1 min to 10 min)
>>> hbase.regionserver.lease.period (from 1 min to 10 min)
>>>
>>> However, I'd like to know whether this performance could be easily
>>> improved or not. Any ideas?
>>>
>>> On Thu, Feb 1, 2018 at 4:30 PM, Vaghawan Ojha <vaghawan...@gmail.com>
>>> wrote:
>>>
>>>> I have the same problem: even after I increased hbase.rpc.timeout, the
>>>> result is the same. The difference is that I use 4.12.
>>>>
>>>>
>>>> On Thu, Feb 1, 2018 at 8:23 PM, Flavio Pompermaier <
>>>> pomperma...@okkam.it> wrote:
>>>>
>>>>> Hi to all,
>>>>> I'm trying to use the brand-new Phoenix 4.13.2-cdh5.11.2 over HBase,
>>>>> and everything was fine as long as the data was quite small (a few
>>>>> million rows). Now that I have inserted 170 M rows into my table, I
>>>>> cannot get the row count anymore (using SELECT COUNT) because of
>>>>> org.apache.hbase.ipc.CallTimeoutException (operationTimeout 60000
>>>>> expired).
>>>>> How can I fix this problem? I could increase the hbase.rpc.timeout
>>>>> parameter, but I suspect I could improve the HBase performance a little
>>>>> first. The problem is that I don't know how.
>>>>>
>>>>> Thanks in advance,
>>>>> Flavio
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>
