Re: Phoenix has slow response times compared to HBase

Jonathan Leech Fri, 02 Sep 2016 22:58:25 -0700

The direct HBase client probably created 500 direct clients, whereas Phoenix 
may have made fewer simultaneous calls, with a little waiting, and so hit a 
sweeter spot for load on your configuration.
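
To illustrate the idea (this is only a sketch, not the benchmark code from this thread): a fixed-size thread pool caps how many calls are in flight at once, and the remaining work waits in the pool's queue. The class name, the method name, and the Thread.sleep stand-in for a server round trip are all made up for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedConcurrencyDemo {

    /**
     * Submits `users` simulated queries to a fixed pool of `poolSize` threads
     * and returns the highest number of queries seen in flight at one time.
     */
    static int peakInFlight(int users, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < users; i++) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the high-water mark
                try {
                    Thread.sleep(2); // hypothetical stand-in for one server round trip
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // already-queued tasks still run to completion
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Same number of simulated users, different caps on simultaneous calls.
        System.out.println("pool of 500: peak in flight = " + peakInFlight(500, 500));
        System.out.println("pool of 64:  peak in flight = " + peakInFlight(500, 64));
    }
}
```

With the smaller pool the peak can never exceed the pool size, so the server sees less simultaneous load and each caller waits a little in the queue instead.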


> On Sep 2, 2016, at 7:06 PM, Mujtaba Chohan <[email protected]> wrote:
> 
> Single user average: Phoenix 8ms, HBase 5ms
> 50 users average: Phoenix 35ms, HBase 40ms
> 500 users average: Phoenix 300-400ms, HBase 350-450ms
> 
> A few notes:
> 
> * We have yet to identify why Phoenix showed a slight advantage with a high 
> number of concurrent users from a single client.
> 
> * For the case with 500 concurrent users from a single client, the region 
> server handler count and the Phoenix thread pool size were bumped to 500 to 
> accommodate this level of concurrency.
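
A minimal sketch of the client-side half of that setting, assuming the "Phoenix thread pool size" mentioned above is the phoenix.query.threadPoolSize property. The class and method names are made up for illustration. The region server handler count (hbase.regionserver.handler.count) is a server-side setting in hbase-site.xml and cannot be set this way.

```java
import java.util.Properties;

public class PhoenixClientProps {

    /**
     * Builds client-side connection properties sized for `concurrency`
     * simultaneous users. Assumption: the pool referred to in the thread is
     * controlled by phoenix.query.threadPoolSize.
     */
    static Properties clientProps(int concurrency) {
        Properties props = new Properties();
        props.setProperty("phoenix.query.threadPoolSize", Integer.toString(concurrency));
        return props;
    }

    public static void main(String[] args) {
        Properties props = clientProps(500);
        // These properties would then be passed to the JDBC driver, e.g.
        // DriverManager.getConnection("jdbc:phoenix:<zookeeper quorum>", props);
        System.out.println(props);
    }
}
```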
> 
>> On Friday, September 2, 2016, James Taylor <[email protected]> wrote:
>> Thanks, Mujtaba. What's the average query time for HBase and Phoenix for the 
>> 1/50/500 simultaneous user scenarios?
>> 
>> Edu - make sure to set the UPDATE_CACHE_FREQUENCY property on the table (as 
>> Mujtaba showed in his ALTER TABLE statement - you can do this in the CREATE 
>> TABLE statement as well).
>> 
>> Thanks,
>> James
>> 
>>> On Fri, Sep 2, 2016 at 5:40 PM, Mujtaba Chohan <[email protected]> wrote:
>>> Here is the graph that I get simulating 1, 50 and 500 concurrent users from 
>>> a single client. Query time for Phoenix is highly comparable with direct 
>>> HBase gets.
>>> 
>>> See the chart below with query time (ms) for random point gets over a large 
>>> table that will not fit in the HBase block cache. Queries/gets were executed 
>>> 1000 times for each user.
>>> 
>>> <image.png>
>>> Source code to execute gets/phoenix query simulating multiple users is at:
>>> 
>>>  directhbasemt.java
>>> 
>>>  directphoenixmt.java
>>> 
>>> Table DDL
>>> create table testuuid (k varchar not null primary key, a varchar, b 
>>> varchar, c varchar, d varchar, e varchar, f varchar);
>>> 
>>> alter table testuuid set "UPDATE_CACHE_FREQUENCY"=150000; // this restricts 
>>> how often the server is checked for metadata updates, which improves 
>>> performance
>>> 
>>> Table was filled with 68M rows.
>>> Phoenix 4.8/HBase 0.98.17 running on a single machine.
>>> 
>>> //mujtaba
>>> 
>>> 
>>>> On Thu, Sep 1, 2016 at 3:34 AM, Narros, Eduardo (ELS-LON) 
>>>> <[email protected]> wrote:
>>>> Hi Mujtaba,
>>>> 
>>>> 
>>>> See the answers inline below:
>>>> 
>>>> 
>>>> * How are you running Phoenix queries? We are using Apache JMeter and the 
>>>> JDBC sampler.
>>>> * Were the concurrent Phoenix queries using the same JVM? Yes.
>>>> * Was the JVM restarted after changing number of concurrent users? Yes.
>>>> * Is the response time plotted when query is executed for the first time 
>>>> or second or average of both? Average. We see response times varying 
>>>> significantly even via sqlline, i.e. the same query run 11 times 
>>>> sequentially takes anything between 17ms and around 489ms with no other 
>>>> load on the server.
>>>> * Is the UUID filtered on randomly distributed? Yes.
>>>> * Does UUID match a single row? Yes.
>>>> * It seems that even non-concurrent Phoenix query which filters on UUID 
>>>> takes 500ms in your environment. Can you try the same query in Sqlline a 
>>>> few times and see how much time it takes for each run? We ran the same 
>>>> query 11 times via sqlline and these were the response times:
>>>> 1 row selected (0.489 seconds)
>>>> 1 row selected (0.279 seconds)
>>>> 1 row selected (0.227 seconds)
>>>> 1 row selected (0.22 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.152 seconds)
>>>> 1 row selected (0.129 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.153 seconds)
>>>> 1 row selected (0.259 seconds)
>>>> 1 row selected (0.102 seconds)
>>>> 
>>>> * What is the explain plan for your Phoenix query? CLIENT 1-CHUNK PARALLEL 
>>>> 1-WAY ROUND ROBIN POINT LOOKUP ON 1 KEY OVER schema.DOCUMENTS
>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS 
>>>> table and reconnect Sqlline and execute the query again. I think the issue 
>>>> is that the response times vary a lot: with 600 concurrent users the same 
>>>> query can take anything between 2ms and 10s.
>>>> * Can you share your table schema and how you ran Phoenix queries and your 
>>>> HBase equivalent code? It is a simple table with 15 columns; the primary 
>>>> key is the uuid, which is of type VARCHAR(36). The HBase equivalent code is:
>>>>  HTableInterface hTable = pool.getTable("schema.DOCUMENTS");
>>>> 
>>>> Get get = new Get(toBytes(saltPrefix + uuid));
>>>> 
>>>> Result result = hTable.get(get);
>>>> 
>>>> * Any phoenix tuning defaults that you changed? No.
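
The eleven sqlline timings quoted above can be summarised with a few lines of arithmetic. This is only a sketch over the numbers as posted (the class and method names are made up); it makes the spread between the first, cold run and the later runs easier to see.

```java
import java.util.Arrays;

public class SqllineTimings {

    // The eleven sqlline response times quoted in the thread, in seconds.
    static final double[] SECONDS = {
        0.489, 0.279, 0.227, 0.22, 0.17, 0.152, 0.129, 0.17, 0.153, 0.259, 0.102
    };

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        // Middle element for odd n, average of the two middle elements for even n.
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    static double min(double[] xs) {
        return Arrays.stream(xs).min().orElse(Double.NaN);
    }

    static double max(double[] xs) {
        return Arrays.stream(xs).max().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.printf("min    = %.0f ms%n", min(SECONDS) * 1000);
        System.out.printf("median = %.0f ms%n", median(SECONDS) * 1000);
        System.out.printf("mean   = %.0f ms%n", mean(SECONDS) * 1000);
        System.out.printf("max    = %.0f ms%n", max(SECONDS) * 1000);
    }
}
```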
>>>> 
>>>> Kind Regards,
>>>> 
>>>> 
>>>> Edu
>>>> 
>>>> 
>>>> 
>>>>> On Wed, Aug 31, 2016 at 10:40 AM, Mujtaba Chohan <[email protected]> 
>>>>> wrote:
>>>>> Something seems inherently wrong in these test results.
>>>>> 
>>>>> * How are you running Phoenix queries? Were the concurrent Phoenix 
>>>>> queries using the same JVM? Was the JVM restarted after changing number 
>>>>> of concurrent users?
>>>>> * Is the response time plotted when query is executed for the first time 
>>>>> or second or average of both?
>>>>> * Is the UUID filtered on randomly distributed? Does UUID match a single 
>>>>> row?
>>>>> * It seems that even non-concurrent Phoenix query which filters on UUID 
>>>>> takes 500ms in your environment. Can you try the same query in Sqlline a 
>>>>> few times and see how much time it takes for each run?
>>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS 
>>>>> table and reconnect Sqlline and execute the query again.
>>>>> * Can you share your table schema and how you ran Phoenix queries and 
>>>>> your HBase equivalent code?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Wed, Aug 31, 2016 at 5:42 AM, Narros, Eduardo (ELS-LON) 
>>>>>> <[email protected]> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> 
>>>>>> We are exploring starting to use Phoenix and have done some load tests 
>>>>>> to see whether Phoenix would scale. We have noted that, compared to 
>>>>>> HBase, Phoenix response times have a much higher average as the number 
>>>>>> of concurrent users increases. We are trying to understand whether this 
>>>>>> is expected or whether there is something we are missing.
>>>>>> 
>>>>>> 
>>>>>> This is the test we have performed:
>>>>>> 
>>>>>> * Create a table (20 columns) and load it with 400 million records indexed 
>>>>>> via a column called 'uuid'.
>>>>>> * Perform the following queries using 10, 20, 100, 200, 400 and 600 users 
>>>>>> per second; each user will perform each query twice:
>>>>>>   * Phoenix: select * from schema.DOCUMENTS where uuid = ?
>>>>>>   * Phoenix: select /*+ SERIAL SMALL */* from schema.DOCUMENTS where uuid = ?
>>>>>>   * HBase equivalent to: select * from schema.DOCUMENTS where uuid = ?
>>>>>> * The results are attached and they show that Phoenix response times are 
>>>>>> at least an order of magnitude above those of HBase.
>>>>>> * The tests were run from the Master node of a CDH5.7.2 cluster with 
>>>>>> Phoenix 4.7.0.
>>>>>> 
>>>>>> Are these test results expected?
>>>>>> 
>>>>>> Kind Regards,
>>>>>> 
>>>>>> Edu
>>>>>> 
>>>>>> Elsevier Limited. Registered Office: The Boulevard, Langford Lane, 
>>>>>> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, 
>>>>>> Registered in England and Wales.
