The direct HBase client probably created 500 direct clients, whereas Phoenix may have made fewer simultaneous calls, with a little waiting, and so hit a sweeter spot for load on your configuration.
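To make that guess concrete, here is a minimal sketch of the effect, not a measurement. It makes no HBase or Phoenix calls: `Thread.sleep` stands in for one point get, and the class name, `peakInFlight`, `poolSize` and `workMillis` are all hypothetical names chosen for illustration. It only shows that a smaller pool caps how many requests are in flight at once while the remaining callers wait in the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: nothing here talks to HBase or Phoenix.
class BoundedConcurrencySketch {

    // Submits `users` simulated requests to a pool of `poolSize` threads and
    // returns the highest number of requests observed in flight at one time.
    static int peakInFlight(int users, int poolSize, long workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < users; i++) {
                pending.add(pool.submit(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(workMillis); // stand-in for one point get
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait until every simulated request has finished
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 500 simulated users: one thread each versus a smaller shared pool.
        System.out.println("pool=500 peak in flight: " + peakInFlight(500, 500, 20));
        System.out.println("pool=32  peak in flight: " + peakInFlight(500, 32, 20));
    }
}
```

The peak can never exceed the pool size, so the second call queues most of the 500 users instead of sending them all to the server at once. Whether that is what actually happened in the test is still only a hypothesis.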
> On Sep 2, 2016, at 7:06 PM, Mujtaba Chohan wrote:
>
> Single user average: Phoenix 8ms, HBase 5ms
> 50 users average: Phoenix 35ms, HBase 40ms
> 500 users average: Phoenix 300-400ms, HBase 350-450ms
>
> Few notes:
>
> * We have yet to identify why Phoenix was showing slight advantage with high
> number of concurrent users from single client.
>
> * For the case with 500 concurrent users from single client, region server
> handler count and Phoenix thread pool size was bumped to 500 to accommodate
> this level of concurrency.
>
>> On Friday, September 2, 2016, James Taylor wrote:
>> Thanks, Mujtaba. What's the average query time for HBase and Phoenix for the
>> 1/50/500 simultaneous user scenarios?
>>
>> Edu - make sure to set the UPDATE_CACHE_FREQUENCY property on the table (as
>> Mujtaba showed in his ALTER TABLE statement - you can do this in the CREATE
>> TABLE statement as well).
>>
>> Thanks,
>> James
>>
>>> On Fri, Sep 2, 2016 at 5:40 PM, Mujtaba Chohan wrote:
>>> Here is the graph that I get simulating 1, 50 and 500 concurrent users from
>>> single client. Query time for Phoenix is highly comparable with direct
>>> HBase gets.
>>>
>>> See the chart below with query time (ms) for random point gets over large
>>> table that will not fit HBase block cache. Query/gets were executed
>>> 1000 times for each user.
>>>
>>> <image.png>
>>> Source code to execute gets/phoenix query simulating multiple users is at:
>>>
>>> directhbasemt.java
>>>
>>> directphoenixmt.java
>>>
>>> Table DDL
>>> create table testuuid (k varchar not null primary key, a varchar, b
>>> varchar, c varchar, d varchar, e varchar, f varchar);
>>>
>>> alter table testuuid set "UPDATE_CACHE_FREQUENCY"=150000; // this restricts
>>> how often server will check for metadata updates to improve performance
>>>
>>> Table was filled with 68M rows.
>>> Phoenix 4.8/HBase 0.98.17 running on single machine.
>>>
>>> //mujtaba
>>>
>>>> On Thu, Sep 1, 2016 at 3:34 AM, Narros, Eduardo (ELS-LON) wrote:
>>>> Hi Mujtaba,
>>>>
>>>> See the answers inline below:
>>>>
>>>> * How are you running Phoenix queries? We are using apache-jmeter and the
>>>> jdbc sampler.
>>>> * Were the concurrent Phoenix queries using the same JVM? Yes.
>>>> * Was the JVM restarted after changing number of concurrent users? Yes.
>>>> * Is the response time plotted when query is executed for the first time
>>>> or second or average of both? Average. We see response times ranging
>>>> significantly even via sqlline. i.e. the same query run 11 times
>>>> sequentially takes anything between 17ms to around 489ms with no other
>>>> load on the server.
>>>> * Is the UUID filtered on randomly distributed? Yes.
>>>> * Does UUID match a single row? Yes.
>>>> * It seems that even non-concurrent Phoenix query which filters on UUID
>>>> takes 500ms in your environment. Can you try the same query in Sqlline a
>>>> few times and see how much time it takes for each run? We run the same
>>>> query 11 times via sqlline and these were the response times:
>>>> 1 row selected (0.489 seconds)
>>>> 1 row selected (0.279 seconds)
>>>> 1 row selected (0.227 seconds)
>>>> 1 row selected (0.22 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.152 seconds)
>>>> 1 row selected (0.129 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.153 seconds)
>>>> 1 row selected (0.259 seconds)
>>>> 1 row selected (0.102 seconds)
>>>>
>>>> * What is the explain plan for your Phoenix query? CLIENT 1-CHUNK PARALLEL
>>>> 1-WAY ROUND ROBIN POINT LOOKUP ON 1 KEY OVER schema.DOCUMENTS
>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS
>>>> table and reconnect Sqlline and execute the query again. I think the issue
>>>> is that the response times vary a lot, with 600 concurrent users the same
>>>> query can take anything between 2ms to 10s.
>>>> * Can you share your table schema and how you ran Phoenix queries and your
>>>> HBase equivalent code? It is a simple table with 15 columns, the primary
>>>> key is the uuid which is of type VARCHAR(36). The hbase equivalent code is:
>>>>
>>>>     HTableInterface hTable = pool.getTable("schema.DOCUMENTS");
>>>>     Get get = new Get(toBytes(saltPrefix + uuid));
>>>>     Result result = hTable.get(get);
>>>>
>>>> * Any phoenix tuning defaults that you changed? No.
>>>>
>>>> Kind Regards,
>>>>
>>>> Edu
>>>>
>>>>> On Wed, Aug 31, 2016 at 10:40 AM, Mujtaba Chohan wrote:
>>>>> Something seems inherently wrong in these test results.
>>>>>
>>>>> * How are you running Phoenix queries? Were the concurrent Phoenix
>>>>> queries using the same JVM? Was the JVM restarted after changing number
>>>>> of concurrent users?
>>>>> * Is the response time plotted when query is executed for the first time
>>>>> or second or average of both?
>>>>> * Is the UUID filtered on randomly distributed? Does UUID match a single
>>>>> row?
>>>>> * It seems that even non-concurrent Phoenix query which filters on UUID
>>>>> takes 500ms in your environment. Can you try the same query in Sqlline a
>>>>> few times and see how much time it takes for each run?
>>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS
>>>>> * Can you share your table schema and how you ran Phoenix queries and
>>>>> your HBase equivalent code?
>>>>>
>>>>>> On Wed, Aug 31, 2016 at 5:42 AM, Narros, Eduardo (ELS-LON) wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We are exploring starting to use Phoenix and have done some load tests
>>>>>> to see whether Phoenix would scale. We have noted that compared to
>>>>>> HBase, Phoenix response times have a much slower average as the number
>>>>>> of concurrent users increases. We are trying to understand whether this
>>>>>> is expected or there is something we are missing out.
>>>>>>
>>>>>> This is the test we have performed:
>>>>>>
>>>>>> Create table (20 columns) and load it with 400 million records indexed
>>>>>> via a column called 'uuid'.
>>>>>> Perform the following queries using 10,20,100,200,400 and 600 users per
>>>>>> second, each user will perform each query twice:
>>>>>> Phoenix: select * from schema.DOCUMENTS where uuid = ?
>>>>>> Phoenix: select /*+ SERIAL SMALL */* from schema.DOCUMENTS where uuid = ?
>>>>>> Hbase equivalent to: select * from schema.DOCUMENTS where uuid = ?
>>>>>> The results are attached and they show that Phoenix response times are
>>>>>> at least an order of magnitude above those of HBase
>>>>>> The tests were run from the Master node of a CDH5.7.2 cluster with
>>>>>> Phoenix 4.7.0.
>>>>>>
>>>>>> Are these test results expected?
>>>>>>
>>>>>> Kind Regards,
>>>>>>
>>>>>> Edu
>>>>>>
>>>>>> Elsevier Limited. Registered Office: The Boulevard, Langford Lane,
>>>>>> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084,
>>>>>> Registered in England and Wales.
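For reference, the eleven sqlline timings quoted above can be summarised with a few lines of plain Java. This is only a sketch for reading the spread of those numbers. The values are copied from the quoted message, while the class and method names are hypothetical.

```java
import java.util.Arrays;

// Summarises the eleven sqlline response times (in seconds) quoted in the thread.
class SqllineTimingSummary {

    static final double[] SECONDS = {
        0.489, 0.279, 0.227, 0.22, 0.17, 0.152, 0.129, 0.17, 0.153, 0.259, 0.102
    };

    // Arithmetic mean, or NaN for an empty array.
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Median of a copy of the input, or NaN for an empty array.
    static double median(double[] xs) {
        if (xs.length == 0) {
            return Double.NaN;
        }
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        double min = Arrays.stream(SECONDS).min().orElse(Double.NaN);
        double max = Arrays.stream(SECONDS).max().orElse(Double.NaN);
        System.out.printf("min=%.3f s, median=%.3f s, mean=%.3f s, max=%.3f s%n",
                min, median(SECONDS), mean(SECONDS), max);
    }
}
```

The mean of these runs is roughly 0.214 s and the median is 0.17 s, with the slowest run (0.489 s) being the first one.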
