[ 
https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338996#comment-15338996
 ] 

Josh Elser commented on PHOENIX-2940:
-------------------------------------

{quote}
I'd do ImmutableBytesPtr since this is the key the underlying cache is using 
and you can easily access this from PTable.getName().getBytesPtr(). Otherwise 
you end up creating a new ImmutableBytesPtr with every call to get. There's 
only a few callers of ConnectionQueryService.getTableStats(), so I'm hoping 
it's not too bad.
{quote}

Yeah, I came to that one too. Glad we're in agreement :)
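
To illustrate the keying we're agreeing on: the cache stays keyed by 
{{ImmutableBytesPtr}}, and callers reuse the pointer the {{PTable}} already 
holds via {{PTable.getName().getBytesPtr()}}, so a {{getTableStats()}} lookup 
doesn't allocate a new key. This is a minimal sketch only (stand-in classes and 
a Guava cache for illustration, not the actual patch):

{code:java}
import java.util.Arrays;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class TableStatsCacheSketch {
    // Stand-in for the Phoenix class: wraps a byte[] and caches its hash.
    static class ImmutableBytesPtr {
        private final byte[] bytes;
        private final int hash;
        ImmutableBytesPtr(byte[] bytes) {
            this.bytes = bytes;
            this.hash = Arrays.hashCode(bytes);
        }
        @Override public int hashCode() { return hash; }
        @Override public boolean equals(Object o) {
            return o instanceof ImmutableBytesPtr
                    && Arrays.equals(bytes, ((ImmutableBytesPtr) o).bytes);
        }
    }
    static class TableStats {}                      // stand-in for the cached stats value
    interface PName  { ImmutableBytesPtr getBytesPtr(); }
    interface PTable { PName getName(); }

    private final Cache<ImmutableBytesPtr, TableStats> statsCache =
            CacheBuilder.newBuilder().maximumSize(1000).build();

    // The key object already lives on the PTable, so get() allocates nothing new.
    TableStats getTableStats(PTable table) {
        return statsCache.getIfPresent(table.getName().getBytesPtr());
    }
}
{code}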

bq. I knew I was asking the right person about what would happen if we updated 
the protobuf

:)

--

The last thing I wanted to do was some performance testing. Amazingly, I 
actually got it done (at least to a degree I'm mostly happy about). I used a 
combination of [~ndimiduk]'s TPC-DS setup (which came from [~cartershanklin] 
originally) plus Apache JMeter for the concurrent read side, and Pherf for some 
write testing.

I had a 5-RS HBase instance (on some "crappy" VMs: 2 cores, 16G RAM, 1 effective 
disk), so the absolute numbers are a bit low (but the difference between master 
and the patch is what we really care about). I generated about 30G of data with 
TPC-DS and used JMeter to run a bunch of point queries (5 JMeter clients, 8 
threads per client, 1500 queries per thread, i.e. 60,000 point queries per run). 
The point queries were generated using bounded, random values, so there was a 
decent ratio of hits to misses. For these, I did 3 runs with master and 3 runs 
with this patch. Looking at the median, p90, p95, and p99 latencies across 4 
different queries, there was no significant difference in query execution. If 
anything, the 2940 patch might have been slightly faster on average than master 
(which makes sense because we should be reading the stats table less often and 
sending less data over the wire), but at this data size the difference isn't 
significant.
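
(For clarity: by median/p90/p95/p99 I just mean plain nearest-rank percentiles 
over the per-query latencies JMeter reports. A tiny sketch of that computation 
with made-up numbers, not part of the harness linked below:)

{code:java}
import java.util.Arrays;

public class LatencyPercentiles {
    // Nearest-rank percentile: p in (0, 100], latencies must be sorted and non-empty.
    static long percentile(long[] sortedLatenciesMs, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedLatenciesMs.length);
        return sortedLatenciesMs[rank - 1];
    }

    public static void main(String[] args) {
        long[] latenciesMs = {12, 15, 9, 40, 22, 18, 31, 11, 27, 95};  // made-up data
        Arrays.sort(latenciesMs);
        for (double p : new double[] {50, 90, 95, 99}) {
            System.out.println("p" + (int) p + " = " + percentile(latenciesMs, p) + " ms");
        }
    }
}
{code}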

I also ran one aggregate-style query joining the store_sales table and the date 
dimension table. The code from master was a little faster here, but I believe 
that may have been because I didn't re-compact the table after switching from 
the code in master to the code from 2940 (the restart screwed up locality for 
some reason and I had to run the balancer to redistribute the regions). In 
short, I did not observe a significant difference in concurrent reads with this 
patch.

I captured most (hopefully all) of my automation in 
https://github.com/joshelser/phoenix-performance

On the write side, I used Pherf to get some concurrent writers into HBase. 
Across the 5 nodes, I ingested pseudo-random data into a 10-column table with 5 
salt buckets to split up the load as much as possible. Each Pherf client wrote 
5M records, and all of the Pherf clients ran at the same time. The scenario 
included a validation of the ingest using a simple {{select count(..)}} on the 
primary key for the table. I performed 2 runs of this on both master and the 
2940 patch (currently finishing up run 2 on master, but I don't expect a 
difference). Performance appears to be roughly equivalent across master and the 
2940 patch.
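
(The validation step is just a row count compared against the expected total; a 
minimal hand-rolled equivalent is below. The JDBC URL, table and column names, 
and the client count are placeholders, not the actual scenario values.)

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class IngestValidation {
    public static void main(String[] args) throws Exception {
        // e.g. 5 Pherf clients x 5M rows each; adjust to the actual scenario.
        long expected = 5 * 5_000_000L;
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(PK_COL) FROM WRITE_TEST_TABLE")) {
            rs.next();
            long actual = rs.getLong(1);
            System.out.println("expected=" + expected + ", actual=" + actual
                    + (expected == actual ? " (OK)" : " (MISMATCH)"));
        }
    }
}
{code}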

I do have the numbers here if anyone is curious about them, but, IMO, the lack 
of a significant difference between master and this patch is what I wanted to 
see to be more certain that we aren't introducing any dumb performance 
regressions. I will double-check the last comments from James again with fresh 
eyes and then try to commit this tomorrow morning (FYI for 4.8 
[~ankit.singhal]).

> Remove STATS RPCs from rowlock
> ------------------------------
>
>                 Key: PHOENIX-2940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2940
>             Project: Phoenix
>          Issue Type: Improvement
>         Environment: HDP 2.3 + Apache Phoenix 4.6.0
>            Reporter: Nick Dimiduk
>            Assignee: Josh Elser
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch, 
> PHOENIX-2940.003.patch, PHOENIX-2940.004.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs 
> while holding a row lock. This problem is discussed in detail on the user 
> list thread ["Write path blocked by MetaDataEndpoint acquiring region 
> lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
>  During some situations, the 
> [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
>  coprocessor will attempt to refresh its view of the schema definitions and 
> statistics. This involves [taking a 
> rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
>  executing a scan against the [local 
> region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
>  and then a scan against a [potentially 
> remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
>  statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps 
> (in my case, the use of the ROW_TIMESTAMP feature, or perhaps as in 
> PHOENIX-2607). When combined with other issues (PHOENIX-2939), we end up with 
> total gridlock in our handler threads -- everyone queued behind the rowlock, 
> scanning and rescanning SYSTEM.STATS. Because this happens in the 
> MetaDataEndpoint, the means by which all clients refresh their knowledge of 
> schema, gridlock in that RS can effectively stop all forward progress on the 
> cluster.
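
To make the shape of the problem (and, schematically, the fix this patch is 
after) concrete, here is a rough sketch with stand-in types rather than the 
actual MetaDataEndpoint code: the problematic path scans the potentially remote 
SYSTEM.STATS table while the row lock is held, so other handlers queue behind 
that RPC; keeping only local work under the lock avoids that.

{code:java}
public class RowLockSketch {
    // Stand-ins for illustration, not HBase/Phoenix APIs.
    interface RowLock extends AutoCloseable { @Override void close(); }
    interface Region { RowLock lockRow(byte[] rowKey); Object scanLocalSchema(byte[] rowKey); }
    interface StatsTable { Object scanStats(byte[] physicalName); }   // potentially a remote RPC

    // Problematic: a (possibly remote) stats scan happens while the row lock is held,
    // so every handler that needs this schema row waits on that RPC.
    static Object buildTableWithStatsUnderLock(Region region, StatsTable stats, byte[] key) {
        try (RowLock lock = region.lockRow(key)) {
            Object schema = region.scanLocalSchema(key);   // local scan: fine under the lock
            Object guidePosts = stats.scanStats(key);      // remote scan: blocks other handlers
            return new Object[] {schema, guidePosts};
        }
    }

    // Schematic fix: only local work under the lock; stats are fetched (or read
    // from a cache) after the lock is released.
    static Object buildTableWithStatsOutsideLock(Region region, StatsTable stats, byte[] key) {
        Object schema;
        try (RowLock lock = region.lockRow(key)) {
            schema = region.scanLocalSchema(key);
        }
        Object guidePosts = stats.scanStats(key);          // no row lock held here
        return new Object[] {schema, guidePosts};
    }
}
{code}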



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
