[
https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338996#comment-15338996
]
Josh Elser commented on PHOENIX-2940:
-------------------------------------
{quote}
I'd do ImmutableBytesPtr since this is the key the underlying cache is using
and you can easily access this from PTable.getName().getBytesPtr(). Otherwise
you end up creating a new ImmutableBytesPtr with every call to get. There's
only a few callers of ConnectionQueryService.getTableStats(), so I'm hoping
it's not too bad.
{quote}
Yeah, I came to that one too. Glad we're in agreement :)
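To make that concrete, here is a rough sketch of a stats cache keyed by {{ImmutableBytesPtr}} (not the actual patch; the Guava cache, the sizing/expiry numbers, and {{PTableStats}} as the value type are just illustrative), reusing the key the {{PTable}} already carries:
{code:java}
// Illustrative sketch only, not the actual patch: key the client-side stats cache by
// ImmutableBytesPtr so the key from PTable.getName().getBytesPtr() is reused directly
// instead of allocating a new one on every getTableStats() call.
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.apache.phoenix.hbase.index.util.ImmutableBytesPtr;
import org.apache.phoenix.schema.PTable;
import org.apache.phoenix.schema.stats.PTableStats;

public class StatsCacheSketch {
    // Size and expiry are made-up numbers for the example.
    private final Cache<ImmutableBytesPtr, PTableStats> statsCache = CacheBuilder.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .build();

    public PTableStats getTableStats(PTable table) {
        // No new ImmutableBytesPtr is created here; the PTable's name already has one.
        return statsCache.getIfPresent(table.getName().getBytesPtr());
    }
}
{code}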
bq. I knew I was asking the right person about what would happen if we updated
the protobuf
:)
--
The last thing I wanted to do was some performance testing. Amazingly, I
actually got it done (at least to a degree I'm mostly happy with). I used a
combination of [~ndimiduk]'s TPC-DS tooling (which came from [~cartershanklin]
originally) plus Apache JMeter for the concurrent read side, and Pherf for some
write testing.
I had a 5-RS HBase instance (on some "crappy" VMs: 2 cores, 16G RAM, 1 effective
disk), so the numbers are a bit low (but the difference between them is what we
care about more). I generated about 30G of data with TPC-DS and used JMeter
to run a bunch of point queries (5 JMeter clients, 8 threads per client, 1500
queries per thread). The point queries were generated using bounded, random
values, so there was a decent ratio of hits to misses. For these, I did 3 runs
with master and 3 runs with this patch. Looking at p90, p95, p99, and median
latencies across 4 different queries, there was not a significant difference in
the execution of the queries. If anything, the 2940 patch might have been
slightly faster on average than master (which makes sense because we should be
reading the stats table less often and sending less data over the wire, but
given the data size, the difference isn't significant).
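For reference, the point queries were roughly of this shape (a sketch only: the table/column names and key bounds are illustrative rather than the exact TPC-DS schema or the exact bounds used, and the real load was driven by JMeter, not hand-written code):
{code:java}
// Sketch of the kind of point query each JMeter thread issued: bounded, random key values,
// so that some lookups hit and some miss. Names and bounds are illustrative.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.concurrent.ThreadLocalRandom;

public class PointQuerySketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT * FROM STORE_SALES WHERE SS_ITEM_SK = ? AND SS_TICKET_NUMBER = ?")) {
            for (int i = 0; i < 1500; i++) { // 1500 queries per thread
                ps.setLong(1, ThreadLocalRandom.current().nextLong(1, 200_000));
                ps.setLong(2, ThreadLocalRandom.current().nextLong(1, 2_000_000));
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // Consume the row; both hits and misses are useful for the test.
                    }
                }
            }
        }
    }
}
{code}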
I also ran one aggregate style query between the store_sales table and the date
dimension table. The code from master was a little faster here, but I believe
this may have been because I didn't re-compact the table after switching from
the code in master to the code from 2940 (the restart screwed up locality for
some reason and I had to run the balancer to redistribute the regions). In
short, I did not observe a significant difference in concurrent reads with this
patch.
I captured most (hopefully all) of my automation in
https://github.com/joshelser/phoenix-performance
On the write side, I used Pherf to get some concurrent writers into HBase.
Across the 5 nodes, I ingested pseudo-random data into a 10-column table with 5
salt buckets to split up the load as much as possible. Each Pherf client wrote
5M records, and all of the clients ran at the same time. The scenario includes a
validation of the ingest using a simple {{select count(..)}} on the primary key
for the table. I performed 2 runs of this on both master and the 2940 patch
(currently finishing up run 2 on master, but I don't expect a difference).
Performance appears to be pretty equivalent across both master and the 2940
patch.
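To give a sense of the shape of that write scenario (the table and column names here are illustrative, and the real run is driven by a Pherf scenario rather than hand-written JDBC):
{code:java}
// Sketch of the write-side scenario's shape: a salted, 10-column table plus the simple
// count-based validation over the primary key. Names are illustrative.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class WriteScenarioSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host");
             Statement stmt = conn.createStatement()) {
            // 10 columns total, pre-split across 5 salt buckets to spread load over the 5 RS.
            stmt.execute("CREATE TABLE IF NOT EXISTS PHERF_LOAD ("
                + " ID VARCHAR NOT NULL PRIMARY KEY,"
                + " C1 VARCHAR, C2 VARCHAR, C3 VARCHAR, C4 VARCHAR, C5 VARCHAR,"
                + " C6 VARCHAR, C7 VARCHAR, C8 VARCHAR, C9 VARCHAR"
                + ") SALT_BUCKETS = 5");
            // Validation step: count over the primary key once the concurrent upserts finish.
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(ID) FROM PHERF_LOAD")) {
                rs.next();
                System.out.println("rows ingested: " + rs.getLong(1));
            }
        }
    }
}
{code}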
I do have the numbers here if anyone is curious about them, but, IMO, the lack
of a significant difference between master and this patch is what I wanted to
see to be more certain that we aren't introducing some dumb performance
regression. I will double-check the last comments from James again with fresh
eyes and then try to commit this tomorrow morning (FYI for 4.8
[~ankit.singhal]).
> Remove STATS RPCs from rowlock
> ------------------------------
>
> Key: PHOENIX-2940
> URL: https://issues.apache.org/jira/browse/PHOENIX-2940
> Project: Phoenix
> Issue Type: Improvement
> Environment: HDP 2.3 + Apache Phoenix 4.6.0
> Reporter: Nick Dimiduk
> Assignee: Josh Elser
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch,
> PHOENIX-2940.003.patch, PHOENIX-2940.004.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs
> while holding a row lock. This problem is discussed in detail on the user
> list thread ["Write path blocked by MetaDataEndpoint acquiring region
> lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
> During some situations, the
> [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
> coprocessor will attempt to refresh its view of the schema definitions and
> statistics. This involves [taking a
> rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
> executing a scan against the [local
> region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
> and then a scan against a [potentially
> remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
> statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps
> (in my case, the use of the ROW_TIMESTAMP feature, or perhaps as in
> PHOENIX-2607). When combined with other issues (PHOENIX-2939), we end up with
> total gridlock in our handler threads -- everyone queued behind the rowlock,
> scanning and rescanning SYSTEM.STATS. Because this happens in the
> MetaDataEndpoint, the means by which all clients refresh their knowledge of
> schema, gridlock in that RS can effectively stop all forward progress on the
> cluster.