[ 
https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336758#comment-15336758
 ] 

Josh Elser commented on PHOENIX-2940:
-------------------------------------

{quote}
Change ConnectionQueryServices.invalidateStats(), 
ConnectionQueryServicesImpl.addTableStats(), 
ConnectionQueryServicesImpl.getTableStats(), and TableStatsCache.put() to all 
be consistent and use ImmutableBytesPtr as the arg as it's possible you'd want 
to get the stats without having a PTable.
Remove TableStatsCache.put(PTable).
{quote}

Replacing with {{byte[]}} or {{ImmutableBytesPtr}}? I see {{byte[]}} primarily 
in use by {{ConnectionQueryServices}}. Unless I hear otherwise from ya, I'll go 
the 'consistency with what's already there' route :)
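
For what it's worth, one reason {{ImmutableBytesPtr}} (rather than a raw 
{{byte[]}}) works well as a cache key is that Java arrays use identity-based 
{{equals()}}/{{hashCode()}}, so two equal table-name arrays would never hit the 
same cache entry. A quick self-contained illustration (plain Java, not Phoenix 
code):

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ByteArrayKeyDemo {
    public static void main(String[] args) {
        Map<byte[], String> statsCache = new HashMap<>();
        statsCache.put("MY_TABLE".getBytes(StandardCharsets.UTF_8),
            "guideposts for MY_TABLE");

        // An equal-but-distinct array misses the entry, because byte[]
        // inherits identity-based equals()/hashCode() from Object.
        System.out.println(statsCache.get(
            "MY_TABLE".getBytes(StandardCharsets.UTF_8))); // prints: null
    }
}
{code}

Wrapping the key in a class with value-based equality (which is what 
{{ImmutableBytesPtr}} gives us) sidesteps that foot-gun.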

bq. Would it be possible to remove repeated PTableStats guidePosts = 12 from 
phoenix-protocol/src/main/PTable.proto without affecting b/w compat?

Older client talking to newer server: The server would send a PTable from the 
cache without the stats field, so the client would just think the stats are 
missing. The old client would construct a PTableStatsImpl with an empty list of 
guideposts.

Newer client talking to older server: The client would ignore the stats sent in 
the PTable protobuf and query them on its own.

So, the only concern I can think of is preventing any future use of the tag 
number {{12}} in PTable. If that were to happen in some later Phoenix release, 
it could break older clients. The protobuf 2 docs actually have a section on 
this:

{quote}
Non-required fields can be removed, as long as the tag number is not used again 
in your updated message type. You may want to rename the field instead, perhaps 
adding the prefix "OBSOLETE_", or make the tag reserved, so that future users 
of your .proto can't accidentally reuse the number. 
{quote}

I could remove it and leave a big-fat-warning not to reuse the number 12 (and 
we'd just need to keep an eye out for it in code reviews for a few releases to 
prevent someone from trying to be smart). How does that strike you?
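
Concretely, I'm picturing the removal leaving something like this behind in 
{{PTable.proto}} (just a sketch, not the actual patch):

{code}
// WARNING: tag 12 was formerly "repeated PTableStats guidePosts = 12;"
// and was removed in 4.8.0. Do NOT reuse tag number 12 for a new field;
// an old client would parse that tag as stats and misinterpret the data.
{code}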

> Remove STATS RPCs from rowlock
> ------------------------------
>
>                 Key: PHOENIX-2940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2940
>             Project: Phoenix
>          Issue Type: Improvement
>         Environment: HDP 2.3 + Apache Phoenix 4.6.0
>            Reporter: Nick Dimiduk
>            Assignee: Josh Elser
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch, 
> PHOENIX-2940.003.patch, PHOENIX-2940.004.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs 
> while holding a row lock. This problem is discussed in detail on the user 
> list thread ["Write path blocked by MetaDataEndpoint acquiring region 
> lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
>  In some situations, the 
> [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
>  coprocessor will attempt to refresh its view of the schema definitions and 
> statistics. This involves [taking a 
> rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
>  executing a scan against the [local 
> region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
>  and then a scan against a [potentially 
> remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
>  statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps 
> (in my case, the use of the ROW_TIMESTAMP feature, or perhaps as in 
> PHOENIX-2607). When combined with other issues (PHOENIX-2939), we end up with 
> total gridlock in our handler threads -- everyone queued behind the rowlock, 
> scanning and rescanning SYSTEM.STATS. Because this happens in the 
> MetaDataEndpoint, the means by which all clients refresh their knowledge of 
> schema, gridlock in that RS can effectively stop all forward progress on the 
> cluster.


