[
https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321456#comment-15321456
]
Hadoop QA commented on PHOENIX-2940:
------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12809018/PHOENIX-2940.002.patch
against master branch at commit b2e3018c8799f326f453983f66eaf6c4291acd0f.
ATTACHMENT ID: 12809018
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 10 new
or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated
42 warning messages.
{color:red}-1 release audit{color}. The applied patch generated 7 release
audit warnings (more than the master's current 0 warnings).
{color:red}-1 lineLengths{color}. The patch introduces the following lines
longer than 100:
+ // Avoid querying the stats table because we're holding the rowLock here. Issuing an RPC to a remote
+ long scn = context.getConnection().getSCN() == null ? Long.MAX_VALUE : context.getConnection().getSCN();
+ PTableStats tableStats = context.getConnection().getQueryServices().getTableStats(table.getName().getBytes(), scn);
+ GuidePostsInfo gpsInfo = tableStats.getGuidePosts().get(SchemaUtil.getEmptyColumnFamily(table));
+ tableStats = useStats() ? context.getConnection().getQueryServices().getTableStats(physicalTableName, currentSCN) : PTableStats.EMPTY_STATS;
+ PTableStats stats = StatisticsUtil.readStatistics(statsHTable, physicalName, clientTimeStamp);
+ PTableStats stats = connection.getQueryServices().getTableStats(Bytes.toBytes(physicalName), getCurrentScn());
+ // Reference check -- we might not have gotten any stats. This is what will happen if we fail to acquire stats
+ byte[] physicalSchemaName = Bytes.toBytes(SchemaUtil.getSchemaNameFromFullName(physicalName));
+ byte[] physicalTableName = Bytes.toBytes(SchemaUtil.getTableNameFromFullName(physicalName));
{color:red}-1 core tests{color}. The patch failed these unit tests:
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.CsvBulkLoadToolIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.rpc.PhoenixServerRpcIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificTablesDMLIT
Test results:
https://builds.apache.org/job/PreCommit-PHOENIX-Build/390//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-PHOENIX-Build/390//artifact/patchprocess/patchReleaseAuditWarnings.txt
Javadoc warnings:
https://builds.apache.org/job/PreCommit-PHOENIX-Build/390//artifact/patchprocess/patchJavadocWarnings.txt
Console output:
https://builds.apache.org/job/PreCommit-PHOENIX-Build/390//console
This message is automatically generated.
> Remove STATS RPCs from rowlock
> ------------------------------
>
> Key: PHOENIX-2940
> URL: https://issues.apache.org/jira/browse/PHOENIX-2940
> Project: Phoenix
> Issue Type: Improvement
> Environment: HDP 2.3 + Apache Phoenix 4.6.0
> Reporter: Nick Dimiduk
> Assignee: Josh Elser
> Fix For: 4.9.0
>
> Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs
> while holding a row lock. This problem is discussed in detail on the user
> list thread ["Write path blocked by MetaDataEndpoint acquiring region
> lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
> In some situations, the
> [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
> coprocessor will attempt to refresh its view of the schema definitions and
> statistics. This involves [taking a
> rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
> executing a scan against the [local
> region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
> and then a scan against a [potentially
> remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
> statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps
> (in my case, the use of the ROW_TIMESTAMP feature, or perhaps as in
> PHOENIX-2607). When combined with other issues (PHOENIX-2939), we end up with
> total gridlock in our handler threads -- everyone queued behind the rowlock,
> scanning and rescanning SYSTEM.STATS. Because this happens in the
> MetaDataEndpoint, the means by which all clients refresh their knowledge of
> schema, gridlock in that RS can effectively stop all forward progress on the
> cluster.
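The fix the patch drives at can be sketched independently of Phoenix's internals: move the (potentially remote) stats read out from under the row lock, so the lock is held only for local work. The sketch below uses hypothetical names (`fetchStatsRpc`, `refreshOutsideLock`); it is an illustration of the locking pattern, not the actual MetaDataEndpointImpl code.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch (hypothetical names): contrast issuing an RPC while
// holding a row lock with fetching first and locking only to publish.
public class StatsOutsideLock {
    private final ReentrantLock rowLock = new ReentrantLock();
    private volatile String cachedStats = "EMPTY_STATS";

    // Stands in for a remote scan of SYSTEM.STATS; in the real system this
    // can block for a long time, so it must not run under rowLock.
    String fetchStatsRpc() {
        return "guideposts:v1";
    }

    // Anti-pattern from the report: every waiter on rowLock queues behind
    // this remote call, gridlocking the handler threads.
    String refreshUnderLock() {
        rowLock.lock();
        try {
            cachedStats = fetchStatsRpc(); // RPC while holding the lock
            return cachedStats;
        } finally {
            rowLock.unlock();
        }
    }

    // Fixed shape: do the RPC first, then take the lock briefly to publish
    // the result, so lock hold time is bounded by local work only.
    String refreshOutsideLock() {
        String fresh = fetchStatsRpc(); // no lock held during the RPC
        rowLock.lock();
        try {
            cachedStats = fresh;
            return cachedStats;
        } finally {
            rowLock.unlock();
        }
    }
}
```

Both methods return the same stats; the difference is only how long `rowLock` is held, which is what determines whether other handler threads stall.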
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)