Chinmay Kulkarni created PHOENIX-5940:
-----------------------------------------
Summary: Pre-4.15 client cannot connect to 4.15+ server after
SYSTEM.CATALOG region has split
Key: PHOENIX-5940
URL: https://issues.apache.org/jira/browse/PHOENIX-5940
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.14.3
Reporter: Chinmay Kulkarni
Fix For: 4.16.0
Steps to repro:
# Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default
setting for splitting SYSTEM.CATALOG, i.e. phoenix.system.catalog.splittable=true
# Connect with a 4.15+ client and create enough tables/views/indices to cause
the SYSTEM.CATALOG region to split
# Now connect with any pre-4.15 client, such as 4.14.3. Getting a connection
fails with the following stack trace:
{noformat}
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
    at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
    ... 9 more
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
    ... 13 more
20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
    at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
    ... 9 more
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
    at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
    at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
    at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
    at org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
    at sqlline.Commands.connect(Commands.java:1064)
    at sqlline.Commands.connect(Commands.java:996)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
    at sqlline.SqlLine.dispatch(SqlLine.java:809)
    at sqlline.SqlLine.initArgs(SqlLine.java:588)
    at sqlline.SqlLine.begin(SqlLine.java:661)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:291)
{noformat}
RS logs for the region throwing the error:
{noformat}
2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
2020-06-04 19:14:18,656 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service: ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
    at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
    ... 9 more
{noformat}
This happens because a pre-4.15 client, inside
ConnectionQueryServicesImpl.checkClientServerCompatibility, invokes
MetaDataEndpointImpl#getVersion on *all SYSTEM.CATALOG regions* (it passes null
for both startKey and endKey); see
[this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].
Inside MetaDataEndpointImpl#getVersion, we [call
doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224].
If SYSTEM.CATALOG has split, getVersion is therefore also invoked on a region
that does not contain the header row for SYSTEM.CATALOG, and it fails in
MetaDataEndpointImpl#doGetTable
[here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].
4.15+ clients avoid this because they restrict the getVersion invocation to the
region containing the header row for SYSTEM.CATALOG (see
[this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
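To make the difference between the two client behaviours concrete, here is a minimal toy model. It is not Phoenix code: the class, the String keys, the split point "M" and the "4.15" return value are all assumptions made for illustration (the real keys are byte arrays and the real call goes through the HBase coprocessor service).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the failure. NOT Phoenix code; every name here is hypothetical.
class GetVersionSplitDemo {
    // Stand-in for the SYSTEM.CATALOG header row key (the real key is a byte[]).
    static final String HEADER_ROW = "SYSTEM.CATALOG";

    // A region covering [startKey, endKey); an empty endKey means "unbounded".
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }

        // Mimics getVersion -> doGetTable: fails when the header row lives elsewhere.
        String getVersion() {
            if (!contains(HEADER_ROW)) {
                throw new IllegalStateException(
                        "table not present on region tableName=" + HEADER_ROW);
            }
            return "4.15";
        }
    }

    // Pre-4.15 style: null start/end key, so every region is called.
    static List<String> getVersionOnAllRegions(List<Region> regions) {
        List<String> versions = new ArrayList<>();
        for (Region region : regions) {
            versions.add(region.getVersion());
        }
        return versions;
    }

    // 4.15+ style: only the region holding the header row is called.
    static List<String> getVersionOnHeaderRegion(List<Region> regions) {
        List<String> versions = new ArrayList<>();
        for (Region region : regions) {
            if (region.contains(HEADER_ROW)) {
                versions.add(region.getVersion());
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // SYSTEM.CATALOG after one split at the made-up key "M".
        List<Region> split = Arrays.asList(new Region("", "M"), new Region("M", ""));
        System.out.println("4.15+ style: " + getVersionOnHeaderRegion(split));
        try {
            getVersionOnAllRegions(split);
        } catch (IllegalStateException e) {
            System.out.println("pre-4.15 style failed: " + e.getMessage());
        }
    }
}
```

With a single unsplit region both methods succeed, which matches the observation that the problem only appears once SYSTEM.CATALOG has split.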
To fix this, MetaDataEndpointImpl needs a special case for pre-4.15 clients, so
that this error is not propagated back to them when getVersion runs on a region
that does not contain the header row.
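One possible shape for that special case, as a rough sketch only and not the actual patch: the class and method names, the version encoding and the boolean parameter below are assumptions for illustration, and the real check would live inside MetaDataEndpointImpl#getVersion and use Phoenix's own version constants.

```java
// Hypothetical sketch of the proposed server-side special case. NOT the actual patch.
class PreSplittableClientGuard {
    // Assumed packing of major.minor.patch into one int, for illustration only.
    static int encodeVersion(int major, int minor, int patch) {
        return (major << 16) | (minor << 8) | patch;
    }

    // First client version that restricts getVersion to the header-row region.
    static final int MIN_HEADER_ROW_ONLY_CLIENT = encodeVersion(4, 15, 0);

    // Suppress the "table not present on region" error only when both hold:
    // the region does not hold the header row, and the client is older than 4.15.
    static boolean shouldSuppressError(int clientVersion, boolean regionHoldsHeaderRow) {
        return !regionHoldsHeaderRow && clientVersion < MIN_HEADER_ROW_ONLY_CLIENT;
    }

    public static void main(String[] args) {
        System.out.println(shouldSuppressError(encodeVersion(4, 14, 3), false)); // true
        System.out.println(shouldSuppressError(encodeVersion(4, 15, 0), false)); // false
        System.out.println(shouldSuppressError(encodeVersion(4, 14, 3), true));  // false
    }
}
```

Keeping the condition this narrow means a 4.15+ client that somehow reaches the wrong region, or any failure on the header-row region itself, would still surface the error.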
--
This message was sent by Atlassian Jira
(v8.3.4#803005)