Chinmay Kulkarni created PHOENIX-5940:
-----------------------------------------

             Summary: Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region has split
                 Key: PHOENIX-5940
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5940
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.14.3
            Reporter: Chinmay Kulkarni
             Fix For: 4.16.0


Steps to reproduce:
 # Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default setting for splitting SYSTEM.CATALOG, i.e. phoenix.system.catalog.splittable=true
 # Connect with a 4.15+ client and create enough tables/views/indices to cause the SYSTEM.CATALOG region to split
 # Now connect with any pre-4.15 client, such as 4.14.3. Getting a connection will fail with the following stack trace:

{noformat}
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
        at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
        ... 9 more

        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
        ... 13 more
20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
        at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
        ... 9 more

        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
        at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
        at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
        at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
        at org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
        at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
        at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
        at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
        at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
        at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
        at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
        at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
        at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
        at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
        at sqlline.Commands.connect(Commands.java:1064)
        at sqlline.Commands.connect(Commands.java:996)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
        at sqlline.SqlLine.dispatch(SqlLine.java:809)
        at sqlline.SqlLine.initArgs(SqlLine.java:588)
        at sqlline.SqlLine.begin(SqlLine.java:661)
        at sqlline.SqlLine.start(SqlLine.java:398)
        at sqlline.SqlLine.main(SqlLine.java:291)
{noformat}
RS logs for the region throwing the error:
{noformat}
2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
2020-06-04 19:14:18,656 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service: ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
        at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
        ... 9 more
{noformat}

This happens because a pre-4.15 client, inside ConnectionQueryServicesImpl (CQSI).checkClientServerCompatibility, invokes the getVersion method on MetaDataEndpointImpl over *all SYSTEM.CATALOG regions* (it passes in null for both startKey and endKey); see 
[this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].

Inside MetaDataEndpointImpl#getVersion, we [call 
doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224].
If SYSTEM.CATALOG has split, getVersion is also invoked on a region that does not contain the header row for SYSTEM.CATALOG, so the call fails in MetaDataEndpointImpl#doGetTable 
[here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].

4.15+ clients avoid this because they restrict the getVersion invocation to the region containing the header row for SYSTEM.CATALOG (see 
[this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
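
The difference between the two client behaviours can be illustrated with a small toy model. This is only a sketch: CatalogRegionModel, Region, select and failingCalls are hypothetical names and are not Phoenix or HBase APIs, and the string keys ("SYSTEM.CATALOG" as the header row, "T" as the split point) are simplified stand-ins for the real binary row keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these classes are illustrative and are not Phoenix or HBase APIs.
final class CatalogRegionModel {

    /** A region covering [startKey, endKey); an empty key means "unbounded". */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String rowKey) {
            boolean afterStart = startKey.isEmpty() || rowKey.compareTo(startKey) >= 0;
            boolean beforeEnd = endKey.isEmpty() || rowKey.compareTo(endKey) < 0;
            return afterStart && beforeEnd;
        }
    }

    /**
     * Picks the regions an endpoint call would reach.
     * A null rowKey models the pre-4.15 call (null startKey and endKey: every region);
     * a non-null rowKey models the 4.15+ call (only the region holding that row).
     */
    static List<Region> select(List<Region> regions, String rowKey) {
        List<Region> selected = new ArrayList<>();
        for (Region region : regions) {
            if (rowKey == null || region.contains(rowKey)) {
                selected.add(region);
            }
        }
        return selected;
    }

    /** Counts selected regions on which a lookup of the header row would fail. */
    static int failingCalls(List<Region> selected, String headerRowKey) {
        int failures = 0;
        for (Region region : selected) {
            if (!region.contains(headerRowKey)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Assumed layout: the catalog has split once, at the hypothetical key "T".
        List<Region> regions = Arrays.asList(new Region("", "T"), new Region("T", ""));
        String headerRow = "SYSTEM.CATALOG"; // simplified stand-in for the header row key

        List<Region> oldClient = select(regions, null);
        List<Region> newClient = select(regions, headerRow);

        System.out.println("pre-4.15 style: " + oldClient.size() + " calls, "
                + failingCalls(oldClient, headerRow) + " failing");
        System.out.println("4.15+ style: " + newClient.size() + " calls, "
                + failingCalls(newClient, headerRow) + " failing");
    }
}
```

In this model the unrestricted call reaches both regions and fails on the one without the header row, while the restricted call reaches only the region that holds it.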

We need to add a special condition inside MetaDataEndpointImpl that accounts for 
pre-4.15 clients before propagating the error back to the client.
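
One possible shape for such a condition is sketched below. It is not the actual fix: GetVersionGuard, isPre415 and shouldPropagateError are hypothetical names, and the sketch assumes the server can tell both the client's version string and whether the current region holds the SYSTEM.CATALOG header row.

```java
// Hypothetical sketch: these names are illustrative and are not Phoenix APIs.
final class GetVersionGuard {

    /** Keeps only the leading digits of a version component, e.g. "16-SNAPSHOT" -> 16. */
    private static int leadingNumber(String component) {
        return Integer.parseInt(component.replaceAll("[^0-9].*", ""));
    }

    /** Reports whether a "major.minor[.patch]" version string is older than 4.15. */
    static boolean isPre415(String clientVersion) {
        String[] parts = clientVersion.split("\\.");
        int major = leadingNumber(parts[0]);
        int minor = leadingNumber(parts[1]);
        return major < 4 || (major == 4 && minor < 15);
    }

    /**
     * Decides whether a failed header-row lookup should be reported to the caller.
     * Assumption: a pre-4.15 client is expected to reach regions without the header
     * row, so a miss on such a region is tolerated for it and reported otherwise.
     */
    static boolean shouldPropagateError(boolean regionHasHeaderRow, String clientVersion) {
        if (regionHasHeaderRow) {
            return true; // the row should have been found here, so the failure is genuine
        }
        return !isPre415(clientVersion);
    }

    public static void main(String[] args) {
        System.out.println(shouldPropagateError(false, "4.14.3"));
        System.out.println(shouldPropagateError(false, "4.16.0-SNAPSHOT"));
    }
}
```

The guard only suppresses the error in the one case the issue describes (an old client reaching a region without the header row); every other failure is still reported.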



