[
https://issues.apache.org/jira/browse/PHOENIX-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xinyi Yan reassigned PHOENIX-5940:
----------------------------------
Assignee: Xinyi Yan
> Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region
> has split
> ------------------------------------------------------------------------------------
>
> Key: PHOENIX-5940
> URL: https://issues.apache.org/jira/browse/PHOENIX-5940
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.14.3
> Reporter: Chinmay Kulkarni
> Assignee: Xinyi Yan
> Priority: Blocker
> Fix For: 4.16.0
>
>
> Steps to repro:
> # Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default
> setting for splitting SYSTEM.CATALOG i.e.
> _phoenix.system.catalog.splittable=true_
> # Connect with a 4.15+ client and create enough tables/views/indices to
> cause the SYSTEM.CATALOG region to split. You may want to set the following
> server-side configs for a quicker repro:
> ## _hbase.hregion.memstore.flush.size=1048576_ i.e. 1MB (to flush memstores
> quicker),
> ## _hbase.hregion.max.filesize=2097152_ i.e. 2MB (so we don't have to load
> too much data to cause a region split),
> ## _hbase.zookeeper.property.maxClientCnxns=-1_ (If we don’t set this, the
> default limit is easily hit when creating hundreds of views),
> ## _hbase.table.sanity.checks=false_ (otherwise HBase complains that the
> HRegion max file size config is too small).
> # With these configs, I've found that creating ~4000 views is sufficient to
> cause the SYSTEM.CATALOG region to split.
> # Now connect with any pre-4.15 client like 4.14.3. Getting a connection
> will fail with the following stack trace:
> {noformat}
> Caused by:
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR
> 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> at
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> at
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> at
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> at
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> ... 9 more
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
> at
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
> ... 13 more
> 20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for
> row
> java.util.concurrent.ExecutionException:
> org.apache.hadoop.hbase.DoNotRetryIOException:
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013
> (INT15): MetadataEndpointImpl doGetTable called for table not present on
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> at
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> at
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> at
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> at
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> ... 9 more
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
> at
> org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
> at
> org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
> at
> org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
> at
> org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
> at
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> at
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
> at
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
> at
> org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
> at
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
> at
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
> at
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
> at
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
> at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
> at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
> at sqlline.Commands.connect(Commands.java:1064)
> at sqlline.Commands.connect(Commands.java:996)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
> at sqlline.SqlLine.dispatch(SqlLine.java:809)
> at sqlline.SqlLine.initArgs(SqlLine.java:588)
> at sqlline.SqlLine.begin(SqlLine.java:661)
> at sqlline.SqlLine.start(SqlLine.java:398)
> at sqlline.SqlLine.main(SqlLine.java:291)
> {noformat}
> RS logs for the region throwing the error:
> {noformat}
> 2020-06-04 19:14:18,655 ERROR
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704]
> coprocessor.MetaDataEndpointImpl: loading system catalog table inside
> getVersion failed
> java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable
> called for table not present on region tableName=SYSTEM.CATALOG
> at
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> at
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> at
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> 2020-06-04 19:14:18,656 DEBUG
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer:
> RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service:
> ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013
> (INT15): MetadataEndpointImpl doGetTable called for table not present on
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> at
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> at
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> at
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> at
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> at
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> ... 9 more
> {noformat}
> This happens because a pre-4.15 client, inside
> CQSI.checkClientServerCompatibility, invokes the getVersion method of
> MetaDataEndpointImpl on *all SYSTEM.CATALOG regions* (it passes null for
> both startKey and endKey); see
> [this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].
> Inside MetaDataEndpointImpl#getVersion, we [call
> doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224].
> If SYSTEM.CATALOG has split, getVersion is also invoked on a region
> that does not contain the header row for SYSTEM.CATALOG, causing it to fail in
> MetaDataEndpointImpl#doGetTable
> [here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].
> This is avoided in 4.15+ clients since we have restricted the getVersion
> invocation to the region containing the header row for SYSTEM.CATALOG (see
> [this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
> We need to add a special condition inside MetaDataEndpointImpl that accounts
> for pre-4.15 clients before the error is propagated back to the client.
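
The failure mode and the proposed special condition can be illustrated with a small, self-contained Java sketch. It is not Phoenix code: the Region class, the Outcome enum, the getVersion signature, the split point "\0T", and the header row key "\0SYSTEM\0CATALOG" are all assumptions made up for the illustration, and the actual fix may take a different shape. The sketch only models the idea that a region which does not hold the SYSTEM.CATALOG header row could tolerate a getVersion call from a pre-4.15 client instead of propagating the error.

```java
import java.nio.charset.StandardCharsets;

class GetVersionSketch {
    /** Hypothetical stand-in for a region's key range; an empty key means unbounded. */
    static final class Region {
        final byte[] startKey;
        final byte[] endKey;

        Region(String startKey, String endKey) {
            this.startKey = bytes(startKey);
            this.endKey = bytes(endKey);
        }

        /** True if startKey <= row < endKey, treating empty keys as open ends. */
        boolean contains(byte[] row) {
            boolean afterStart = startKey.length == 0 || compare(row, startKey) >= 0;
            boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
            return afterStart && beforeEnd;
        }
    }

    /** Hypothetical outcomes of a getVersion call on a single region. */
    enum Outcome { VERSION_RETURNED, TOLERATED_FOR_OLD_CLIENT, ERROR_PROPAGATED }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /**
     * Assumed shape of the special condition: a region without the header row
     * does not fail the call when the client is older than 4.15.
     */
    static Outcome getVersion(Region region, byte[] headerRow, int clientMajor, int clientMinor) {
        if (region.contains(headerRow)) {
            return Outcome.VERSION_RETURNED;
        }
        boolean pre415 = clientMajor < 4 || (clientMajor == 4 && clientMinor < 15);
        return pre415 ? Outcome.TOLERATED_FOR_OLD_CLIENT : Outcome.ERROR_PROPAGATED;
    }

    public static void main(String[] args) {
        // Assumed header row key layout: empty tenant id, schema, table, separated by zero bytes.
        byte[] headerRow = bytes("\0SYSTEM\0CATALOG");
        // Two regions after a split at the made-up key "\0T".
        Region[] regions = { new Region("", "\0T"), new Region("\0T", "") };
        // A pre-4.15 client passes null start and end keys, so every region is called.
        for (Region region : regions) {
            System.out.println(getVersion(region, headerRow, 4, 14));
        }
    }
}
```

In this model, a 4.14 client calling both regions gets the version from the region holding the header row, and the other region tolerates the call rather than failing the connection.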