[ 
https://issues.apache.org/jira/browse/PHOENIX-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223399#comment-17223399
 ] 

ASF GitHub Bot commented on PHOENIX-5940:
-----------------------------------------

yanxinyi opened a new pull request #950:
URL: https://github.com/apache/phoenix/pull/950


   …TEM.CATALOG region has split


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region 
> has split
> ------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5940
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.3
>            Reporter: Chinmay Kulkarni
>            Assignee: Xinyi Yan
>            Priority: Blocker
>             Fix For: 4.16.0
>
>
> Steps to repro:
>  # Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default 
> setting for splitting SYSTEM.CATALOG i.e. 
> _phoenix.system.catalog.splittable=true_
>  # Connect with a 4.15+ client and create enough tables/views/indices to 
> cause the SYSTEM.CATALOG region to split (you may want to set the following 
> server-side configs for a quicker repro:
>  ## _hbase.hregion.memstore.flush.size=1048576_ i.e. 1MB (to flush memstores 
> quicker),
>  ## _hbase.hregion.max.filesize=2097152_ i.e. 2MB (so we don've have to load 
> too much data to cause a region split),
>  ## _hbase.zookeeper.property.maxClientCnxns=-1_ (If we don’t set this, the 
> default limit is easily hit when creating hundreds of views),
>  ## _hbase.table.sanity.checks=false_ (otherwise HBase complains that the 
> HRegion max file size config is too small).
>  # With these configs, I've found that creating ~4000 views is sufficient to 
> cause the SYSTEM.CATALOG region to split.
>  # Now connect with any pre-4.15 client like 4.14.3. Getting a connection 
> will fail with the following stack trace:
> {noformat}
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
>  org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 
> 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on 
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
>       at 
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
>       at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl 
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
>       at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
>       at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
>       ... 9 more
>       at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
>       at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
>       ... 13 more
> 20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for 
> row
> java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hbase.DoNotRetryIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 
> (INT15): MetadataEndpointImpl doGetTable called for table not present on 
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
>       at 
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
>       at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl 
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
>       at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
>       at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
>       ... 9 more
>       at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>       at 
> org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
>       at 
> org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
>       at 
> org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
>       at 
> org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
>       at 
> org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
>       at 
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
>       at 
> org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
>       at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>       at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
>       at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
>       at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
>       at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
>       at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
>       at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
>       at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
>       at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
>       at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
>       at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
>       at sqlline.Commands.connect(Commands.java:1064)
>       at sqlline.Commands.connect(Commands.java:996)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>       at sqlline.SqlLine.dispatch(SqlLine.java:809)
>       at sqlline.SqlLine.initArgs(SqlLine.java:588)
>       at sqlline.SqlLine.begin(SqlLine.java:661)
>       at sqlline.SqlLine.start(SqlLine.java:398)
>       at sqlline.SqlLine.main(SqlLine.java:291)
> {noformat}
> RS logs for the region throwing the error:
> {noformat}
> 2020-06-04 19:14:18,655 ERROR 
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] 
> coprocessor.MetaDataEndpointImpl: loading system catalog table inside 
> getVersion failed
> java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable 
> called for table not present on region tableName=SYSTEM.CATALOG
>       at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
>       at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
>       at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> 2020-06-04 19:14:18,656 DEBUG 
> [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer: 
> RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service: 
> ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 
> (INT15): MetadataEndpointImpl doGetTable called for table not present on 
> region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
>       at 
> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
>       at 
> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl 
> doGetTable called for table not present on region tableName=SYSTEM.CATALOG
>       at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
>       at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
>       at 
> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
>       ... 9 more
> {noformat}
> The reason why this happens is that in a pre-4.15 client, inside 
> CQSI.checkClientServerCompatibility, the getVersion method is invoked on 
> MetaDataEndpointImpl over *all SYSTEM.CATALOG regions* (we pass in null for 
> startKey and endKey), see 
> [this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].
> Inside MetaDataEndpointImpl#getVersion, we [call 
> doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224]
>  Now, if SYSTEM.CATALOG has split, this call will also be invoked on a region 
> that does not contain the header row for SYSTEM.CATALOG causing it to fail in 
> MetaDataEndpointImpl#doGetTable 
> [here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].
> This is avoided in 4.15+ clients since we have restricted the getVersion 
> invocation to the region containing the header row for SYSTEM.CATALOG (see 
> [this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
> We need to add a special condition to consider pre-4.15 clients before 
> propagating the error back to clients inside MetaDataEndpointImpl.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to