[ https://issues.apache.org/jira/browse/PHOENIX-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319888#comment-16319888 ]
Flavio Pompermaier edited comment on PHOENIX-4523 at 1/10/18 8:30 AM:
----------------------------------------------------------------------
Hi ~karanmehta93,
you can find the complete stack trace at the end of this comment. Answering your questions:
1. I'm using Phoenix in a Flink (1.3.1) job. There's no problem until the connection is opened (almost) simultaneously across the cluster. After the error the Flink job fails and I find a row in SYSTEM:MUTEX (UPGRADE_MUTEX_UNLOCKED). This does not happen during local tests from the IDE; it happens only when the job runs on a cluster, where multiple hosts try to create that table and put a row into it (IMHO).
2. I don't know an easy way to debug this locally, because I can reproduce it only when I submit the Flink job to the cluster. I think there are two distinct errors/problems here:
# first, there should be no attempt to create the mutex table at all (the attempt happens because getSystemTableNames() doesn't properly handle the case where namespaces are enabled and the migration has already completed);
# second, TableExistsException is not caught within createSysMutexTable() because it is wrapped in an org.apache.hadoop.ipc.RemoteException, and I don't know why the exception is not wrapped that way during local debugging (also connected to a remote cluster..)
3. I'm using 4.13.1-HBase-1.2 on the client side, and the Cloudera parcel (for CDH 5.11.2) on the server side.
4. The error is thrown on a system where the SYSTEM tables have already been migrated (if you refer to the SYSTEM. => SYSTEM: migration).
Thanks,
Flavio
{code:java}
java.lang.IllegalArgumentException: open() failed.org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableExistsException): SYSTEM:MUTEX
	at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.openInputFormat(JDBCInputFormat.java:144)
	at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:115)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableExistsException): SYSTEM:MUTEX
	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2492)
	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2384)
	at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
	at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2384)
	at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
	at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
	at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:270)
	at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.openInputFormat(JDBCInputFormat.java:138)
	... 3 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableExistsException): SYSTEM:MUTEX
	at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.prepareCreate(CreateTableProcedure.java:288)
	at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:109)
	at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:59)
	at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119)
	at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:498)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1061)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:856)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:809)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:75)
	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:495)
{code}

> phoenix.schema.isNamespaceMappingEnabled problem
> ------------------------------------------------
>
>                 Key: PHOENIX-4523
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4523
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.13.1
>            Reporter: Flavio Pompermaier
>            Assignee: Karan Mehta
>
> I'm using the Phoenix 4.13 for CDH 5.11.2 parcel, and enabling schemas made my code unusable.
> I think that this is not a bug of the CDH release, but of all 4.13.x releases.
> I have many parallel Phoenix connections and I always get the following exception:
> {code:java}
> Caused by: java.sql.SQLException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableExistsException): SYSTEM:MUTEX
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2492)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2384)
> 	at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2384)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
> 	at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:664)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:270)
> {code}
> This is caused by the fact that the SYSTEM tables are recreated every time, and this cannot be done simultaneously.
> While debugging the issue I found that in ConnectionQueryServicesImpl.createSysMutexTable() the call to getSystemTableNames() always returns an empty array, so the SYSTEM:MUTEX table is always recreated.
> This is because getSystemTableNames() doesn't consider the case where the system tables have namespaces enabled. Right now that method tries to get all tables starting with *SYSTEM.\**, while it should try to get the list of *SYSTEM:\** tables.
> I hope this gets fixed very soon,
> Flavio

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
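The first problem described above (getSystemTableNames() returning an empty array when namespace mapping is enabled) comes down to matching only the un-mapped `SYSTEM.X` names while the cluster actually has `SYSTEM:X` tables. The sketch below illustrates the filtering logic in isolation; the class and method names are illustrative stand-ins, not the actual Phoenix code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (hypothetical names, not Phoenix internals): a name
// filter for system tables that accepts both the un-mapped form
// ("SYSTEM.CATALOG") and the namespace-mapped form ("SYSTEM:CATALOG").
// Matching only "SYSTEM." would miss every table on a namespace-mapped
// cluster, which is the behavior the comment reports.
public class SystemTableFilter {
    public static List<String> systemTables(List<String> allTables) {
        List<String> result = new ArrayList<>();
        for (String name : allTables) {
            // Accept either separator so the check works before and after
            // the SYSTEM. => SYSTEM: migration.
            if (name.startsWith("SYSTEM.") || name.startsWith("SYSTEM:")) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tables =
                List.of("SYSTEM:CATALOG", "SYSTEM:MUTEX", "MY_SCHEMA.MY_TABLE");
        // Prints [SYSTEM:CATALOG, SYSTEM:MUTEX]
        System.out.println(systemTables(tables));
    }
}
```

With a check like this, a client on an already-migrated cluster would see the existing SYSTEM:MUTEX table and skip the create attempt entirely.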
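The second problem is that a TableExistsException arriving wrapped in an org.apache.hadoop.ipc.RemoteException is not recognized by a `catch (TableExistsException e)` clause, since RemoteException carries the remote class name as text rather than as the original exception type. A defensive check could walk the cause chain and also inspect message text. This is a self-contained sketch under that assumption, using plain JDK exceptions as stand-ins for the Hadoop/HBase types:

```java
// Illustrative sketch, not the actual Phoenix fix: detect a
// TableExistsException even when it is only mentioned by name inside a
// wrapping exception (as RemoteException does), rather than present as a
// typed cause. Plain JDK exceptions are used here as stand-ins so the
// example runs without Hadoop on the classpath.
public class WrappedExceptionCheck {
    static final String TABLE_EXISTS =
            "org.apache.hadoop.hbase.TableExistsException";

    // True if any throwable in the cause chain is a TableExistsException,
    // or names one in its message (the RemoteException case).
    public static boolean isTableExists(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (TABLE_EXISTS.equals(cur.getClass().getName())) {
                return true;
            }
            String msg = cur.getMessage();
            if (msg != null && msg.contains(TABLE_EXISTS)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for RemoteException(TableExistsException): SYSTEM:MUTEX
        Throwable remote = new RuntimeException(TABLE_EXISTS + ": SYSTEM:MUTEX");
        Throwable wrapped = new java.sql.SQLException("open() failed", remote);
        System.out.println(isTableExists(wrapped)); // prints: true
    }
}
```

A check along these lines would let concurrent clients treat "table already exists" as a benign race instead of failing the whole job.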