[
https://issues.apache.org/jira/browse/HIVE-22566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pablo Junge updated HIVE-22566:
-------------------------------
Status: Open (was: Patch Available)
> Drop table involved in materialized view leaves the table in inconsistent
> state
> -------------------------------------------------------------------------------
>
> Key: HIVE-22566
> URL: https://issues.apache.org/jira/browse/HIVE-22566
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.1.0
> Reporter: Pablo Junge
> Assignee: Pablo Junge
> Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22566.02.patch, HIVE-22566.patch
>
>
> If you drop a table that is part of the definition of an existing
> materialized view, the table is not dropped, which is the desired behaviour
> since the materialized view still depends on it.
> However, the "drop" call is still issued against the table and partially
> executes before failing, leaving the table in an inconsistent state.
>
> Repro:
> -------
> 1) Create tables:
>
> {code:java}
> CREATE TABLE emps (empid INT, deptno INT, name VARCHAR(256), salary FLOAT,
>   hire_date TIMESTAMP)
> STORED AS ORC TBLPROPERTIES ('transactional'='true');
>
> CREATE TABLE depts (deptno INT, deptname VARCHAR(256), locationid INT)
> STORED AS ORC TBLPROPERTIES ('transactional'='true');
> {code}
>
> 2) Create the MV:
>
> {code:java}
> CREATE MATERIALIZED VIEW mv1 AS SELECT empid, deptname, hire_date FROM emps
> JOIN depts ON (emps.deptno = depts.deptno) WHERE hire_date >= '2016-01-01';
> {code}
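>
> To confirm that the materialized view exists and has query rewriting enabled,
> something like the following can be run from Beeline (a sketch; the exact
> output columns vary by Hive version):
>
> {code:java}
> SHOW MATERIALIZED VIEWS;
> DESCRIBE FORMATTED mv1;
> {code}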
>
> 3) The backend database contains the following at this point:
>
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
> +--------+-------+-------+----------+-------------------+
> |     81 |    16 |    81 | emps     | MANAGED_TABLE     |
> |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
> |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
>
> 4) Let's drop the 'emps' table:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> drop table emps;
> INFO : Compiling
> command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b):
> drop table emps
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO : Completed compiling
> command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b);
> Time taken: 0.05 seconds
> INFO : Executing
> command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b):
> drop table emps
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing
> command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b);
> Time taken: 10.281 seconds
> INFO : OK
> No rows affected (16.949 seconds)
> {code}
> No error is displayed to the client.
>
> 5) List tables:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> show tables;
> INFO : Compiling
> command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413):
> show tables
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema:
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from
> deserializer)], properties:null)
> INFO : Completed compiling
> command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413);
> Time taken: 0.041 seconds
> INFO : Executing
> command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413):
> show tables
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing
> command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413);
> Time taken: 0.016 seconds
> INFO : OK
> +-----------+
> | tab_name  |
> +-----------+
> | depts     |
> | emps      |
> +-----------+
> 2 rows selected (0.08 seconds)
> {code}
>
> 6) Now, from the backend-db point of view:
>
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
> +--------+-------+-------+----------+-------------------+
> |     81 |    16 |  NULL | emps     | MANAGED_TABLE     |
> |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
> |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
> The emps table is left with a NULL SD_ID, making it unusable.
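>
> A minimal sketch to spot tables left in this state, run against the backend
> database and assuming a healthy MANAGED_TABLE row always carries a non-NULL
> SD_ID:
>
> {code:java}
> -- Managed tables whose storage-descriptor reference was lost by a failed drop.
> SELECT TBL_ID, DB_ID, TBL_NAME, TBL_TYPE
> FROM TBLS
> WHERE TBL_TYPE = 'MANAGED_TABLE'
>   AND SD_ID IS NULL;
> {code}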
>
> 7) From Metastore.log:
>
> {code:java}
> 2019-12-02T20:00:25,545 INFO [pool-6-thread-195]: metastore.HiveMetaStore
> (HiveMetaStore.java:logInfo(907)) - 196: source:172.25.34.150 drop_table :
> tbl=hive.mvs.emps
> 2019-12-02T20:00:25,545 INFO [pool-6-thread-195]: HiveMetaStore.audit
> (HiveMetaStore.java:logAuditEvent(349)) - ugi=hive ip=172.25.34.150
> cmd=source:172.25.34.150 drop_table : tbl=hive.mvs.emps
> 2019-12-02T20:00:25,580 INFO [pool-6-thread-195]:
> metastore.ObjectStore$RetryingExecutor (ObjectStore.java:run(9966)) -
> Attempting to acquire the DB log notification lock: 0 out of 10 retries
> javax.jdo.JDODataStoreException: Error executing SQL query "select
> "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".
> at
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
> ~[datanucleus-api-jdo-4.2.4.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
> ~[datanucleus-api-jdo-4.2.4.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216)
> ~[datanucleus-api-jdo-4.2.4.jar:?]
> at
> org.apache.hadoop.hive.metastore.ObjectStore.lambda$lockForUpdate$0(ObjectStore.java:9936)
> ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:9963)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:9938)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:10002)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) ~[?:?]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at com.sun.proxy.$Proxy28.addNotificationEvent(Unknown Source) [?:?]
> at
> org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:968)
> [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hive.hcatalog.listener.DbNotificationListener.onDropTable(DbNotificationListener.java:198)
> [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:99)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:273)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:335)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2670)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2842)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ~[?:1.8.0_112]
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> ~[?:1.8.0_112]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at com.sun.proxy.$Proxy30.drop_table_with_environment_context(Unknown
> Source) [?:?]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15533)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15517)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at java.security.AccessController.doPrivileged(Native Method)
> [?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> [hadoop-common-3.1.1.3.1.0.0-78.jar:?]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_112]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.sql.BatchUpdateException: Cannot delete or update a parent
> row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT
> "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> at
> com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2058)
> ~[mysql-connector-java.jar:?]
> at
> com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471)
> ~[mysql-connector-java.jar:?]
> at
> com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125)
> ~[HikariCP-2.6.1.jar:?]
> at
> com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
> ~[HikariCP-2.6.1.jar:?]
> at
> org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1855)
> ~[datanucleus-core-4.1.17.jar:?]
> at
> org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.execute(Query.java:1726)
> ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374)
> ~[datanucleus-api-jdo-4.2.4.jar:?]
> ... 37 more
> Caused by:
> com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException:
> Cannot delete or update a parent row: a foreign key constraint fails
> ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY
> ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) ~[?:1.8.0_112]
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> ~[?:1.8.0_112]
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> ~[?:1.8.0_112]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> ~[?:1.8.0_112]
> at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.Util.getInstance(Util.java:386)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
> ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820)
> ~[mysql-connector-java.jar:?]
> at
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2159)
> ~[mysql-connector-java.jar:?]
> at
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2462)
> ~[mysql-connector-java.jar:?]
> at
> com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2010)
> ~[mysql-connector-java.jar:?]
> at
> com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471)
> ~[mysql-connector-java.jar:?]
> at
> com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125)
> ~[HikariCP-2.6.1.jar:?]
> at
> com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
> ~[HikariCP-2.6.1.jar:?]
> at
> org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at
> org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1855)
> ~[datanucleus-core-4.1.17.jar:?]
> at
> org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807)
> ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.execute(Query.java:1726)
> ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374)
> ~[datanucleus-api-jdo-4.2.4.jar:?]
> ... 37 more
> {code}
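>
> The foreign-key violation on MV_TABLES_USED_FK2 appears to explain the partial
> drop: the metastore has already cleared the table's storage-descriptor
> reference by the time the delete of the TBLS row is rejected, because mv1
> still references that row from MV_TABLES_USED. A minimal sketch to list the
> materialized views holding such a reference, assuming the Hive 3 MySQL
> metastore schema (the MV_CREATION_METADATA join columns are assumptions and
> may differ between versions):
>
> {code:java}
> -- Materialized views that still reference a given base table
> -- (TBL_ID 81 = emps in this repro).
> -- MV_TABLES_USED.TBL_ID is the foreign key named in the exception above;
> -- the MV_CREATION_METADATA columns are assumed from the Hive 3 schema.
> SELECT cm.DB_NAME, cm.TBL_NAME AS mv_name, u.TBL_ID
> FROM MV_TABLES_USED u
> JOIN MV_CREATION_METADATA cm
>   ON cm.MV_CREATION_METADATA_ID = u.MV_CREATION_METADATA_ID
> WHERE u.TBL_ID = 81;
> {code}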
>
>
> 8) If you try to query the table:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from emps;
> Error: Error while compiling statement: FAILED: SemanticException Unable to
> fetch table emps. null (state=42000,code=40000)
> {code}
>
> It fails as expected.
>
> 9) If you try to query the MV:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from mv1;
> INFO : Compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:mv1.empid, type:int, comment:null), FieldSchema(name:mv1.deptname, type:varchar(256), comment:null), FieldSchema(name:mv1.hire_date, type:timestamp, comment:null)], properties:null)
> INFO : Completed compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.229 seconds
> INFO : Executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
> INFO : Completed executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.01 seconds
> INFO : OK
> +------------+---------------+----------------+
> | mv1.empid  | mv1.deptname  | mv1.hire_date  |
> +------------+---------------+----------------+
> +------------+---------------+----------------+
> No rows selected (0.276 seconds)
> {code}
> It does not fail: the underlying materialized data has not changed, and the
> materialized view is still shown as valid.
>
> 10) Insert data into the "depts" table and rebuild the MV:
> {code:java}
> $ INSERT INTO TABLE depts VALUES (101,'IT',25);
> $ INSERT INTO TABLE depts VALUES (102,'Eng',11);
> 0: jdbc:hive2://c1122-node2.squadron.support.> ALTER MATERIALIZED VIEW
> mvs.mv1 REBUILD;
> Error: Error while compiling statement: FAILED: SemanticException Unable to
> fetch table emps. null (state=42000,code=40000)
> {code}
> This fails as expected.
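>
> A possible workaround for the repro above is to drop objects in dependency
> order, so that the materialized view is removed before its base table; a
> minimal sketch:
>
> {code:java}
> -- Remove the dependent materialized view first, then the base table.
> DROP MATERIALIZED VIEW mv1;
> DROP TABLE emps;
> {code}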
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)