[ https://issues.apache.org/jira/browse/HIVE-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944297#comment-14944297 ]
Sushanth Sowmyan commented on HIVE-8519:
----------------------------------------
I notice a similar issue when I try to drop a table with about 50000 partitions.
Essentially, what seems to be happening with that flow is the following:
a) Deleting a table requires deleting all partition objects for that table;
Table->Partition is a 1:many mapping.
b) Deleting the partition objects requires deleting all SD (storage descriptor)
objects associated with the partitions; Partition->SD is a 1:1 mapping.
c) Deleting SD objects requires looking up all CDs (column descriptors) pointed
to by the SDs, and wherever a CD has no more SDs pointing to it, we need to
drop the CD in question; SD->CD is a many:1 mapping.
d) If a CD is to be deleted, we need to drop the List<MFieldSchema> associated
with it (COLUMNS_V2 where CD_ID in list of CDs to delete).
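
To make steps (a) through (d) concrete, here is a minimal in-memory sketch in
Python. It is not Hive's ObjectStore code: the ToyMetastore class, its dict
fields, and the cd_reference_checks counter are all hypothetical names made up
for illustration. It only shows that, when no CD is shared, every dropped SD
still pays for one "is this CD still referenced?" lookup.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMetastore:
    """In-memory stand-in for the metastore tables (hypothetical, not Hive's API)."""
    partition_to_sd: dict = field(default_factory=dict)  # Partition -> SD (1:1)
    sd_to_cd: dict = field(default_factory=dict)         # SD -> CD (many:1)
    cd_to_columns: dict = field(default_factory=dict)    # CD -> column list
    cd_reference_checks: int = 0                         # counts step (c) lookups

    def add_partition(self, part_id, sd_id, cd_id, columns):
        self.partition_to_sd[part_id] = sd_id
        self.sd_to_cd[sd_id] = cd_id
        self.cd_to_columns.setdefault(cd_id, list(columns))

    def cd_still_referenced(self, cd_id):
        # Step (c): one extra lookup per dropped SD, scanning the remaining SDs.
        self.cd_reference_checks += 1
        return any(cd == cd_id for cd in self.sd_to_cd.values())

    def drop_table(self):
        # Step (a): every partition of the table has to go.
        for part_id in list(self.partition_to_sd):
            # Step (b): each partition owns exactly one SD.
            sd_id = self.partition_to_sd.pop(part_id)
            cd_id = self.sd_to_cd.pop(sd_id)
            # Steps (c) and (d): drop the CD and its columns only when orphaned.
            if not self.cd_still_referenced(cd_id):
                del self.cd_to_columns[cd_id]


store = ToyMetastore()
for i in range(1000):
    # Mirrors the "no reuse in practice" case: every SD gets its own CD.
    store.add_partition(f"p{i}", f"sd{i}", f"cd{i}", ["col_a", "col_b"])
store.drop_table()
print(store.cd_reference_checks)  # one check per dropped partition
```

In the real metastore each of those checks is a datastore query rather than a
dict scan, which is where the cost for a table with tens of thousands of
partitions would come from.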
The big inefficiency here is that SD->CD is a many:1 mapping, with the goal of
reusing CDs for efficiency, but in practice we don't reuse them. Still, because
the mapping is many:1 and not 1:1, we need to do that additional check before
dropping rather than simply dropping. This combination gives us the worst of
both: we pay for the check without getting any benefit from reuse.
We need to rethink the way we use our objects and either drop the many:1 intent
or actually make sure that we create a unique CD for every SD, or this is not
going to scale. There may also be other solutions that bypass this wonky model,
which we would have to work out.
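
As a rough illustration of the "unique CD for every SD" option, the sketch
below assumes a strict 1:1 SD->CD mapping, so the drop path needs no reference
check at all. The function name and the plain-dict layout are hypothetical and
only model the idea; they are not a proposal for the actual ObjectStore change.

```python
def drop_table_one_to_one(partition_to_sd, sd_to_cd, cd_to_columns):
    """Drop every partition of a table when each SD owns exactly one CD.

    Hypothetical helper over plain dicts; it mutates its arguments and
    returns the CD ids it removed.
    """
    dropped_cds = []
    for part_id in list(partition_to_sd):
        sd_id = partition_to_sd.pop(part_id)  # Partition -> SD is 1:1
        cd_id = sd_to_cd.pop(sd_id)           # SD -> CD is assumed 1:1 here
        dropped_cds.append(cd_id)             # no "still referenced?" lookup
    # The column lists can then be removed in one batch keyed by the CD ids.
    for cd_id in dropped_cds:
        del cd_to_columns[cd_id]
    return dropped_cds


partitions = {f"p{i}": f"sd{i}" for i in range(3)}
sds = {f"sd{i}": f"cd{i}" for i in range(3)}
columns = {f"cd{i}": ["col_a", "col_b"] for i in range(3)}
print(drop_table_one_to_one(partitions, sds, columns))  # ['cd0', 'cd1', 'cd2']
```

The trade-off is that this only stays correct if nothing ever shares a CD; as
soon as two SDs point at the same CD, the second delete would fail.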
> Hive metastore lock wait timeout
> --------------------------------
>
> Key: HIVE-8519
> URL: https://issues.apache.org/jira/browse/HIVE-8519
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.10.0
> Reporter: Liao, Xiaoge
>
> We got a lot of exceptions like the one below when dropping table partitions,
> which made Hive queries very, very slow. For example, executing "use db_test;"
> took 250s.
> Log:
> 2014-10-17 04:04:46,873 ERROR Datastore.Persist (Log4JLogger.java:error(115))
> - Update of object
> "org.apache.hadoop.hive.metastore.model.MStorageDescriptor@13c9c4b3" using
> statement "UPDATE `SDS` SET `CD_ID`=? WHERE `SD_ID`=?" failed :
> java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1074)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4096)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4028)
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2734)
> at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
> at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2458)
> at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2375)
> at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2359)
> at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:399)
> at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:439)
> at org.datanucleus.store.rdbms.request.UpdateRequest.execute(UpdateRequest.java:374)
> at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.updateTable(RDBMSPersistenceHandler.java:417)
> at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.updateObject(RDBMSPersistenceHandler.java:390)
> at org.datanucleus.state.JDOStateManager.flush(JDOStateManager.java:5012)
> at org.datanucleus.FlushOrdered.execute(FlushOrdered.java:106)
> at org.datanucleus.ExecutionContextImpl.flushInternal(ExecutionContextImpl.java:4019)
> at org.datanucleus.ExecutionContextThreadedImpl.flushInternal(ExecutionContextThreadedImpl.java:450)
> at org.datanucleus.store.query.Query.prepareDatastore(Query.java:1575)
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1760)
> at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:243)
> at org.apache.hadoop.hive.metastore.ObjectStore.listStorageDescriptorsWithCD(ObjectStore.java:2185)
> at org.apache.hadoop.hive.metastore.ObjectStore.removeUnusedColumnDescriptor(ObjectStore.java:2131)
> at org.apache.hadoop.hive.metastore.ObjectStore.preDropStorageDescriptor(ObjectStore.java:2162)
> at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionCommon(ObjectStore.java:1361)
> at org.apache.hadoop.hive.metastore.ObjectStore.dropPartition(ObjectStore.java:1301)
> at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
> at $Proxy4.dropPartition(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_partition_common(HiveMetaStore.java:1865)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_partition(HiveMetaStore.java:1911)
> at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at $Proxy5.drop_partition(Unknown Source)