[jira] [Commented] (HIVE-18885) Cascaded alter table + notifications = disaster

Alexander Kolbasov (JIRA) Wed, 07 Mar 2018 22:51:36 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-18885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390835#comment-16390835
 ]


Alexander Kolbasov commented on HIVE-18885:
-------------------------------------------

[~anishek] You are actually right: the event is added after the loop. I guess 
what is happening is that some operation combines more than one 
event-generating action. The first one obtains the lock, which is not released 
immediately because the transaction isn't closed, and the subsequent one 
executes with the lock held. In this particular case the thread holding the 
lock was the one executing alter_table_with_cascade, but it could be that it 
was just the lucky one that managed to get the lock.
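
To make the guess above concrete, here is a toy model of it. This is only a 
sketch under assumptions, not Hive code: the class NotificationLockSketch and 
its methods are hypothetical, and a ReentrantLock stands in for the row lock on 
NOTIFICATION_SEQUENCE. It shows the first event-generating action taking the 
lock, the second one running with the lock still held, and another writer 
being able to proceed only after the transaction commits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: none of these names are Hive APIs.
class NotificationLockSketch {
    // Stand-in (assumption) for the row lock on the NOTIFICATION_SEQUENCE table.
    private final ReentrantLock sequenceRowLock = new ReentrantLock();
    private final List<String> events = new ArrayList<>();
    private long nextEventId = 1;

    // An event-generating action inside the open transaction. The first call
    // takes the lock; later calls in the same transaction run with it held.
    void addNotificationEvent(String description) {
        if (!sequenceRowLock.isHeldByCurrentThread()) {
            sequenceRowLock.lock();
        }
        events.add(nextEventId++ + ":" + description);
    }

    // The lock is released only when the transaction ends.
    void commitTransaction() {
        if (sequenceRowLock.isHeldByCurrentThread()) {
            sequenceRowLock.unlock();
        }
    }

    // Asks a second thread (another DDL writer) to try the lock without waiting.
    boolean otherWriterCanProceed() {
        final boolean[] acquired = {false};
        Thread other = new Thread(() -> {
            if (sequenceRowLock.tryLock()) {
                acquired[0] = true;
                sequenceRowLock.unlock();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    List<String> events() {
        return events;
    }

    public static void main(String[] args) {
        NotificationLockSketch tx = new NotificationLockSketch();
        tx.addNotificationEvent("first action");
        System.out.println("after first event:  " + tx.otherWriterCanProceed());
        tx.addNotificationEvent("second action");
        System.out.println("after second event: " + tx.otherWriterCanProceed());
        tx.commitTransaction();
        System.out.println("after commit:       " + tx.otherWriterCanProceed());
    }
}
```

In this model the other writer is refused after both the first and the second 
event and gets through only after commitTransaction(), which matches the 
behaviour described above. A real database lock would make the other writer 
wait rather than fail immediately; tryLock() is used here only to keep the 
sketch short.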

> Cascaded alter table + notifications = disaster
> -----------------------------------------------
>
>                 Key: HIVE-18885
>                 URL: https://issues.apache.org/jira/browse/HIVE-18885
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Metastore
>    Affects Versions: 3.0.0
>            Reporter: Alexander Kolbasov
>            Priority: Major
>
> You can see the problem from looking at the code, but it actually created 
> severe problems for a real-life Hive user.
> When {{alter table}} has the {{cascade}} option, it does the following:
> {code:java}
>           msdb.openTransaction();
>           ...
>           List<Partition> parts = msdb.getPartitions(dbname, name, -1);
>           for (Partition part : parts) {
>             List<FieldSchema> oldCols = part.getSd().getCols();
>             part.getSd().setCols(newt.getSd().getCols());
>             String oldPartName =
>                 Warehouse.makePartName(oldt.getPartitionKeys(), part.getValues());
>             updatePartColumnStatsForAlterColumns(msdb, part, oldPartName,
>                 part.getValues(), oldCols, part);
>             msdb.alterPartition(dbname, name, part.getValues(), part);
>           }
>  {code}
> So it walks all partitions (and this may be a huge list) and does some 
> non-trivial operations in one single uber-transaction.
> When DbNotificationListener is enabled, it adds an event for each partition, 
> all while holding a row lock on the NOTIFICATION_SEQUENCE table. As a result, 
> while this is happening, no other write DDL can proceed. This can sometimes 
> cause DB lock timeouts, which cause HMS-level operation retries, which make 
> things even worse.
> In one particular case this pretty much made HMS unusable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
