[
https://issues.apache.org/jira/browse/HIVE-18885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389796#comment-16389796
]
Vihang Karajgaonkar commented on HIVE-18885:
--------------------------------------------
The current notification sequence id design has very strict constraints that
slow down the entire HMS. Restrictions like requiring ids to increase
continuously, without any holes, essentially serialize all transactions across
multiple instances. We should change the design to allow holes in the sequence
ids. To do that, it should be guaranteed that a notification is generated only
when a transaction is committed, instead of optimistically grabbing a
notification id number while holding the lock. The notification clients need to
know the commit order; as long as they can figure that out from monotonically
increasing sequence ids, I think it should be fine. Can we use something like
the code below to make sure we generate notification ids only when the
transaction is guaranteed to be committed?
{code}
PersistenceManager pm = pmf.getPersistenceManager();
Transaction tx = pm.currentTransaction();
try
{
    tx.begin();
    tx.setSynchronization(new javax.transaction.Synchronization()
    {
        public void beforeCompletion()
        {
            // nothing to do before the commit
        }

        public void afterCompletion(int status)
        {
            if (status == javax.transaction.Status.STATUS_COMMITTED)
            {
                // generate notification sequence id
                // we can also use database sequence generators, which are
                // much more efficient
            }
        }
    });
    tx.commit();
}
finally
{
    if (tx.isActive())
    {
        tx.rollback();
    }
}
pm.close();
{code}
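On the client side, tolerating holes mostly means not assuming that the next id is always the previous id plus one. Here is a minimal sketch of that idea, assuming events reach the client in commit order with strictly increasing ids. The class {{GapTolerantCursor}} and its methods are hypothetical names for illustration, not part of the HMS API:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side cursor over notification ids; illustration only. */
final class GapTolerantCursor {
    // Highest notification id this client has processed so far.
    private long lastSeenId;

    GapTolerantCursor(long startAfterId) {
        this.lastSeenId = startAfterId;
    }

    /**
     * Takes a batch of ids (assumed to be in commit order), returns the ones
     * that are newer than the cursor, and advances the cursor past them.
     */
    List<Long> consume(List<Long> batchInCommitOrder) {
        List<Long> accepted = new ArrayList<>();
        for (long id : batchInCommitOrder) {
            if (id <= lastSeenId) {
                continue; // already processed, e.g. a redelivered event
            }
            // Deliberately no check for id == lastSeenId + 1:
            // holes in the sequence are allowed.
            accepted.add(id);
            lastSeenId = id;
        }
        return accepted;
    }

    long lastSeenId() {
        return lastSeenId;
    }
}
```

With a cursor starting at 10, a batch of ids 9, 10, 12, 15, 15, 20 would yield 12, 15 and 20 and leave the cursor at 20. The only thing the client relies on is ordering, not contiguity.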
> Cascaded alter table + notifications = disaster
> -----------------------------------------------
>
> Key: HIVE-18885
> URL: https://issues.apache.org/jira/browse/HIVE-18885
> Project: Hive
> Issue Type: Bug
> Components: Hive, Metastore
> Affects Versions: 3.0.0
> Reporter: Alexander Kolbasov
> Priority: Major
>
> You can see the problem by looking at the code, but it actually created
> severe problems for a real-life Hive user.
> When {{alter table}} has the {{cascade}} option, it does the following:
> {code:java}
> msdb.openTransaction();
> ...
> List<Partition> parts = msdb.getPartitions(dbname, name, -1);
> for (Partition part : parts) {
>   List<FieldSchema> oldCols = part.getSd().getCols();
>   part.getSd().setCols(newt.getSd().getCols());
>   String oldPartName =
>       Warehouse.makePartName(oldt.getPartitionKeys(), part.getValues());
>   updatePartColumnStatsForAlterColumns(msdb, part, oldPartName,
>       part.getValues(), oldCols, part);
>   msdb.alterPartition(dbname, name, part.getValues(), part);
> }
> {code}
> So it walks all partitions (and this may be a huge list) and does some
> non-trivial operations in one single uber-transaction.
> When DbNotificationListener is enabled, it adds an event for each partition,
> all while holding a row lock on the NOTIFICATION_SEQUENCE table. As a result,
> while this is happening no other write DDL can proceed. This can sometimes
> cause DB lock timeouts, which cause HMS-level operation retries, which make
> things even worse.
> In one particular case this pretty much made HMS unusable.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)