[
https://issues.apache.org/jira/browse/HIVE-18940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399869#comment-16399869
]
Vihang Karajgaonkar commented on HIVE-18940:
--------------------------------------------
I think part of the issue is that the event id generation mechanism is done
using a different table which always has 1 row {{NOTIFICATION_SEQUENCE}} Why
can't we use auto-increment directly on the {{NOTIFICATION_LOG}} table for
event id? I know derby doesn't support auto-increment but I think most
production clusters will be not using derby. Designing a feature just to make
it work for derby does not seem to be a good idea. If derby doesn't support
auto-increments we should treat it as an exception and handle that in the code
separately. We should also separate event ID and commit id for an event.
Currently, the event id is strictly tied with the actual commit time which is
why we have to hold the lock at the generation time until the transaction
commits which in theory could take a long time. Also, the timing of generating
the event id and actual commit is non-obvious in the code. So it is easy to
miss that while writing the code.
I think it would be great to use something like auto-increment to just uniquely
identify a notification log message. The actual commit id should be generated
from a global monotonically increasing number at the actual commit time. This
number should apply to all the events pertaining to that transaction. A
transaction which alters 1000 partitions should not have 1000 different ids
because they were not committed one by one in 1000 transactions. They were all
committed as one transaction and hence ideally should only generate one commit
id. This would greatly help with the lock durations for long transactions
because now commit lock is held for a constant time irrespective of how long
that transaction ran.
> Hive notifications serialize all write DDL operations
> -----------------------------------------------------
>
> Key: HIVE-18940
> URL: https://issues.apache.org/jira/browse/HIVE-18940
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0
> Reporter: Alexander Kolbasov
> Priority: Major
>
> The implementation of DbNotificationListener uses a single row to store
> current notification ID and uses {{SELECT FOR UPDATE}} to lock the row. This
> serializes all write DDL operations which isn't good.
> We should consider using database auto-increment for notification ID instead.
> Especially on mMySQL/innoDb it is supported natively with relatively
> light-weight locking.
> This creates potential issue for consumers though because such IDs may have
> holes. There are two types of holes - transient hole for a transaction which
> have not committed yet and will be committed shortly and permanent holes for
> transactions that fail. Consumers need to deal with it. It may be useful to
> add DB-generated timestamp as well to assist in recovery from holes.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)