[
https://issues.apache.org/jira/browse/SENTRY-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138970#comment-16138970
]
Sergio Peña commented on SENTRY-1895:
-------------------------------------
I don't see the difference either. It seems SENTRY_PATH_CHANGE does not
persist the ID anymore; SENTRY_HMS_NOTIFICATION_ID now keeps all the seen
IDs (SENTRY_PATH_CHANGE used to have only the processed IDs).
For upgrades, we're thinking of keeping the NOTIFICATION_HASH. We would have the
same upgrade issue on SENTRY_PATH_CHANGE if we wanted to remove it, right? Btw,
SENTRY-1885 is asking whether we're safe to remove this ID from
SENTRY_PATH_CHANGE.
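
To make the idea concrete, here is a rough sketch (not the actual Sentry code) of how tracking every seen notification by an (ID, hash) pair could tell apart different events that share an ID while still rejecting a true repeat. It assumes the hash is what distinguishes such events; the names notification_hash, SeenNotifications, mark_seen and last_seen_id are hypothetical, and the real NOTIFICATION_HASH computation and table layout may differ.

```python
import hashlib


def notification_hash(event_type: str, payload: str) -> str:
    """Hypothetical content hash; the real NOTIFICATION_HASH may be computed differently."""
    return hashlib.sha1(f"{event_type}:{payload}".encode("utf-8")).hexdigest()


class SeenNotifications:
    """In-memory stand-in for a table that records every seen notification."""

    def __init__(self):
        # Each entry is an (id, hash) pair, playing the role of a composite unique key.
        self._seen = set()

    def mark_seen(self, notification_id: int, digest: str) -> bool:
        """Record the pair; return False if this exact pair was already recorded."""
        key = (notification_id, digest)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    def last_seen_id(self):
        """Largest ID recorded so far, or None if nothing has been recorded."""
        return max((nid for nid, _ in self._seen), default=None)


tracker = SeenNotifications()
create = notification_hash("CREATE_TABLE", "db1.t1")
drop = notification_hash("DROP_TABLE", "db1.t2")

tracker.mark_seen(5, create)  # first event with ID 5 is accepted
tracker.mark_seen(5, drop)    # different event, same ID: also accepted
tracker.mark_seen(5, create)  # exact repeat: rejected
```

In a database the set would presumably be a unique constraint over both columns, so a second Sentry instance inserting the same pair would fail instead of reprocessing the event.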
> Sentry should handle the case of multiple notifications with the same ID
> ------------------------------------------------------------------------
>
> Key: SENTRY-1895
> URL: https://issues.apache.org/jira/browse/SENTRY-1895
> Project: Sentry
> Issue Type: Sub-task
> Components: Sentry
> Affects Versions: 2.0.0
> Reporter: Alexander Kolbasov
> Assignee: Sergio Peña
> Fix For: 2.0.0
>
>
> As shown in HIVE-16886, notification IDs generated by Hive may be non-unique
> and there may be cases with different events sharing the same ID. This creates
> various problems for Sentry/Hive interaction and we should find some
> short-term solution until it is fixed in Hive.
> The issue was addressed in SENTRY-1803 by removing the primary-key constraint
> on the notification ID, which allows duplicate IDs. But this creates other
> problems:
> 1. We are using the primary key constraint to prevent multiple instances of
> Sentry from processing the same notifications multiple times.
> 2. We are using max(notificationId) to find the last processed event. When
> the field is a primary key, this operation is an index scan, but when it
> isn't, it is a full table scan which is more expensive.
> We also have a few other problems caused by duplicate IDs which are not
> related and not addressed by SENTRY-1803:
> 1. There is a synchronization mechanism between HMS and Sentry which ensures
> that a given event is processed. This doesn't work in the presence of
> duplicate IDs.
> 2. Some events may be missed due to the way they are processed.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)