[
https://issues.apache.org/jira/browse/SENTRY-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138870#comment-16138870
]
Sergio Peña commented on SENTRY-1888:
-------------------------------------
I chatted offline with [~akolb] about different ideas to handle this situation,
such as requesting events in a time window and ignoring or re-applying already
seen notifications, and other ideas that are mentioned in HIVE-16886. This is
not an easy fix, and every solution has its drawbacks.
The proposed solution is to request notifications starting again from the last
ID seen. This way we can fetch the duplicates committed so far and apply them
on Sentry. We still risk missing duplicates that were committed much later, but
we cannot trust those duplicates anyway, because we cannot know the order in
which they were committed. For instance:
- Sentry fetches 1,2 and applies them
- Sentry fetches 1,2,2,3,4,5 (new IDs are 2,4,5). In what order were they
committed? 2,4,5 or 4,2,5 or 4,5,2?
The proposal is to trust only that requesting the last ID seen again will
bring duplicated events in order. For instance:
- Sentry fetches 1,2 and applies them
- Sentry requests last ID seen again (2)
- Sentry fetches 2,2,3,4,5 (new IDs are 2,4,5). We trust that the new 2 is in
order with the last seen ID = 2
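
To make the proposal concrete, here is a minimal Python sketch of the idea. It
is not Sentry code: `Event`, `Follower`, and `poll` are hypothetical names, the
notification log is modeled as a plain list in commit order, and it assumes two
notifications with the same ID can be told apart by their content.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: int
    payload: str  # hypothetical stand-in for the notification content


@dataclass
class Follower:
    """Hypothetical consumer that re-requests the last seen ID on every poll."""
    last_seen_id: int = 0
    seen_at_last_id: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def poll(self, log):
        # Request IDs >= last_seen_id (not > last_seen_id), so duplicates of
        # the last seen ID that were committed late are fetched as well.
        fetched = sorted(
            (e for e in log if e.event_id >= self.last_seen_id),
            key=lambda e: e.event_id,  # stable sort keeps commit order per ID
        )
        for e in fetched:
            if e.event_id == self.last_seen_id and e.payload in self.seen_at_last_id:
                continue  # already applied during an earlier poll
            if e.event_id > self.last_seen_id:
                # Moving to a newer ID: only its duplicates are tracked now.
                self.last_seen_id = e.event_id
                self.seen_at_last_id = set()
            self.seen_at_last_id.add(e.payload)
            self.applied.append(e)
```

As noted above, a duplicate whose ID is lower than the last seen ID is still
missed by this approach; the sketch only recovers duplicates of the last seen
ID itself.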
[~akolb] does it sound good?
> Sentry might not fetch all HMS duplicated events IDs when requested
> -------------------------------------------------------------------
>
> Key: SENTRY-1888
> URL: https://issues.apache.org/jira/browse/SENTRY-1888
> Project: Sentry
> Issue Type: Bug
> Components: Sentry
> Affects Versions: 2.0.0
> Reporter: Sergio Peña
> Assignee: Sergio Peña
>
> HMS does not guarantee that each notification has a unique ID. SENTRY-1803
> solved the issue of Sentry handling those duplicated event IDs. However,
> HMS notifications with duplicated event IDs could appear late on the HMS
> side due to delay issues on the DB (especially in HMS HA mode). These events
> might never be fetched by Sentry if it already processed an event with the
> same ID before.
> Example:
> 1. HMS 1 attempts to persist event ID = 1
> 2. HMS 2 attempts to persist event ID = 1
> 3. HMS 1 commits event ID = 1
> 4. Sentry fetches notifications >= 1 (bringing the event from HMS 1)
> 5. HMS 2 commits event ID = 1
> 6. Sentry fetches notifications >= 2 (no events are fetched)
> HMS 2 event ID = 1 is never fetched nor processed by Sentry.
> The above scenario could cause Sentry to be out-of-sync because of these
> events that were not committed on time.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)