[ https://issues.apache.org/jira/browse/SENTRY-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636163#comment-16636163 ]
Na Li commented on SENTRY-2300: ------------------------------- [~kkalyan] for your comment A) Let’s also understand the limitations with the current approach. These are the issues i see. A.1) There would 0.5 sec delay in fetching the notifications. [Lina] For Sentry HA, it could be 1 second per operation. For customer is done batch operation with thousands or even millions of operation, the operation will {color:#d04437}fail{color} with such delay. A.2) Missing permissions updates for the HMS changes that happened while Sentry is fetching full snapshot.[As explained in the comment above] [Lina] This is functional bug. When it takes hours to get full snapshot, losing permission update is a serious issue. It is {color:#d04437}NOT trival{color} issues at all. We should fix it. B) Here are some of the issues that i see with the approach proposed in SENTRY-2300 B.1) When there is a failure in sentry server while processing an event, event is lost for ever. Failures could be any internal failures. This is the not case currently. If there is a failure in persisting the notification, it is not lost it will be fetches in the subsequent attempt. [Lina] This is not true. With current approach, if processing the notification event fails, the event is lost, same as the new approach I am taking. see https://github.com/apache/sentry/blob/master/sentry-service/sentry-service-server/src/main/java/org/apache/sentry/provider/db/service/persistent/HMSFollower.java#L493 {code} public void processNotifications(Collection<NotificationEvent> events) throws Exception { boolean isNotificationProcessed; if (events.isEmpty()) { return; } for (NotificationEvent event : events) { isNotificationProcessed = false; try { // Only the leader should process the notifications if (!isLeader()) { LOGGER.debug("Not processing notifications since not a leader"); return; } isNotificationProcessed = notificationProcessor.processNotificationEvent(event); } catch (Exception e) { if (e.getCause() instanceof JDODataStoreException) { LOGGER.info("Received JDO Storage Exception, Could be because of processing " + "duplicate notification"); if (event.getEventId() <= sentryStore.getLastProcessedNotificationID()) { // Rest of the notifications need not be processed. LOGGER.error("Received event with Id: {} which is smaller then the ID " + "persisted in store", event.getEventId()); break; } } else { LOGGER.error("Processing the notification with ID:{} failed with exception {}", event.getEventId(), e); } } if (!isNotificationProcessed) { try { // Update the notification ID in the persistent store even when the notification is <-- if the processing of event fails, the notification ID is still saved and not fetched again // not processed as the content in in the notification is not valid. // Continue processing the next notification. LOGGER.debug("Explicitly Persisting Notification ID = {} ", event.getEventId()); sentryStore.persistLastProcessedNotificationID(event.getEventId()); } catch (Exception failure) { LOGGER.error("Received exception while persisting the notification ID = {}", event.getEventId()); throw failure; } } // Wake up any HMS waiters that are waiting for this ID. wakeUpWaitingClientsForSync(event.getEventId()); } } {code} B.2) We still need to depend on HMS notifications to update the Full snapshot with the HMS changes that happened while fetching the snapshot. [Lina] With the new approach, the slow fetching of HMS notifications no longer causes timeout for DDL operations. DDL operation can finish regardless how sentry gets HMS changes. This is a clean solution and a big improvement! With this fix, we don't need short-term changes like trigger fetching notification event at DDL operation, or not block DDL operation when full snapshot is on going. If HDFS sync is disabled, Sentry does not need to fetch full snapshot or delta change at all. [~spena] [~amishra] Can you add comment? > Move Permission Update due to DDL to HMS Post Event Listener > ------------------------------------------------------------ > > Key: SENTRY-2300 > URL: https://issues.apache.org/jira/browse/SENTRY-2300 > Project: Sentry > Issue Type: Bug > Components: Sentry > Affects Versions: 2.1.0, 2.2.0 > Reporter: Na Li > Assignee: Na Li > Priority: Major > Attachments: SENTRY-2300.001.patch, SENTRY-2300.002.patch > > > There was a code in MetastorePlugin that modified Sentry privileges on table > Create/Drop and database Create/Drop. As part of Sentry HA work we moved all > this logic from Sentry plugin to be driven by notifications which required > the extra synchronization between HMS and Sentry. > It should be possible to do permission changes in the post event listener > itself to avoid blocking for Sentry. This requires some experiments though > because it may cause strange artifacts since at the time these DDL operations > are done Sentry may not be aware of the current state - for example you may > try to change permissions of a table that Sentry doesn’t know about, which > seems to be OK. > This update will have the following benefits: > {code} > * HMS waits on Sentry polling HMS update takes 0.5 to 1 second. This update > will remove this delay > * Sentry knows every DDL update, and therefore can update permission > correctly. In current approach using notification processing, Sentry could > miss updates if full snapshot is fetched from HMS, and permission is not > updated correctly. In the case of table rename, when mission DDL update event > because of full snapshot, sentry will not move the permissions associated > with old table to the new table. And the authorization on queries on the > renamed table will fail. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)