[ 
https://issues.apache.org/jira/browse/SENTRY-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15600043#comment-15600043
 ] 

Vadim Spector commented on SENTRY-1508:
---------------------------------------

I think it is a legitimate assumption that whenever there is a class taking 
time to initialize, the client of such class may want it to happen 
asynchronously, to keep the main thread alive. The question is - on who's side 
it is implemented.

The same could be done in the client code by simply creating MetastorePlugin on 
a separate thread and allowing callbacks only after creation is complete, at 
which point complete initialization is guaranteed. This approach has many 
advantages, one of them being a very simple state model - once you can get 
reference to MetastorePlugin, it is fully functional.

> MetastorePlugin.java does not handle properly initialization failure
> --------------------------------------------------------------------
>
>                 Key: SENTRY-1508
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1508
>             Project: Sentry
>          Issue Type: Bug
>            Reporter: Vadim Spector
>            Priority: Critical
>         Attachments: SENTRY-1508.001.patch
>
>
> MetastorePlugin.java constructor initializes successfully regardless of 
> whether initThread.run() was successful. E.g. if 
> cacheInitializer.createInitialUpdate() cal inside the constructor fails (in 
> one case in the field it was due to an invalid table uri in HMS database, 
> whose parsing produced ArrayIndexOutOfBoundsException), initComplete is still 
> assigned to true and MetastorePlugin gets constructed. While initError is 
> assigned to non-null value in case of failure, it is effectively ignored as 
> will be described later.
> MetastorePlugin implements several callbacks to be called on the Hive side, 
> to update Sentry state. It is critical functionality for enforcing access 
> control. Its callbacks (addPath(), removeAllPaths(), etc) all end up calling 
> notifySentryAndApplyLocal() which starts with the following code:
> if (initComplete) {
>     processUpdate()
> }...
> which means that processUpdate() is called even when MetastorePlugin 
> initialization has failed (i.e. initError != null). There is an option of 
> asynchronous initialization, when MetastorePlugin is constructed and returned 
> to the caller before initialization is complete, but the code makes no 
> distinction between the two cases, as will be shown below.
> processUpdate() called regardless of whether MetastorePlugin a) has been 
> initialized (sync mode, success), or b) still being initialized (async mode), 
> or c) failed to initialized (either mode). It calls (unconditionally) 
> applyLocal() and notifySentry(). aplyLocal() calls authzPaths.updatePartial() 
> which immediately fails with NullPointerException if authzPaths == null which 
> happens if initThread within the constructor fails. 
> SENTRY-1270 avoids NullPointerException by execuitng this before 
> dereferencing authzPaths inside applyLocal():
> if(authzPaths == null) {
> LOGGER.error(initializationFailureMsg);
> return;
> }
> However, this leads to hard-to-diagnose behavior, because a) local updates 
> fail forever, and b) housekeeping thread running SyncTask to synchronize with 
> Sentry-side state fails as well, printing endlessly misleading "#### 
> Metastore Plugin cache has not finished initialization." message suggesting 
> its temporary condition as opposed to a permanent failure. Failure to sync up 
> with Sentry state can lead to very subtle, often intermittent, problems with 
> acls. MetastorePlugin and its caller never get fully aware of the permanent 
> nature of init failure.
> Suggestion:
> a) define 3 states in MetastorePlugin: initializing, initialized, init_failed.
> 2) communicate those states to the caller clearly, by _immediately_ throwing 
> some kind of exception from public callback APIs in the cases preventing 
> those callback from completing successfully:
>     a) for state == initializing, throw some kind of 
> InitializationInProgressException. This only applies for asynchronous 
> initialization configuration. It is unknown whether it is actually deployed 
> anywhere, but the code exists and there is no reason why not to support it. 
> Async init makes sense as an optimization, as long as 
> initialization-in-progress condition is clearly communicated.
>    b) for state == init_failed throw InitializationFailedException of some 
> sort. For usability, it should contain the cause - the original exception 
> originating either from the constructor or from the initThread.run() started 
> from the constructor (in sync or async modes alike).
> Although in the sync mode initialization failure could be detected easily by 
> constructor failure, we may want to preserve the current design, in which 
> constructor always succeeds, so the client code does not need to be aware of 
> whether init mode is configured as sync vs async.
> Additional cleanup:
> 1. getClient() is always dereferenced, while it can actually return null 
> (exception is swallowed and error is logged). Suggested fix: getClient() may 
> still print error message, but it should also re-throw an original exception 
> - it is much easier to debug than dealing with inevitable and uninformative 
> NullPointerException up the caller stack. General rule - error messages are 
> to help with diagnostic, not to replace throwing exception, which is the only 
> way to make sure error conditions are properly propagated to the caller.
> 2. Each code fork deserves log message: e.g. addPath() retrurns immediately 
> if PathsUpdate.parsePath() returns null - w/o printing any log.
> 3. in SyncTask.run() - if notificationLock.tryLock() succeeds, yet the next 
> line's expression "MetastorePlugin.this.authzPaths == null" is evaluated to 
> false, run() exits w/o a chance to call notificationLock.unlock(). All the 
> code following lock.tryLock() should be inside try-catch-finally, with 
> finally's first line being lock.unlock(), as a general safe pattern.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to