[ https://issues.apache.org/jira/browse/SENTRY-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15600043#comment-15600043 ]
Vadim Spector commented on SENTRY-1508: --------------------------------------- I think it is a legitimate assumption that whenever there is a class taking time to initialize, the client of such class may want it to happen asynchronously, to keep the main thread alive. The question is - on who's side it is implemented. The same could be done in the client code by simply creating MetastorePlugin on a separate thread and allowing callbacks only after creation is complete, at which point complete initialization is guaranteed. This approach has many advantages, one of them being a very simple state model - once you can get reference to MetastorePlugin, it is fully functional. > MetastorePlugin.java does not handle properly initialization failure > -------------------------------------------------------------------- > > Key: SENTRY-1508 > URL: https://issues.apache.org/jira/browse/SENTRY-1508 > Project: Sentry > Issue Type: Bug > Reporter: Vadim Spector > Priority: Critical > Attachments: SENTRY-1508.001.patch > > > MetastorePlugin.java constructor initializes successfully regardless of > whether initThread.run() was successful. E.g. if > cacheInitializer.createInitialUpdate() cal inside the constructor fails (in > one case in the field it was due to an invalid table uri in HMS database, > whose parsing produced ArrayIndexOutOfBoundsException), initComplete is still > assigned to true and MetastorePlugin gets constructed. While initError is > assigned to non-null value in case of failure, it is effectively ignored as > will be described later. > MetastorePlugin implements several callbacks to be called on the Hive side, > to update Sentry state. It is critical functionality for enforcing access > control. Its callbacks (addPath(), removeAllPaths(), etc) all end up calling > notifySentryAndApplyLocal() which starts with the following code: > if (initComplete) { > processUpdate() > }... > which means that processUpdate() is called even when MetastorePlugin > initialization has failed (i.e. initError != null). There is an option of > asynchronous initialization, when MetastorePlugin is constructed and returned > to the caller before initialization is complete, but the code makes no > distinction between the two cases, as will be shown below. > processUpdate() called regardless of whether MetastorePlugin a) has been > initialized (sync mode, success), or b) still being initialized (async mode), > or c) failed to initialized (either mode). It calls (unconditionally) > applyLocal() and notifySentry(). aplyLocal() calls authzPaths.updatePartial() > which immediately fails with NullPointerException if authzPaths == null which > happens if initThread within the constructor fails. > SENTRY-1270 avoids NullPointerException by execuitng this before > dereferencing authzPaths inside applyLocal(): > if(authzPaths == null) { > LOGGER.error(initializationFailureMsg); > return; > } > However, this leads to hard-to-diagnose behavior, because a) local updates > fail forever, and b) housekeeping thread running SyncTask to synchronize with > Sentry-side state fails as well, printing endlessly misleading "#### > Metastore Plugin cache has not finished initialization." message suggesting > its temporary condition as opposed to a permanent failure. Failure to sync up > with Sentry state can lead to very subtle, often intermittent, problems with > acls. MetastorePlugin and its caller never get fully aware of the permanent > nature of init failure. > Suggestion: > a) define 3 states in MetastorePlugin: initializing, initialized, init_failed. > 2) communicate those states to the caller clearly, by _immediately_ throwing > some kind of exception from public callback APIs in the cases preventing > those callback from completing successfully: > a) for state == initializing, throw some kind of > InitializationInProgressException. This only applies for asynchronous > initialization configuration. It is unknown whether it is actually deployed > anywhere, but the code exists and there is no reason why not to support it. > Async init makes sense as an optimization, as long as > initialization-in-progress condition is clearly communicated. > b) for state == init_failed throw InitializationFailedException of some > sort. For usability, it should contain the cause - the original exception > originating either from the constructor or from the initThread.run() started > from the constructor (in sync or async modes alike). > Although in the sync mode initialization failure could be detected easily by > constructor failure, we may want to preserve the current design, in which > constructor always succeeds, so the client code does not need to be aware of > whether init mode is configured as sync vs async. > Additional cleanup: > 1. getClient() is always dereferenced, while it can actually return null > (exception is swallowed and error is logged). Suggested fix: getClient() may > still print error message, but it should also re-throw an original exception > - it is much easier to debug than dealing with inevitable and uninformative > NullPointerException up the caller stack. General rule - error messages are > to help with diagnostic, not to replace throwing exception, which is the only > way to make sure error conditions are properly propagated to the caller. > 2. Each code fork deserves log message: e.g. addPath() retrurns immediately > if PathsUpdate.parsePath() returns null - w/o printing any log. > 3. in SyncTask.run() - if notificationLock.tryLock() succeeds, yet the next > line's expression "MetastorePlugin.this.authzPaths == null" is evaluated to > false, run() exits w/o a chance to call notificationLock.unlock(). All the > code following lock.tryLock() should be inside try-catch-finally, with > finally's first line being lock.unlock(), as a general safe pattern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)