[ https://issues.apache.org/jira/browse/SENTRY-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vadim Spector updated SENTRY-1508: ---------------------------------- Attachment: SENTRY-1508.010.patch > MetastorePlugin.java does not handle properly initialization failure > -------------------------------------------------------------------- > > Key: SENTRY-1508 > URL: https://issues.apache.org/jira/browse/SENTRY-1508 > Project: Sentry > Issue Type: Bug > Reporter: Vadim Spector > Assignee: Vadim Spector > Priority: Critical > Attachments: SENTRY-1508.001.patch, SENTRY-1508.002.patch, > SENTRY-1508.003.patch, SENTRY-1508.004.patch, SENTRY-1508.005.patch, > SENTRY-1508.006.patch, SENTRY-1508.007.patch, SENTRY-1508.008.patch, > SENTRY-1508.009.patch, SENTRY-1508.010.patch > > > MetastorePlugin.java has several implementation issues that need to be fixed > by this JIRA: > a) The state of MetastorePlugin is not communicated cleanly. The constructor > initializes successfully regardless of whether HMS cache initialization was > successful or not. The initThread.run() can easily fail when, say, > cacheInitializer.createInitialUpdate() fails due to misconfigured HMS, so it > is quite possible. The state is communicated through 3 variables: boolean > initComplete, Throwable initError, and UpdateableAuthzPaths authzPaths. > initComplete is always set to true from finally {} block, so it really > indicates the end of an _attempt_ to initialize the cache. It, thus, should > only be used to detect whether the cache initialization is still in-progress, > not whether initialization has been successful. To determine whether cache > initialization was successful, initComplete should be used together with > initError, which is set only when initialization fails. This is not how the > code implements it in more than one place, which should be clear from > analyzing a patch. > b) There are several synchronization issues that may lead to the failure of > synchronizing with the Sentry. They usually have to do with inconsistent use > of monitor objects to guard access to thread-unsafe objects. E.g. > authzPaths.updatePartial() uses a lock created just for the scope of a single > call, which makes it useless for synchronization. A single read-write lock > should be used instead, as there is one read and one write operation > performed on authzPaths within MetastorePlugin. > c) Failure to create authzPaths is not communicated clearly to a caller. When > it is dereferenced from the public callback APIs, it results in > NullPointerException. The suggested fix of this issue in SENTRY-1270 avoids > NullPointerException by detecting the null authzPaths, printing error > message, and returning. This effectively leads to consistent failure to > update local cache without notifying the caller. The suggested solution would > be not only to throw IllegalArgumentException to the caller instead, but to > also specify why exactly authzPaths == null - due to initialization still > being in progress or due to its failure. > d) The housekeeping thread running SyncTask to synchronize with Sentry state > fails when authzPaths == null. However, it keeps printing misleading "#### > Metastore Plugin cache has not finished initialization." message even in the > case of a permanent cache initialization failure. It needs to print the real > cause of this condition, by analyzing authzPaths together with initComplete > and initError values. > e) getClient() may return null, in which case dereferencing it causes not so > helpful NullPointerException. Instead, while getClient() may still print > error message, it should also re-throw an original exception, which would > then be much easier to debug. > f) Each code fork deserves log message: e.g. addPath() retrurns immediately > if PathsUpdate.parsePath() returns null - w/o printing any log. > g) in SyncTask.run() - if notificationLock.tryLock() succeeds, yet the next > line's expression "MetastorePlugin.this.authzPaths == null" is evaluated to > false, run() exits w/o a chance to call notificationLock.unlock(). All the > code following lock.tryLock() should be inside try-catch-finally, with > finally's first line being lock.unlock(), as a general safe pattern. > h) some additional misc synchronization issues may also be addressed by the > patch. > What's not in the scope: > a) the initial patch honors the original design supporting asynchronous > synchronization. It means that MetastorePlugin is always constructed, even if > it's in a broken state. It is up to the public callback APIs to properly > inform the caller of the broken state. > b) however, if the reviewers decide that support of asynchronous > initialization is not necessary, it may be preferable to drop such support in > the same JIRA. In this case, second patch can be generated to only support > synchronous initialization. This would lead to a much simpler and robust > implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)