[ https://issues.apache.org/jira/browse/RANGER-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824400#comment-17824400 ]
Vikas Kumar commented on RANGER-4506: ------------------------------------- *Update:* [~jianchun] , thanks for your review. I analysed the code considering the points you raised, following is my analysis: let's take the example of getkeyVersion() method that you pointed: *Step1:* Thread1 (let's call it {*}T1{*}) enters the method with input *zonekey1@1* *Step2:* T1 acquires the ReadLock and checks *_dbStore.engineContainsAlias(versionName);_* a get/read operation on ConcurrentHashMap (keyEntries) *Step3:* zonekey1@1 not found in map and calls the engineLoad. This method does DB operation, reads all keys and updates the underlying keyEntries map. Please note, it creates a temp map while iterating through the all keys and at last assigns the new map to keyEntries reference ( that is marked volatile). *Step4:* Now next is, *engineGetDecryptedZoneKeyByte,* it again gets the key from the same map. Now if another thread *T2 for zonekey2@1* acquired Read lock and in-process of updating the keyEntries map while *T1* is reading that map, then *T1* may get different view. _But *T2* is simply refreshing the map. This refresh will not change anything for T1, zonekey1@1 and their corresponding value will be changed only through create/delete/rollover API. And these API methods are protected with the corresponding write lock. So keyentry for T1 will be same even if another thread (T2) updates the "keyEntries" map reference._ However, inconsistencies may occur with multiple KMS instances. Like assume above explained operations are executing on KMS-1 and KMS-2 got request to delete that key in between and got processed. This write lock (this is not a distributed lock) will not stop another instance from deleting the key. In that case, at step4 at KMS-1, T1 thread inside *engineGetDecryptedZoneKeyByte() will get null as key and thus NPE.* *The request will fail, and it should fail.* As per my understanding, purpose of introducing this lock is to protect/safeguard *DB read/write* operations, not the in-memory map. [~jianchun] , Please go through my analysis and let me know if I missed anything. Tagging [~madhan] ( author of this change) for his review. Thanks. > Illegal read lock usage in getMetadata/getKeyVersion > ---------------------------------------------------- > > Key: RANGER-4506 > URL: https://issues.apache.org/jira/browse/RANGER-4506 > Project: Ranger > Issue Type: Bug > Components: kms > Reporter: Jianchun Xu > Assignee: Vikas Kumar > Priority: Major > > RangerKeyStoreProvider illegally writes to key store under Read lock. This > happens in both getMetadata and getKeyVersion. > E.g. in following getKeyVersion, under Read lock, the code calls > `dbStore.engineLoad(null, masterKey)` which reloads all the keys. Since > multiple threads can hold the Read lock, multiple threads can enter and > reload all the keys. Thus the 2nd `dbStore.engineContainsAlias(versionName)` > test and following `dbStore.engineGetDecryptedZoneKeyByte(versionName)` can > both get wrong result if another thread is reloading keys. > [https://github.com/apache/ranger/blob/master/kms/src/main/java/org/apache/hadoop/crypto/key/RangerKeyStoreProvider.java#L331] > {code:java} > @Override > public KeyVersion getKeyVersion(String versionName) throws IOException { > if (logger.isDebugEnabled()) { > logger.debug("==> getKeyVersion({})", versionName); > } > KeyVersion ret = null; > try (AutoClosableReadLock ignored = new AutoClosableReadLock(lock)) { > if (keyVaultEnabled) { > try { > boolean versionNameExists = > dbStore.engineContainsAlias(versionName); > if (!versionNameExists) { > dbStore.engineLoad(null, masterKey); > versionNameExists = > dbStore.engineContainsAlias(versionName); > } > if (versionNameExists) { > byte[] decryptKeyByte; > try { > decryptKeyByte = > dbStore.engineGetDecryptedZoneKeyByte(versionName); > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)