[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2024-01-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-12667:
--
Target Version/s: 3.5.0  (was: 3.4.0)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
>  {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
>   // Stripped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List<ReadWriteLock> lockArray =
>       new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> It hashes the key name into 16 buckets.
> In the code chunk below,
>  {code:title=ValueQueue.java|borderStyle=solid}
>   public List<E> getAtMost(String keyName, int num) throws IOException,
>       ExecutionException {
>     ...
>     ...
>     readLock(keyName);
>     E val = keyQueue.poll();
>     readUnlock(keyName);
>     ...
>   }
>   private void submitRefillTask(final String keyName,
>       final Queue<E> keyQueue) throws InterruptedException {
>     ...
>     ...
>       // It holds the write lock while the key is being asynchronously
>       // fetched, so the read requests for all the keys that hash to this
>       // bucket will essentially be blocked.
>       writeLock(keyName);
>       try {
>         if (keyQueue.size() < threshold && !isCanceled()) {
>           refiller.fillQueueForKey(name, keyQueue,
>               cacheSize - keyQueue.size());
>         }
>         ...
>       } finally {
>         writeUnlock(keyName);
>       }
>     }
>   }
> {code}
> According to the above code chunk, if two keys (let's say key1 and key2) hash 
> to the same bucket (one of the 16 buckets), then while key1 is being 
> asynchronously refetched, all getKey calls for key2 will be blocked.
> 2. Due to the striped rw locks, the asynchronous refill of keys is now 
> effectively synchronous with respect to other handler threads.
> I understand that the locks were added so that we don't kick off multiple 
> asynchronous refill threads for the same key.
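
The collision described above can be reproduced with a small standalone sketch. This is not the Hadoop code: the class name, the `bucket` helper (low four bits of `String#hashCode`), and the one-character key names are assumptions chosen for illustration, and a latch stands in for the slow KMS fetch. It only shows that a reader of one key times out while a refill for a different key in the same bucket holds the write lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class StripedLockDemo {
  static final int LOCK_ARRAY_SIZE = 16;
  static final List<ReadWriteLock> LOCK_ARRAY = new ArrayList<>(LOCK_ARRAY_SIZE);

  static {
    for (int i = 0; i < LOCK_ARRAY_SIZE; i++) {
      LOCK_ARRAY.add(new ReentrantReadWriteLock());
    }
  }

  // Assumption: the bucket is the low 4 bits of the key name's hash code.
  static int bucket(String keyName) {
    return keyName.hashCode() & (LOCK_ARRAY_SIZE - 1);
  }

  // Holds the write lock of refillKey's bucket (simulating a refill that is
  // still fetching), then reports whether a reader of readKey can take its
  // read lock within timeoutMs.
  static boolean readerProceeds(String refillKey, String readKey, long timeoutMs) {
    CountDownLatch writeLockHeld = new CountDownLatch(1);
    CountDownLatch fetchDone = new CountDownLatch(1);
    Thread refill = new Thread(() -> {
      Lock w = LOCK_ARRAY.get(bucket(refillKey)).writeLock();
      w.lock();
      try {
        writeLockHeld.countDown();
        fetchDone.await(); // stands in for the slow round trip to the KMS
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        w.unlock();
      }
    });
    refill.start();
    try {
      writeLockHeld.await();
      Lock r = LOCK_ARRAY.get(bucket(readKey)).readLock();
      boolean acquired = r.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
      if (acquired) {
        r.unlock();
      }
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      fetchDone.countDown(); // let the simulated refill finish
      try {
        refill.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    // "a" and "q" share a bucket under the assumed hash; "b" does not.
    System.out.println("buckets: a=" + bucket("a") + " q=" + bucket("q")
        + " b=" + bucket("b"));
    System.out.println("read q during refill of a: "
        + readerProceeds("a", "q", 200));
    System.out.println("read b during refill of a: "
        + readerProceeds("a", "b", 200));
  }
}
```

With these assumed key names, the read of `q` is blocked for the whole duration of the simulated refill of `a`, while the read of `b` goes through immediately because it maps to a different lock.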



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2020-04-11 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-12667:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues; please move this back if it 
is a blocker.



> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh Shah
>Assignee: Rushabh Shah
>Priority: Major
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Patch Available  (was: Open)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Attachment: HDFS-12667-002.patch

bq. IMO we should throw as an IOE to tell the caller "failed to drain".
Throwing an IOE does no good either, since none of the callers 
({{KMSClientProvider#drain}} and 
{{EagerKeyGeneratorKeyProviderCryptoExtension.CryptoExtension#drain}}) can 
throw anything back.
{{ValueQueue#drain}} will throw {{ExecutionException}} only if it's not able to 
load values from {{CacheLoader}}, so IMO it's safe to ignore the exception.
On the other hand, I am thinking of removing {{CacheLoader}} in the first place.

bq. A little confused by this comment, {{drainAgain}} is protected by the 
{{PolicyBasedQueue}} object lock right? Could you elaborate?
Fixed in #2 patch.

bq. Is this supposed to be called outside of the class?
Removed in #2 patch.

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch, HDFS-12667-002.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-31 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Open  (was: Patch Available)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-28 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Open  (was: Patch Available)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-28 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Patch Available  (was: Open)

Submitting again; hopefully Jenkins will run on the last patch.

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-26 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Patch Available  (was: Open)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-26 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Attachment: HDFS-12667-001.patch

Re-attaching the same patch for Jenkins to pick up.

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-12667-001.patch
>
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-26 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Open  (was: Patch Available)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-26 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Attachment: (was: HDFS-12667-001.patch)

> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>



[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Attachment: HDFS-12667-001.patch

Attaching a preliminary patch.
It is by no means a committable patch.







[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-25 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-12667) KMSClientProvider#ValueQueue does synchronous fetch of edeks in background async thread.

2017-10-16 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-12667:
--
Description: 
There are a couple of issues in KMSClientProvider#ValueQueue.
1.
 {code:title=ValueQueue.java|borderStyle=solid}
  private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
  // Stripped rwlocks based on key name to synchronize the queue from
  // the sync'ed rw-thread and the background async refill thread.
  private final List<ReadWriteLock> lockArray =
      new ArrayList<>(LOCK_ARRAY_SIZE);
{code}
The key name is hashed into one of 16 buckets (one lock per bucket).
In the code chunk below,

 {code:title=ValueQueue.java|borderStyle=solid}
  public List<E> getAtMost(String keyName, int num) throws IOException,
      ExecutionException {
    ...
    ...
    readLock(keyName);
    E val = keyQueue.poll();
    readUnlock(keyName);
    ...
  }

  private void submitRefillTask(final String keyName,
      final Queue<E> keyQueue) throws InterruptedException {
      ...
      ...
      // It holds the write lock while the key is being asynchronously
      // fetched. So the read requests for all the keys that hash to this
      // bucket will essentially be blocked.
      writeLock(keyName);
      try {
        if (keyQueue.size() < threshold && !isCanceled()) {
          refiller.fillQueueForKey(name, keyQueue,
              cacheSize - keyQueue.size());
        }
        ...
      } finally {
        writeUnlock(keyName);
      }
    }
  }
{code}
According to the above code chunk, if two keys (let's say key1 and key2) hash to
the same bucket (between 1 and 16), then while key1 is being asynchronously
refetched, all read requests for key2 will be blocked.

2. Due to the striped rw locks, the asynchronous refill of keys is now
effectively synchronous with respect to the other handler threads.

I understand that the locks were added so that we don't kick off multiple
asynchronous refill threads for the same key.
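
To make the collision in point 1 concrete, here is a minimal, self-contained sketch. It is not the ValueQueue code itself: the class name StripedLockDemo, the helper names, and the bucket function (hash code masked down to 16 buckets) are assumptions made for illustration only. A thread holds one stripe's write lock as a stand-in for a slow refill, and a reader of a different key that lands in the same stripe cannot take the read lock until the writer lets go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StripedLockDemo {
  // Same stripe count as mentioned in the description; everything else
  // in this class is hypothetical.
  static final int LOCK_ARRAY_SIZE = 16;

  private final List<ReadWriteLock> lockArray =
      new ArrayList<>(LOCK_ARRAY_SIZE);

  public StripedLockDemo() {
    for (int i = 0; i < LOCK_ARRAY_SIZE; i++) {
      lockArray.add(new ReentrantReadWriteLock());
    }
  }

  // Assumed bucket function: keep the low 4 bits of the hash code.
  public static int bucket(String keyName) {
    return keyName.hashCode() & (LOCK_ARRAY_SIZE - 1);
  }

  private ReadWriteLock lockFor(String keyName) {
    return lockArray.get(bucket(keyName));
  }

  // Returns true if a reader of readerKey could not take its read lock
  // within 100 ms while writerKey's stripe was write-locked.
  public boolean readerBlockedDuringRefill(String writerKey, String readerKey)
      throws InterruptedException {
    CountDownLatch lockHeld = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread refill = new Thread(() -> {
      ReadWriteLock lock = lockFor(writerKey);
      lock.writeLock().lock();
      try {
        lockHeld.countDown();
        release.await(); // stands in for the slow fetch of new values
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.writeLock().unlock();
      }
    });
    refill.start();
    lockHeld.await(); // make sure the write lock is really held
    boolean acquired =
        lockFor(readerKey).readLock().tryLock(100, TimeUnit.MILLISECONDS);
    if (acquired) {
      lockFor(readerKey).readLock().unlock();
    }
    release.countDown();
    refill.join();
    return !acquired;
  }

  public static void main(String[] args) throws InterruptedException {
    StripedLockDemo demo = new StripedLockDemo();
    for (String reader : new String[] {"keyA", "key2"}) {
      System.out.println("refill of key1 (bucket " + bucket("key1")
          + ") blocks reader of " + reader + " (bucket " + bucket(reader)
          + "): " + demo.readerBlockedDuringRefill("key1", reader));
    }
  }
}
```

With this assumed bucket function, "key1" and "keyA" land in the same stripe while "key1" and "key2" do not, so only the reader of "keyA" is held up. The longer the write lock is held around the fetch, the longer unrelated keys in that stripe wait.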


  was:
There are a couple of issues in KMSClientProvider#ValueQueue.
1.
 {code:title=ValueQueue.java|borderStyle=solid}
  private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
  // Stripped rwlocks based on key name to synchronize the queue from
  // the sync'ed rw-thread and the background async refill thread.
  private final List<ReadWriteLock> lockArray =
      new ArrayList<>(LOCK_ARRAY_SIZE);
{code}
The key name is hashed into one of 16 buckets (one lock per bucket).
In the code chunk below,

 {code:title=ValueQueue.java|borderStyle=solid}
  public List<E> getAtMost(String keyName, int num) throws IOException,
      ExecutionException {
    ...
    ...
    readLock(keyName);
    E val = keyQueue.poll();
    readUnlock(keyName);
    ...
  }

  private void submitRefillTask(final String keyName,
      final Queue<E> keyQueue) throws InterruptedException {
      ...
      ...
      // It holds the write lock while the key is being asynchronously
      // fetched. So the read requests for all the keys that hash to this
      // bucket will essentially be blocked.
      writeLock(keyName);
      try {
        if (keyQueue.size() < threshold && !isCanceled()) {
          refiller.fillQueueForKey(name, keyQueue,
              cacheSize - keyQueue.size());
        }
        ...
      } finally {
        writeUnlock(keyName);
      }
    }
  }
{code}
According to the above code chunk, if two keys (let's say key1 and key2) hash to
the same bucket (between 1 and 16), then while key1 is being asynchronously
refetched, all read requests for key2 will be blocked.

2. Due to the striped rw locks, the asynchronous refill of keys is now
effectively synchronous with respect to the other handler threads.



> KMSClientProvider#ValueQueue does synchronous fetch of edeks in background 
> async thread.
> 
>
> Key: HDFS-12667
> URL: https://issues.apache.org/jira/browse/HDFS-12667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, kms
>Affects Versions: 3.0.0-alpha4
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>
> There are a couple of issues in KMSClientProvider#ValueQueue.
> 1.
>  {code:title=ValueQueue.java|borderStyle=solid}
>   private final LoadingCache<String, LinkedBlockingQueue<E>> keyQueues;
>   // Stripped rwlocks based on key name to synchronize the queue from
>   // the sync'ed rw-thread and the background async refill thread.
>   private final List<ReadWriteLock> lockArray =
>       new ArrayList<>(LOCK_ARRAY_SIZE);
> {code}
> The key name is hashed into one of 16 buckets (one lock per bucket).
> In the code chunk below,
>  {code:title=ValueQueue.java|borderStyle=solid}
>   public List<E> getAtMost(String keyName, int num) throws IOException,
>       ExecutionException {
>     ...
>     ...