[
https://issues.apache.org/jira/browse/HADOOP-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097334#comment-15097334
]
Xiao Chen commented on HADOOP-12699:
------------------------------------
Thanks Andrew and Zhe for the ideas. I agree that if this is a valid scenario,
we should definitely fix it.
I've spent some time understanding the architecture and poking around. I'm
pretty sure the problem is in the server-side cache ([code
here|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/EagerKeyGeneratorKeyProviderCryptoExtension.java#L77]),
because disabling the async filling there (while leaving the client-side cache
untouched) prevents the issue from happening. Specifically,
{{fillQueueForKey}}, running on the async thread, can execute at any time, so
it can refill the queue with old values after the queue has been drained.
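To make the suspected interleaving concrete, here is a minimal, self-contained sketch of it. It does not use the real KMS classes: {{currentVersion}}, {{queue}} and {{reproduce}} are hypothetical stand-ins, and the latches only force the ordering that I believe happens by chance in the test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class StaleRefillSketch {
    // Hypothetical stand-ins for the server-side cache state (not the KMS classes).
    static final AtomicInteger currentVersion = new AtomicInteger(0);
    static final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Lets one async fill straddle a rollover and returns what the cache serves next. */
    public static String reproduce() throws Exception {
        currentVersion.set(0);
        queue.clear();
        CountDownLatch generated = new CountDownLatch(1);
        CountDownLatch drained = new CountDownLatch(1);
        ExecutorService filler = Executors.newSingleThreadExecutor();

        filler.submit(() -> {
            // The async fill generates a value against the version it sees now.
            String value = "k6@" + currentVersion.get();
            generated.countDown();
            try {
                // Simulate the fill thread being scheduled late.
                drained.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // By now the key has rolled, but the old value still gets enqueued.
            queue.add(value);
        });

        generated.await();
        currentVersion.incrementAndGet(); // stands in for rollNewVersion("k6")
        queue.clear();                    // stands in for draining the cache
        drained.countDown();

        filler.shutdown();
        filler.awaitTermination(5, TimeUnit.SECONDS);
        return queue.poll();
    }

    public static void main(String[] args) throws Exception {
        // Prints "after rollover, cache serves: k6@0"
        System.out.println("after rollover, cache serves: " + reproduce());
    }
}
```

In this toy model the value served after the rollover is still {{k6@0}}, which matches the assertion message in the test failure.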
As to the fix, I haven't come up with an elegant way of solving it. :( I tried
to 1) cancel all running threads for the key when draining, and 2) detect the
async filling and discard the old keys if the key has changed. But even with
both changes in place, the same failure is still easily reproduced. I'm not
sure I'm heading in the right direction, and would love to hear some advice.
Also including the original authors [~tucu00] and [~asuresh] for help. Thanks a
lot!
I've attached my patch attempting to fix this, to further support my
explanation.
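For reference, idea 2 in isolation could look roughly like the sketch below. This is not the attached patch: the class and all of its methods are hypothetical, and it assumes the version check and the enqueue can share one lock with the rollover. It only shows the intended behaviour in a toy model and says nothing about why the real cache still fails with both changes in place.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class VersionCheckedQueueSketch {
    private final Object lock = new Object();
    private final Queue<String> queue = new ArrayDeque<>();
    private int currentVersion = 0;

    /** Snapshot a filler takes before it starts generating values. */
    public int snapshotVersion() {
        synchronized (lock) {
            return currentVersion;
        }
    }

    /** Enqueues the value only if no rollover happened since the snapshot. */
    public boolean offerIfCurrent(int versionAtGeneration, String value) {
        synchronized (lock) {
            if (versionAtGeneration != currentVersion) {
                return false; // key changed in the meantime: discard the old value
            }
            queue.add(value);
            return true;
        }
    }

    /** Rolls the key and drains the queue as one atomic step. */
    public void rollAndDrain() {
        synchronized (lock) {
            currentVersion++;
            queue.clear();
        }
    }

    public String poll() {
        synchronized (lock) {
            return queue.poll();
        }
    }

    public static void main(String[] args) {
        VersionCheckedQueueSketch cache = new VersionCheckedQueueSketch();

        // A filler generates a value, then a rollover happens before it enqueues.
        int seen = cache.snapshotVersion();
        String stale = "k6@" + seen;
        cache.rollAndDrain();
        System.out.println("stale accepted: " + cache.offerIfCurrent(seen, stale));

        // A fill that starts after the rollover is accepted.
        int fresh = cache.snapshotVersion();
        cache.offerIfCurrent(fresh, "k6@" + fresh);
        System.out.println("served: " + cache.poll());
    }
}
```

The check and the enqueue sit under the same lock as the rollover on purpose: if they were separate steps, a rollover could still slip in between them and the old value would be enqueued anyway.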
> TestKMS#testKMSProvider intermittently fails during 'test rollover draining'
> ----------------------------------------------------------------------------
>
> Key: HADOOP-12699
> URL: https://issues.apache.org/jira/browse/HADOOP-12699
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Attachments: HADOOP-12699.01.patch, HADOOP-12699.02.patch,
> HADOOP-12699.03.patch, HADOOP-12699.04.patch, HADOOP-12699.repro.2,
> HADOOP-12699.repro.patch
>
>
> I've seen several failures of testKMSProvider, all of which failed in the
> following snippet:
> {code}
> // test rollover draining
> KeyProviderCryptoExtension kpce = KeyProviderCryptoExtension.
>     createKeyProviderCryptoExtension(kp);
> .....
> EncryptedKeyVersion ekv1 = kpce.generateEncryptedKey("k6");
> kpce.rollNewVersion("k6");
> EncryptedKeyVersion ekv2 = kpce.generateEncryptedKey("k6");
> Assert.assertNotEquals(ekv1.getEncryptionKeyVersionName(),
>     ekv2.getEncryptionKeyVersionName());
> {code}
> with error message
> {quote}Values should be different. Actual: k6@0{quote}