[
https://issues.apache.org/jira/browse/HDFS-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849472#comment-17849472
]
ASF GitHub Bot commented on HDFS-13603:
---------------------------------------
yzhang559 commented on code in PR #6774:
URL: https://github.com/apache/hadoop/pull/6774#discussion_r1614786507
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirEncryptionZoneOp.java:
##########
@@ -580,15 +583,15 @@ public void run() {
final int logCoolDown = 10000; // periodically print error log (if any)
int sinceLastLog = logCoolDown; // always print the first failure
boolean success = false;
+ int retryCount = 0;
IOException lastSeenIOE = null;
long warmUpEDEKStartTime = monotonicNow();
- while (true) {
+
+ while (!success && retryCount < maxRetries) {
try {
kp.warmUpEncryptedKeys(keyNames);
- NameNode.LOG
- .info("Successfully warmed up {} EDEKs.", keyNames.length);
+ NameNode.LOG.info("Successfully warmed up {} EDEKs.",
keyNames.length);
success = true;
- break;
} catch (IOException ioe) {
lastSeenIOE = ioe;
if (sinceLastLog >= logCoolDown) {
Review Comment:
Good catch; removing them, since the logs are now bounded.
> Warmup NameNode EDEK thread retries continuously if there's an invalid key
> ---------------------------------------------------------------------------
>
> Key: HDFS-13603
> URL: https://issues.apache.org/jira/browse/HDFS-13603
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: encryption, namenode
> Affects Versions: 2.8.0
> Reporter: Antony Jay
> Priority: Major
> Labels: pull-request-available
>
> https://issues.apache.org/jira/browse/HDFS-9405 adds a background thread to
> pre-warm the EDEK cache.
> However, this thread fails and retries continuously if key retrieval fails for
> even one encryption zone. In our use case, we have temporarily removed keys for
> certain encryption zones. Currently the NameNode and KMS logs are filled with
> errors from the background thread retrying the warm-up forever.
> The pre-warm thread should:
> * Continue to refresh other encryption zones even if it fails for one
> * Retry only if it fails for all encryption zones, which will be the
> case when the KMS is down.
>
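For illustration only, the behaviour requested in the issue description could be sketched as below. This is not the code in PR #6774 and not the Hadoop API: `KeyWarmer`, `warmUpAll`, `maxRetries`, and `retryIntervalMs` are hypothetical names chosen for the sketch. It assumes each key can be warmed up independently, and it retries (a bounded number of times) only when every key fails, which is taken as a sign that the KMS itself is unreachable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class EdekWarmupSketch {

  /** Hypothetical stand-in for the key provider; not the Hadoop API. */
  interface KeyWarmer {
    void warmUp(String keyName) throws IOException;
  }

  /**
   * Tries to warm up every key once per round. If at least one key succeeds,
   * the KMS is assumed reachable and no retry happens. If all keys fail, the
   * round is repeated, up to maxRetries rounds in total.
   *
   * @return the keys that failed in the last round that was run
   */
  static List<String> warmUpAll(KeyWarmer warmer, List<String> keyNames,
      int maxRetries, long retryIntervalMs) throws InterruptedException {
    List<String> failed = new ArrayList<>(keyNames);
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      failed = new ArrayList<>();
      for (String key : keyNames) {
        try {
          warmer.warmUp(key);
        } catch (IOException ioe) {
          // One bad key must not stop the remaining keys from warming up.
          failed.add(key);
        }
      }
      if (failed.size() < keyNames.size() || keyNames.isEmpty()) {
        // Partial or full success: report the bad keys, do not retry.
        return failed;
      }
      if (attempt + 1 < maxRetries) {
        Thread.sleep(retryIntervalMs); // every key failed; wait and retry
      }
    }
    return failed;
  }
}
```

With this shape, a single invalid key costs one failed call per warm-up run instead of an endless retry loop, and a full KMS outage is still bounded by `maxRetries` rounds.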