vikaskr22 commented on code in PR #6001:
URL: https://github.com/apache/hadoop/pull/6001#discussion_r1317243402
##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java:
##########
@@ -520,24 +525,32 @@ protected int incrementDelegationTokenSeqNum() {
// The secret manager will keep a local range of seq num which won't be
// seen by peers, so only when the range is exhausted it will ask zk for
// another range again
- if (currentSeqNum >= currentMaxSeqNum) {
- try {
- // after a successful batch request, we can get the range starting
point
- currentSeqNum = incrSharedCount(delTokSeqCounter, seqNumBatchSize);
- currentMaxSeqNum = currentSeqNum + seqNumBatchSize;
- LOG.info("Fetched new range of seq num, from {} to {} ",
- currentSeqNum+1, currentMaxSeqNum);
- } catch (InterruptedException e) {
- // The ExpirationThread is just finishing.. so dont do anything..
- LOG.debug(
- "Thread interrupted while performing token counter increment", e);
- Thread.currentThread().interrupt();
- } catch (Exception e) {
- throw new RuntimeException("Could not increment shared counter !!", e);
+ try{
+ this.currentSeqNumLock.lock();
Review Comment:
> because the znode could be updated by different processes, such as
DFSRouter,
is it possible that other processes like DFSRouter can write on these ZNodes
? As far as I know, each process defines their separate znode path, like in KMS
we define kms root node path and only KMS instances can write on the specified
path. Please correct me if my understanding is wrong.
On race condition, that I mentioned earlier, let me reiterate one more time,
consider the following example without any lock:
1. Thread1 has executed the L-532 // currentSeqNum = 10,
currentMaxSeqNum=11
2. Now Thread2 is about to execute the L-531 and it executes it before
Thread1 finishes this method. //currentSeqNum = 11, currentMaxSeqNum=12
3. Now Thread1 returns the incremented value, that is 12
4. Then Thread2 returns the incremented value , that is 13. // this is out
of range.
In this case, the returned value would be out of range. I mean, it would be
greater than currentMaxSeqNum, that would create functional issues. because
same sequenceNumber might be reserved by other instances.
I hope I am able to clarify the scenario
@Hexiaoqiao , please confirm if this race condition is possible or not? If
Yes, then following solution could be considered:
1. Make currentMaxSeqNum, currentSeqNumLock AtomicInteger // as suggested by
you.
2. Now, instead of simple lock,, use ReentrantReadWriteLock
currentSeqNumLock;
3. Now in incrementDelegationTokenSeqNum() method, following pattern could
be used:
currentSeqNumLock.readLock().lock();
if (currentSeqNum.get() >= currentMaxSeqNum.get()) {
currentSeqNumLock.readLock().unlock(); // release the read lock
currentSeqNumLock.writeLock().lock();
// Double check the condition one more time:
if (currentSeqNum.get() >= currentMaxSeqNum.get()) {
// Now increment the seqNum on ZK and update the variables.
}
}
finally unlock the acquired locks.
here, all threads will be able to acquire readLock and will not block each
other. AtomicInteger will ensure atomicity and visibility issue. And write lock
will address the race condition and it will be required only when it needs to
interact with ZK.
Above is a simple pseudo code, please review this approach and share your
feedback.
And thanks for your review efforts.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]