Are you using a PromotedToLock? Combined with a reasonable retry it should make 
failures almost never happen. You can also just set the number of retries to a 
huge number.
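
To make the "keep retrying" idea concrete, here is a minimal, self-contained sketch. It does not use Curator or ZooKeeper: the class RetryUntilSuccess, its methods, and the in-memory AtomicLong are all hypothetical stand-ins, with tryIncrementOnce() playing the role of a single increment() call whose result may report succeeded() == false under contention. The caller loops until one attempt wins, with a very large attempt cap.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a contended distributed counter (not Curator code).
class RetryUntilSuccess {
    private final AtomicLong counter = new AtomicLong();

    /** One optimistic attempt: returns the pre-increment value, or -1 if it lost the race. */
    long tryIncrementOnce() {
        long current = counter.get();
        Thread.yield(); // widen the race window so that some attempts really do fail
        return counter.compareAndSet(current, current + 1) ? current : -1L;
    }

    /** Repeats the optimistic attempt until it succeeds or maxAttempts is exhausted. */
    long nextId(int maxAttempts) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long id = tryIncrementOnce();
            if (id >= 0) {
                return id;
            }
            // Short random pause so the competing threads do not retry in lockstep.
            Thread.sleep(ThreadLocalRandom.current().nextInt(1, 4));
        }
        throw new IllegalStateException("No ID after " + maxAttempts + " attempts");
    }

    /** Starts the given number of threads, each taking idsPerThread IDs; returns how many distinct IDs were handed out. */
    static int distinctIds(int threads, int idsPerThread, int maxAttempts) throws InterruptedException {
        RetryUntilSuccess generator = new RetryUntilSuccess();
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    ids.add(generator.nextId(maxAttempts));
                }
                return null; // Callable form, so nextId may throw a checked exception
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return ids.size();
    }
}
```

Calling RetryUntilSuccess.distinctIds(30, 10, 1_000_000) mirrors the 30-thread stress test: individual attempts fail, but every caller eventually gets a unique ID. With the real recipe, the retry policy passed to DistributedAtomicLong and to PromotedToLock would take the place of this hand-written loop.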

-Jordan


On September 19, 2014 at 12:38:52 PM, Purshotam Shah ([email protected]) 
wrote:

Hi Jordan,

Same issue with Curator 2.5.0.
I read one of your mail threads (can't find it now) where you said that 
DistributedAtomicLong is not guaranteed to succeed in a multithreaded env, and 
that we have to keep trying until it succeeds.

Is that true? This is becoming a bottleneck in our stress testing. When we 
call DistributedAtomicLong concurrently from multiple threads (30 threads), we 
see a few failures (with retry policy "ExponentialBackoffRetry(1000, 3)").

What is the best way to guarantee that DistributedAtomicLong will always 
succeed?


Thanks,
Puru

From: Purshotam Shah <[email protected]>
Date: Wednesday, June 25, 2014 at 4:12 PM
To: Jordan Zimmerman <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: DistributedAtomicLong fails in multithread env.

Thanks. 
Looks like it’s working fine with Curator 2.5.0.
Will do some more testing. Will respond if it fails with 2.5.0.

Thanks,
Puru.



From: Jordan Zimmerman <[email protected]>
Date: Tuesday, June 24, 2014 at 6:38 PM
To: Purshotam Shah <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: DistributedAtomicLong fails in multithread env.

This sounds like https://issues.apache.org/jira/browse/CURATOR-108 - Curator 
2.5.0 added a new method, initialize(), to work around this issue. Please try 
that and let me know.

-Jordan


From: Purshotam Shah [email protected]
Reply: [email protected][email protected]
Date: June 24, 2014 at 8:20:18 PM
To: [email protected][email protected]
Subject:  DistributedAtomicLong fails in multithread env.

We are using DistributedAtomicLong to generate job sequence IDs in ZK.

We noticed that getZKId fails in a multithreaded env: value.preValue() and 
value.postValue() are 0 and value.succeeded() is false.

If we synchronize the function it works fine, but I don't think that's the 
right approach.

The other approach is to retry multiple times, but how many times? We need to 
make sure that getZKId returns a sequence.

What is the best approach?


    DistributedAtomicLong atomicIdGenerator;
    PromotedToLock.Builder lockBuilder = PromotedToLock.builder()
                    .lockPath(getPromotedLock())
                    .retryPolicy(ZKUtils.getRetryPloicy())
                    .timeout(Service.lockTimeout, TimeUnit.MILLISECONDS);
    atomicIdGenerator = new DistributedAtomicLong(zk.getClient(), ZK_SEQUENCE_PATH,
                    ZKUtils.getRetryPloicy(), lockBuilder.build());

    private long getZKId() {
        if (atomicIdGenerator == null) {
            throw new RuntimeException("Sequence generator can't be null. Path : " + ZK_SEQUENCE_PATH);
        }
        AtomicValue<Long> value;
        try {
            value = atomicIdGenerator.increment();
        }
        catch (Exception e) {
            throw new RuntimeException("Exception incrementing UID for session ", e);
        }
        // Check the result outside of a finally block: returning or throwing
        // from finally would discard the exception (and its cause) thrown above.
        if (value != null && value.succeeded()) {
            return value.preValue();
        }
        throw new RuntimeException("Exception incrementing UID for session ");
    }
