[ 
https://issues.apache.org/jira/browse/CASSANDRA-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumava Ghosh updated CASSANDRA-5830:
-------------------------------------

    Description: 
The following code segment (StorageProxy.java:328) causes the issue. 

start is the start time of the Paxos round, so it is never greater than the 
current value of System.nanoTime(). The difference start - System.nanoTime() 
is therefore never positive and always less than the timeout, so the loop 
condition never becomes false. 

{code:title=StorageProxy.java|borderStyle=solid}
private static UUID beginAndRepairPaxos(long start, ByteBuffer key, CFMetaData metadata,
                                        List<InetAddress> liveEndpoints, int requiredParticipants,
                                        ConsistencyLevel consistencyForPaxos)
    throws WriteTimeoutException
    {
        long timeout = TimeUnit.MILLISECONDS.toNanos(DatabaseDescriptor.getCasContentionTimeout());

        PrepareCallback summary = null;
        while (start - System.nanoTime() < timeout)
        {
            long ballotMillis = summary == null
                              ? System.currentTimeMillis()
                              : Math.max(System.currentTimeMillis(), 1 + UUIDGen.unixTimestamp(summary.inProgressCommit.ballot));
            UUID ballot = UUIDGen.getTimeUUID(ballotMillis);
{code}

Here, Paxos gets stuck when PREPARE returns 'true' but with an 
inProgressCommit. beginAndRepairPaxos() in StorageProxy.java then tries to 
issue a PREPARE and COMMIT for the inProgressCommit. If it repeatedly 
receives 'false' as a PREPARE_RESPONSE, it loops endlessly until a 
PREPARE_RESPONSE is true, because the timeout check never ends the loop. 
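
To illustrate the condition described above, here is a minimal sketch that 
compares the check as quoted with an elapsed-time check in which the operands 
are swapped. It is not the committed fix. The class name PaxosTimeoutCheck, 
both method names, and the 1000 ms timeout are hypothetical and chosen only 
for this example. 

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, for illustration only; not part of Cassandra.
class PaxosTimeoutCheck
{
    // Condition as quoted in the report: start - now is never positive,
    // so for any positive timeout this always returns true.
    static boolean faultyWithinTimeout(long startNanos, long nowNanos, long timeoutNanos)
    {
        return startNanos - nowNanos < timeoutNanos;
    }

    // Elapsed-time form: now - start grows as time passes, so this
    // returns false once the timeout has been reached.
    static boolean withinTimeout(long startNanos, long nowNanos, long timeoutNanos)
    {
        return nowNanos - startNanos < timeoutNanos;
    }

    public static void main(String[] args)
    {
        // Assumed value standing in for DatabaseDescriptor.getCasContentionTimeout().
        long timeout = TimeUnit.MILLISECONDS.toNanos(1000);
        long start = System.nanoTime();
        // Simulate a point in time five seconds after the start.
        long later = start + TimeUnit.SECONDS.toNanos(5);

        System.out.println(faultyWithinTimeout(start, later, timeout)); // prints "true"
        System.out.println(withinTimeout(start, later, timeout));       // prints "false"
    }
}
```

With the operands in this order, a loop guarded by withinTimeout(start, 
System.nanoTime(), timeout) would exit once the contention timeout has 
passed, even if PREPARE keeps returning 'false'. 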

    
> Paxos loops endlessly due to faulty condition check
> ---------------------------------------------------
>
>                 Key: CASSANDRA-5830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5830
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 2.0 beta 2
>            Reporter: Soumava Ghosh
