[ 
https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282898#comment-15282898
 ] 

Marcus Olsson commented on CASSANDRA-11258:
-------------------------------------------

bq. I still don't see this in the latest commits, is there anything I'm 
missing? It seems that we return null on an exception on processSilent, and 
threat null responses as failure to acquire lock on renewLease. How about 
handling the exception on renewLease (instead of processSilent) and checking 
holdsLease before returning the lock?
Sorry, after I added that check I merged the code for update/insert into the 
{{renewLease()}}-method and it got lost in that merge, I'll add it back. Since 
{{renewLease()}} is both for creating new leases and renewing them I think it 
should check both that the lease is held by the local node and that the TTL has 
been updated, otherwise the call to {{lease.renew()}} might be successful even 
though the update of the lease was unsuccessful.

Also, in the case when we renew a lease and it fails by a WTE, what should be 
returned to the caller? Should we throw an exception or return a simple false? 
A lease renewal could be handled either as:
{code}
while(lease.isValid() && !lease.renew(30))
{
 Thread.sleep(1000);
}

if (!lease.isValid())
{
 // Stop repair
}
{code}
or:
{code}
boolean leaseRenewed = false;
while(lease.isValid() && !leaseRenewed)
{
 try {
  lease.renew(30);
  leaseRenewed = true;
 } catch (LeaseException e)
 {
  logger.warn("Unable to renew lease", e);
  Thread.sleep(1000);
 }
}

if (!lease.isValid())
{
 // Stop repair
}
{code}
I'm more for the "return false" approach here the code feels both easier and 
cleaner, WDYT?

---

bq. I don't quite get this case, why won't the row get updated when renewing 
the lease? Can you clarify with an example?
Sure, here's an example where some data is inserted and then updated it with a 
TTL of 1. When the updated data expires the row is still in the database.
{code}
cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
cqlsh> CREATE TABLE test.row_ttl (a text, b text, c text, PRIMARY KEY(a));
cqlsh> INSERT INTO test.row_ttl (a, b, c) VALUES ('a', 'b', 'c') IF NOT EXISTS;

 [applied]
-----------
      True

cqlsh> SELECT * FROM test.row_ttl;
 a | b | c
---+---+---
 a | b | c

(1 rows)
cqlsh> UPDATE test.row_ttl USING TTL 1 SET b = 'b', c = 'c' WHERE a = 'a' IF b 
= 'b';

 [applied]
-----------
      True

# Wait until the data expires
cqlsh> SELECT * FROM test.row_ttl;

 a | b    | c
---+------+------
 a | null | null

(1 rows)
{code}
As you can see the column {{a}} is set, while the other columns are null. This 
makes the next {{INSERT IF NOT EXISTS}} fail as:
{code}
cqlsh> INSERT INTO test.row_ttl (a, b, c) VALUES ('a', 'b', 'c') IF NOT EXISTS;

 [applied] | a | b    | c
-----------+---+------+------
     False | a | null | null
{code}

If you try to update the column {{a}} with the other columns you get the 
following error:
{code}
cqlsh> UPDATE test.row_ttl USING TTL 1 SET a = 'a', b = 'b', c = 'c' WHERE a = 
'a';
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY part a found in 
SET part"
{code}

In our case we have a TTL of 60 seconds for the primary key column so this is 
only a problem if you create the lease and renew it directly for a small 
duration like 10 seconds and let it expire. In that case {{lease.cancel()}} 
wouldn't try to clear the lease and this would make the primary key column 
remain for the full 60 seconds and prevent new leases from being obtained. One 
option could be to always delete the row in {{lease.cancel()}}, but that should 
be an unnecessary operation if the lease is expired. As I stated before, this 
is maybe a small enough issue that it might not make a difference but I'm not 
too fond of having the inconsistency that we sometimes have that column and 
sometimes not, since it disappears after 60 seconds.

> Repair scheduling - Resource locking API
> ----------------------------------------
>
>                 Key: CASSANDRA-11258
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11258
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Marcus Olsson
>            Assignee: Marcus Olsson
>            Priority: Minor
>
> Create a resource locking API & implementation that is able to lock a 
> resource in a specified data center. It should handle priorities to avoid 
> node starvation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to