[
https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282898#comment-15282898
]
Marcus Olsson commented on CASSANDRA-11258:
-------------------------------------------
bq. I still don't see this in the latest commits, is there anything I'm
missing? It seems that we return null on an exception on processSilent, and
threat null responses as failure to acquire lock on renewLease. How about
handling the exception on renewLease (instead of processSilent) and checking
holdsLease before returning the lock?
Sorry, after I added that check I merged the code for update/insert into the
{{renewLease()}}-method and it got lost in that merge, I'll add it back. Since
{{renewLease()}} is both for creating new leases and renewing them I think it
should check both that the lease is held by the local node and that the TTL has
been updated, otherwise the call to {{lease.renew()}} might be successful even
though the update of the lease was unsuccessful.
Also, in the case when we renew a lease and it fails by a WTE, what should be
returned to the caller? Should we throw an exception or return a simple false?
A lease renewal could be handled either as:
{code}
while(lease.isValid() && !lease.renew(30))
{
Thread.sleep(1000);
}
if (!lease.isValid())
{
// Stop repair
}
{code}
or:
{code}
boolean leaseRenewed = false;
while(lease.isValid() && !leaseRenewed)
{
try {
lease.renew(30);
leaseRenewed = true;
} catch (LeaseException e)
{
logger.warn("Unable to renew lease", e);
Thread.sleep(1000);
}
}
if (!lease.isValid())
{
// Stop repair
}
{code}
I'm more for the "return false" approach here the code feels both easier and
cleaner, WDYT?
---
bq. I don't quite get this case, why won't the row get updated when renewing
the lease? Can you clarify with an example?
Sure, here's an example where some data is inserted and then updated it with a
TTL of 1. When the updated data expires the row is still in the database.
{code}
cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
'replication_factor': 1};
cqlsh> CREATE TABLE test.row_ttl (a text, b text, c text, PRIMARY KEY(a));
cqlsh> INSERT INTO test.row_ttl (a, b, c) VALUES ('a', 'b', 'c') IF NOT EXISTS;
[applied]
-----------
True
cqlsh> SELECT * FROM test.row_ttl;
a | b | c
---+---+---
a | b | c
(1 rows)
cqlsh> UPDATE test.row_ttl USING TTL 1 SET b = 'b', c = 'c' WHERE a = 'a' IF b
= 'b';
[applied]
-----------
True
# Wait until the data expires
cqlsh> SELECT * FROM test.row_ttl;
a | b | c
---+------+------
a | null | null
(1 rows)
{code}
As you can see the column {{a}} is set, while the other columns are null. This
makes the next {{INSERT IF NOT EXISTS}} fail as:
{code}
cqlsh> INSERT INTO test.row_ttl (a, b, c) VALUES ('a', 'b', 'c') IF NOT EXISTS;
[applied] | a | b | c
-----------+---+------+------
False | a | null | null
{code}
If you try to update the column {{a}} with the other columns you get the
following error:
{code}
cqlsh> UPDATE test.row_ttl USING TTL 1 SET a = 'a', b = 'b', c = 'c' WHERE a =
'a';
InvalidRequest: code=2200 [Invalid query] message="PRIMARY KEY part a found in
SET part"
{code}
In our case we have a TTL of 60 seconds for the primary key column so this is
only a problem if you create the lease and renew it directly for a small
duration like 10 seconds and let it expire. In that case {{lease.cancel()}}
wouldn't try to clear the lease and this would make the primary key column
remain for the full 60 seconds and prevent new leases from being obtained. One
option could be to always delete the row in {{lease.cancel()}}, but that should
be an unnecessary operation if the lease is expired. As I stated before, this
is maybe a small enough issue that it might not make a difference but I'm not
too fond of having the inconsistency that we sometimes have that column and
sometimes not, since it disappears after 60 seconds.
> Repair scheduling - Resource locking API
> ----------------------------------------
>
> Key: CASSANDRA-11258
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11258
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Marcus Olsson
> Assignee: Marcus Olsson
> Priority: Minor
>
> Create a resource locking API & implementation that is able to lock a
> resource in a specified data center. It should handle priorities to avoid
> node starvation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)