[ https://issues.apache.org/jira/browse/CASSANDRA-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-7489.
---------------------------------------

    Resolution: Won't Fix

This is a very complex change with lots of caveats and corner cases, and it 
really doesn't give us all that much over hourly incremental repair.  (Killing 
tombstones after an hour vs. after a minute isn't that big a win when you're not 
constantly performing major compactions.)

So, I'm glad we have this for the interesting ideas pile, but let's not push 
that rock uphill in the near future.

> Track lower bound necessary for a repair, live, without actually repairing
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7489
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7489
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>              Labels: performance, repair
>
> We will need a few things in place to get this right, but it should be 
> possible to track, live, the current health of a single range across the 
> cluster. If we force an owning node to be the coordinator for every update 
> (so if a non-smart client sends a mutation to a non-owning node, that node 
> simply proxies it on to an owning node to coordinate; this overhead should 
> tend towards zero as smart clients become the norm, and smart clients scale 
> up to cope with huge clusters), then each owner can maintain the oldest 
> timestamp of any update it has coordinated that has not been acknowledged 
> by every owning node it was propagated to. The minimum of these across all 
> owners of a range is the lower bound from which we need to either repair or 
> retain tombstones. With vnode file segregation we can mark an entire vnode 
> range as repaired up to the most recently determined healthy lower bound.
>
> There are some subtleties with this, but it means tombstones can be cleared 
> potentially only minutes after they are generated, instead of days or weeks. 
> It also means repairs can be even more incremental, only operating over 
> ranges and time periods we know to be potentially out of sync.
>
> It will most likely need RAMP transactions in place, so that atomic batch 
> mutations are not serialized on non-owning nodes. Having owning nodes 
> coordinate updates is to ensure robustness in the face of a single node 
> failure: in that case all ranges owned by the down node are considered to 
> have a lower bound of -Inf. Without this, a single node being down would 
> result in the entire cluster being considered out of sync.
>
> We will still need a short grace period for clients to send timestamps, and 
> we would have to outright reject any updates that arrive with a timestamp 
> close to the end of that window. But that window could safely be just 
> minutes.
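
For concreteness, here is a rough sketch of the bookkeeping the quoted
description implies, assuming a per-range tracker on each owning node. None
of the class or method names below exist in Cassandra; they are purely
illustrative. Each owner remembers the oldest timestamp it has coordinated
that is still missing acknowledgements, and a range's healthy lower bound is
the minimum of those values across its owners, collapsing to -Inf whenever
any owner is down:

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    // Illustrative only -- not an existing Cassandra class.
    public class RangeHealthTracker
    {
        // Updates this node has coordinated for one range, keyed by
        // timestamp, that are still awaiting acks from owning replicas.
        private final NavigableMap<Long, Integer> pendingAcksByTimestamp =
                new ConcurrentSkipListMap<>();

        // Called when this (owning) node coordinates a new update.
        public void onCoordinated(long timestampMicros, int replicaCount)
        {
            pendingAcksByTimestamp.put(timestampMicros, replicaCount);
        }

        // Called once per replica acknowledgement; the entry is removed
        // when every owning replica has acknowledged the update.
        public void onAcknowledged(long timestampMicros)
        {
            pendingAcksByTimestamp.computeIfPresent(timestampMicros,
                    (ts, remaining) -> remaining > 1 ? remaining - 1 : null);
        }

        // Oldest timestamp this node has coordinated that is still missing
        // an ack; Long.MAX_VALUE if everything is fully acknowledged.
        public long localLowerBound()
        {
            Map.Entry<Long, Integer> oldest = pendingAcksByTimestamp.firstEntry();
            return oldest == null ? Long.MAX_VALUE : oldest.getKey();
        }

        // The healthy lower bound for a range is the minimum of every
        // owner's local bound; a single down owner contributes -Inf
        // (Long.MIN_VALUE), so the range is never marked repaired past an
        // unreachable node.
        public static long rangeLowerBound(Iterable<Long> ownerBounds,
                                           boolean anyOwnerDown)
        {
            if (anyOwnerDown)
                return Long.MIN_VALUE;

            long min = Long.MAX_VALUE;
            for (long bound : ownerBounds)
                min = Math.min(min, bound);
            return min;
        }
    }

In this sketch a fully-acknowledged owner contributes Long.MAX_VALUE
(effectively "healthy up to now"), so a range's bound only moves backwards
while an update is in flight or an owner is unreachable.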



--
This message was sent by Atlassian JIRA
(v6.2#6252)
