[ 
https://issues.apache.org/jira/browse/CASSANDRA-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649460#comment-14649460
 ] 

T Jake Luciani commented on CASSANDRA-8368:
-------------------------------------------

I don't have a clear design, but it sounds like you did?

{quote}
as an alternative, once CASSANDRA-7237 is complete, I suggest we stop relying 
on hints at all, and do this instead:
1. Store the consistency level as batch metadata
2. On replay, hint in case of a timeout, but not if the node is down as per FD
3. If CL is met, consider the batch replayed and discard it, but not account 
the hints towards CL (as per usual write path), unless CL.ANY is being used
4. If CL is not met, write a new batch with contents of the current one, but 
with timeuuid set in the future, for later replay (delayed by fixed 
configurable time or exponentially backed off). With that new batch store the 
list of nodes we've delivered the hint to, so that next time we replay it we 
don't waste writes.
{quote}
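
To make the four steps above concrete, here is a minimal sketch of the replay decision. It is not taken from the Cassandra code base: the class, enum, and method names (BatchReplaySketch, shouldHint, decide, nextDelayMillis) are hypothetical, replica acknowledgements are reduced to plain counters, and the capped doubling is just one possible reading of "exponentially backed off".

```java
// Hypothetical sketch of the proposed batchlog replay decision.
// None of these names exist in Cassandra; they are for illustration only.
public class BatchReplaySketch {
    enum ConsistencyLevel { ANY, ONE, QUORUM, ALL }
    enum Outcome { DISCARD, RESCHEDULE }

    // Step 2: hint after a timeout, but not when the failure detector
    // already considers the node down.
    static boolean shouldHint(boolean timedOut, boolean aliveAccordingToFD) {
        return timedOut && aliveAccordingToFD;
    }

    // Acknowledgements needed to satisfy the stored consistency level (step 1).
    static int requiredAcks(ConsistencyLevel cl, int replicationFactor) {
        switch (cl) {
            case ANY:
            case ONE:    return 1;
            case QUORUM: return replicationFactor / 2 + 1;
            default:     return replicationFactor; // ALL
        }
    }

    // Steps 3 and 4: hints count towards the CL only for CL.ANY.
    // If the CL is met the batch is discarded, otherwise it is rescheduled.
    static Outcome decide(ConsistencyLevel cl, int replicationFactor, int acked, int hinted) {
        int counted = (cl == ConsistencyLevel.ANY) ? acked + hinted : acked;
        return counted >= requiredAcks(cl, replicationFactor) ? Outcome.DISCARD : Outcome.RESCHEDULE;
    }

    // Step 4: delay before the next replay attempt, doubling per attempt
    // and capped at maxMillis (assumes baseMillis and maxMillis are positive).
    static long nextDelayMillis(long baseMillis, int attempt, long maxMillis) {
        long delay = Math.min(baseMillis, maxMillis);
        for (int i = 0; i < attempt; i++) {
            // Guard against overflow before doubling.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }

    public static void main(String[] args) {
        // One replica acked and two were hinted at QUORUM with RF=3: CL not met.
        System.out.println(decide(ConsistencyLevel.QUORUM, 3, 1, 2));
        // Nothing acked, one hint written at ANY: the hint is allowed to count.
        System.out.println(decide(ConsistencyLevel.ANY, 3, 0, 1));
        // Third retry with a 1 s base delay and a 60 s cap.
        System.out.println(nextDelayMillis(1000, 3, 60000));
    }
}
```

The sketch leaves out the list of nodes that already received the mutation or a hint; a rescheduled batch would carry that list so the next replay can skip them.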


My concern is that, by defaulting to hints, we currently don't provide 
atomicity. If all replicas are down, we effectively end up with CL.ANY, which 
is neither correct nor safe, since hints can be disabled or dropped after 
max_hint_window_in_ms.

{code}
    ReplayWriteResponseHandler<Mutation> handler = replayHandlers.get(i);
    try
    {
        handler.get();
    }
    catch (WriteTimeoutException|WriteFailureException e)
    {
        logger.debug("Failed replaying a batched mutation to a node, will write a hint");
        logger.debug("Failure was : {}", e.getMessage());
        // writing hints for the rest to hints, starting from i
        writeHintsForUndeliveredEndpoints(i);
        return;
    }
{code}

> Consider not using hints for batchlog replay, in any capacity
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-8368
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8368
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Aleksey Yeschenko
>             Fix For: 3.0.0 rc1
>
>
> Currently, when replaying a batch, if a request times out, we simply write a 
> hint for it and call it a day.
> It's simple, but it does tie us to hints, which some people prefer to disable 
> altogether (and some still will even after CASSANDRA-6230).
> It also potentially violates the consistency level of the original request.
> As an alternative, once CASSANDRA-7237 is complete, I suggest we stop relying 
> on hints at all, and do this instead:
> 1. Store the consistency level as batch metadata
> 2. On replay, hint in case of a timeout, but not if the node is down as per FD
> 3. If CL is met, consider the batch replayed and discard it, but not account 
> the hints towards CL (as per usual write path), unless CL.ANY is being used
> 4. If CL is *not* met, write a new batch with contents of the current one, 
> but with timeuuid set in the future, for later replay (delayed by fixed 
> configurable time or exponentially backed off). With that new batch store the 
> list of nodes we've delivered the hint to, so that next time we replay it we 
> don't waste writes.



