[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17082690#comment-17082690
 ] 

David Capwell commented on CASSANDRA-15674:
-------------------------------------------

bq. why you chose the hooks that were added vs using the existing 
onCommit/onAbort API with a delegate or sub-class implementation used only for 
IndexSummaryRedistribution.

IndexSummaryRedistribution mutates disk in-place but relies on the metrics to 
be updated via transactions.  Right now the commit case works since 
IndexSummaryRedistribution will delete the current size (before the mutation) 
so the readd will update the size.  In the abort case we need to know the 
deltas (can only compute right after mutating), so would need a list of deltas 
to apply in onAbort.  This logic is very specific to IndexSummaryRedistribution 
and would rely on extending LifecycleTransaction to apply this list. I felt 
that commit/abort hooks were a generalization of this and could be used if 
other use cases needed.

bq. Pending counter: can you say a bit more about how you would see an operator 
using this metric?

PendingSSTableReleases was mostly added for tests to know when everything is 
fully released (couldn't find any other way).  Its also the reason 
liveDiskSpaceUsed !=totalDiskSpaceUsed, so it also exposes to operators that 
things line up (if they are not equal then pending should be > 0; else there is 
a bug).

bq. consider renaming Listener#onPreSample to Listener#beforeResample

Done.

bq. IndexSummaryRedistribution constructor should be marked @VisibleForTest

Done

bq. DiskSpaceMetricsTest: Is the TODO on L125 still valid or can it be removed? 
Looks like the latter

There is desire to extract the jvm dtest tests out so they can run against any 
version.  This TODO is mostly commenting that the test could not be written 
generic to versions since it depends on JMX values which are not implemented in 
jvm dtests.  

I can file a ticket and remove the TODO.


> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15674
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction, Observability/Metrics
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to