[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-05-06 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100762#comment-17100762
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15674:
-

Hi [~dcapwell] and [~marcuse],
Is there anything else required for this one?


> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095868#comment-17095868
 ] 

David Capwell commented on CASSANDRA-15674:
---

The trunk build Python dtests failed because of CASSANDRA-15776, sent out patch 
to fix those (unrelated to this one).

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-28 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094973#comment-17094973
 ] 

David Capwell commented on CASSANDRA-15674:
---

Patches per version

||branch||link||CI|
|3.11|[Git 
HubI|https://github.com/dcapwell/cassandra/tree/bug/negativeDiskSpace-3.11]|[Circle
 
CI|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=bug%2FnegativeDiskSpace-3.11]|

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-20 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088022#comment-17088022
 ] 

David Capwell commented on CASSANDRA-15674:
---

[~marcuse] pushed based off your feedback 
[here|https://github.com/apache/cassandra/pull/500/commits/27e5cd7132515ab0bd114731417913c40e4d7789].
 The only change from your branch is I still verify totalDiskSpaceUsed (was 
removed from your patch, not sure why).

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-20 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088023#comment-17088023
 ] 

David Capwell commented on CASSANDRA-15674:
---

[Circle 
CI|https://circleci.com/workflow-run/c5d5c7c5-9c75-4e7e-a122-9771663e5451]

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-17 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085667#comment-17085667
 ] 

Marcus Eriksson commented on CASSANDRA-15674:
-

The patch fixes the immediate bug, so +1, but we should really refactor how 
index summary redistribution uses transactions as it is quite broken (not 
really transactional at all), I'll create a new jira for that.

pushed a few updates 
[here|https://github.com/krummas/cassandra/commits/david/15674] - this cleans 
up the test a bit to avoid having to add the metric and the Listener interface 
to IndexSummaryRedistribution


> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-15 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084281#comment-17084281
 ] 

David Capwell commented on CASSANDRA-15674:
---

[~marcuse] I moved the test to use CQLTester based off your comment: see 
[here|https://github.com/apache/cassandra/pull/500/commits/902354e378429292b63c7127171856312102da22]

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-14 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083438#comment-17083438
 ] 

David Capwell commented on CASSANDRA-15674:
---

bq. We should probably stick to consistent nomenclature, of commit/abort, 
rather than introduce commit/rollback?

Fixed.

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-13 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082750#comment-17082750
 ] 

Jordan West commented on CASSANDRA-15674:
-

Ah yes. Good catch [~benedict] 

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-13 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082749#comment-17082749
 ] 

Benedict Elliott Smith commented on CASSANDRA-15674:


I haven't looked closely at the rationale, but I'm at least a bit surprised at 
the mixed terminology.  We should probably stick to consistent nomenclature, of 
commit/abort, rather than introduce commit/rollback?

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-13 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082696#comment-17082696
 ] 

Jordan West commented on CASSANDRA-15674:
-

Thanks for the explanations. I'm +1 on the patch. Moving the TODO to a ticket 
where it can be tracked wfm. 

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-13 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082690#comment-17082690
 ] 

David Capwell commented on CASSANDRA-15674:
---

bq. why you chose the hooks that were added vs using the existing 
onCommit/onAbort API with a delegate or sub-class implementation used only for 
IndexSummaryRedistribution.

IndexSummaryRedistribution mutates disk in-place but relies on the metrics to 
be updated via transactions.  Right now the commit case works since 
IndexSummaryRedistribution will delete the current size (before the mutation) 
so the readd will update the size.  In the abort case we need to know the 
deltas (can only compute right after mutating), so would need a list of deltas 
to apply in onAbort.  This logic is very specific to IndexSummaryRedistribution 
and would rely on extending LifecycleTransaction to apply this list. I felt 
that commit/abort hooks were a generalization of this and could be used if 
other use cases needed.

bq. Pending counter: can you say a bit more about how you would see an operator 
using this metric?

PendingSSTableReleases was mostly added for tests to know when everything is 
fully released (couldn't find any other way).  Its also the reason 
liveDiskSpaceUsed !=totalDiskSpaceUsed, so it also exposes to operators that 
things line up (if they are not equal then pending should be > 0; else there is 
a bug).

bq. consider renaming Listener#onPreSample to Listener#beforeResample

Done.

bq. IndexSummaryRedistribution constructor should be marked @VisibleForTest

Done

bq. DiskSpaceMetricsTest: Is the TODO on L125 still valid or can it be removed? 
Looks like the latter

There is desire to extract the jvm dtest tests out so they can run against any 
version.  This TODO is mostly commenting that the test could not be written 
generic to versions since it depends on JMX values which are not implemented in 
jvm dtests.  

I can file a ticket and remove the TODO.


> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-04-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080854#comment-17080854
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15674:
-

Hi [~jrwest],
I see you are assigned as a reviewer, are you looking into this?
Do you need help?

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-03-31 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072028#comment-17072028
 ] 

David Capwell commented on CASSANDRA-15674:
---

to handle this I added 2 hooks into the transaction: onCommit, onRollback.  
Both the hooks apply the update, but apply different updates.  onCommit deletes 
the whole file since the transaction will add it back with the latest size, 
onRollback removes the diff (the disk mutation is done outside of the 
transaction, so still need to apply).

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15674) liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted

2020-03-27 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069158#comment-17069158
 ] 

David Capwell commented on CASSANDRA-15674:
---

Prototyping and I see that the reason for the transaction is to swap 
SSTableReaders into the view.  Summary is a local disk thing but we store it 
in-memory; the swap will replace in-memory with the new summary.

The issue remains that we mutate in-place so roll back doesn't happen.

> liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if 
> IndexSummaryRedistribution gets interrupted
> -
>
> Key: CASSANDRA-15674
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15674
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Observability/Metrics
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>
> IndexSummaryRedistribution is a compaction task and as such extends Holder 
> and supports cancelation by throwing a CompactionInterruptedException.  The 
> issue is that IndexSummaryRedistribution tries to use transactions, but 
> mutates the sstable in-place; transaction is unable to roll back.
> This would be fine (only updates summary) if it wasn’t for the fact the task 
> attempts to also mutate the two metrics liveDiskSpaceUsed and 
> totalDiskSpaceUsed, since these can’t be rolled back any cancelation could 
> corrupt these metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org