[ 
https://issues.apache.org/jira/browse/CASSANDRA-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960291#comment-14960291
 ] 

Benedict commented on CASSANDRA-10538:
--------------------------------------

Yes, it looks like we did. This only matters for abort, since for commit we want 
to throw either way, but we expect to do this in the caller 
({{LifecycleTransaction}}), so catching and returning the exceptions in both _is_ 
most suitable.
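
To make the "catch and return" idea concrete, here is a minimal sketch of an abort path that accumulates failures and hands them back to the caller instead of throwing. All names here ({{AccumulatingAbortSketch}}, {{Step}}, {{merge}}) are hypothetical and only illustrate the pattern; this is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transactional resource; not Cassandra's actual classes.
final class AccumulatingAbortSketch
{
    /** A cleanup step that may fail, e.g. writing an abort record to a full disk. */
    interface Step
    {
        void run() throws Exception;
    }

    /** Merges a new failure into the accumulated one, keeping both via suppression. */
    static Throwable merge(Throwable accumulate, Throwable t)
    {
        if (accumulate == null)
            return t;
        accumulate.addSuppressed(t);
        return accumulate;
    }

    /** Runs every step, catching failures and returning them instead of throwing. */
    static Throwable abort(Throwable accumulate, List<Step> steps)
    {
        for (Step step : steps)
        {
            try
            {
                step.run();
            }
            catch (Throwable t)
            {
                // Record the failure and carry on so later steps still run.
                accumulate = merge(accumulate, t);
            }
        }
        return accumulate;
    }

    public static void main(String[] args)
    {
        List<String> ran = new ArrayList<>();
        List<Step> steps = new ArrayList<>();
        steps.add(() -> { throw new java.io.IOException("disk full"); });
        steps.add(() -> ran.add("release resources"));

        // The caller decides whether to rethrow the accumulated failure.
        Throwable result = abort(null, steps);
        System.out.println(result + " / steps completed: " + ran);
    }
}
```

With this shape the caller alone decides when to throw, so a failure in one step cannot prevent the remaining cleanup from running.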

Once things quiet down we should really try to introduce fault injection tests 
for this subsystem so we can easily cover this kind of scenario.


> Assertion failed in LogFile when disk is full
> ---------------------------------------------
>
>                 Key: CASSANDRA-10538
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10538
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>         Attachments: 
> ma_txn_compaction_67311da0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_696059b0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_8ac58b70-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_8be24610-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_95500fc0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_a41caa90-72b4-11e5-9eb9-b14fa4bbe709.log
>
>
> [~carlyeks] was running a stress job which filled up the disk. At the end of 
> the system logs there are several assertion errors:
> {code}
> ERROR [CompactionExecutor:1] 2015-10-14 20:46:55,467 CassandraDaemon.java:195 - Exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.RuntimeException: Insufficient disk space to write 2097152 bytes
>         at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.getWriteDirectory(CompactionAwareWriter.java:156) ~[main/:na]
>         at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:77) ~[main/:na]
>         at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:110) ~[main/:na]
>         at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) ~[main/:na]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[main/:na]
>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na]
>         at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:220) ~[main/:na]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_40]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
> INFO  [IndexSummaryManager:1] 2015-10-14 21:10:40,099 IndexSummaryManager.java:257 - Redistributing index summaries
> ERROR [IndexSummaryManager:1] 2015-10-14 21:10:42,275 CassandraDaemon.java:195 - Exception in thread Thread[IndexSummaryManager:1,1,main]
> java.lang.AssertionError: Already completed!
>         at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:221) ~[main/:na]
>         at org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:376) ~[main/:na]
>         at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
>         at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:259) ~[main/:na]
>         at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
>         at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:193) ~[main/:na]
>         at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.close(Transactional.java:158) ~[main/:na]
>         at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:242) ~[main/:na]
>         at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:134) ~[main/:na]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>         at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolE
> {code}
> We should not use an assertion for a condition that can occur when the disk is 
> full; a runtime exception would be more appropriate.
> I would also like to understand exactly what triggered the assertion. 
> {{LifecycleTransaction}} can throw at the beginning of the commit method if 
> it cannot write the record to disk, in which case all we have to do is ensure 
> we update the records in memory after writing to disk (currently we update 
> them before). However, I am not sure this is what happened here; it looks 
> more like abort was called twice, which should never happen.
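
The ordering described above (write the record to disk first, update in-memory state only afterwards) can be sketched as follows. This is a simplified illustration under stated assumptions: {{DiskFirstLogSketch}} and {{RecordWriter}} are hypothetical names and do not mirror the real {{LogFile}} implementation. The sketch also replaces the assertion with a runtime exception, as the description suggests.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified transaction log; names do not mirror Cassandra's LogFile.
final class DiskFirstLogSketch
{
    /** Assumed abstraction over the on-disk write, so a full disk can be simulated. */
    interface RecordWriter
    {
        void write(String record) throws IOException;
    }

    private final RecordWriter writer;
    private final List<String> records = new ArrayList<>();
    private boolean completed;

    DiskFirstLogSketch(RecordWriter writer)
    {
        this.writer = writer;
    }

    void commit() throws IOException
    {
        complete("COMMIT");
    }

    void abort() throws IOException
    {
        complete("ABORT");
    }

    private void complete(String record) throws IOException
    {
        // A runtime exception rather than an assert, so the check also fires without -ea.
        if (completed)
            throw new IllegalStateException("Already completed!");

        // Write to disk first; if this throws, the in-memory state is left untouched.
        writer.write(record);

        // Only now reflect the record in memory.
        records.add(record);
        completed = true;
    }

    boolean isCompleted()
    {
        return completed;
    }

    List<String> records()
    {
        return records;
    }

    public static void main(String[] args) throws IOException
    {
        boolean[] diskFull = { true };
        DiskFirstLogSketch log = new DiskFirstLogSketch(r -> {
            if (diskFull[0])
                throw new IOException("Insufficient disk space");
        });

        try
        {
            log.commit();
        }
        catch (IOException e)
        {
            System.out.println("commit failed, completed = " + log.isCompleted());
        }

        diskFull[0] = false;
        log.abort();
        System.out.println("records = " + log.records());
    }
}
```

Because a failed commit leaves the log uncompleted, a subsequent abort does not trip the "Already completed!" check; only a genuine second completion does.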



