[jira] [Commented] (CASSANDRA-10538) Assertion failed in LogFile when disk is full

Ariel Weisberg (JIRA) Wed, 11 Nov 2015 10:44:56 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000871#comment-15000871
 ]


Ariel Weisberg commented on CASSANDRA-10538:
--------------------------------------------

How does Throwables.perfom handle AssertionError? It looks like it swallows it? 
Seems like AssertionError shouldn't be caught and should be allowed to 
terminate the JVM?

To make sure I understand the fix. The issue was that we marked something 
committed in memory when committing (or aborting) fails to persist to disk 
because the disk is full. The fix was to write to disk first then memory, and 
if writing to disk for commit fails we can hit the abort path and then that can 
fail as well. 

Or is this hitting abort and abort like you would expect given that the disk is 
full and the transaction probably can't complete successfully?

> Assertion failed in LogFile when disk is full
> ---------------------------------------------
>
>                 Key: CASSANDRA-10538
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10538
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>         Attachments: 
> ma_txn_compaction_67311da0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_696059b0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_8ac58b70-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_8be24610-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_95500fc0-72b4-11e5-9eb9-b14fa4bbe709.log, 
> ma_txn_compaction_a41caa90-72b4-11e5-9eb9-b14fa4bbe709.log
>
>
> [~carlyeks] was running a stress job which filled up the disk. At the end of 
> the system logs there are several assertion errors:
> {code}
> ERROR [CompactionExecutor:1] 2015-10-14 20:46:55,467 CassandraDaemon.java:195 
> - Exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.RuntimeException: Insufficient disk space to write 2097152 bytes
>         at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.getWriteDirectory(CompactionAwareWriter.java:156)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:77)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:110)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:220)
>  ~[main/:na]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_40]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_40]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_40]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_40]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
> INFO  [IndexSummaryManager:1] 2015-10-14 21:10:40,099 
> IndexSummaryManager.java:257 - Redistributing index summaries
> ERROR [IndexSummaryManager:1] 2015-10-14 21:10:42,275 
> CassandraDaemon.java:195 - Exception in thread 
> Thread[IndexSummaryManager:1,1,main]
> java.lang.AssertionError: Already completed!
>         at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:221) 
> ~[main/:na]
>         at 
> org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:376)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:259)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:193)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.close(Transactional.java:158)
>  ~[main/:na]
>         at 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:242)
>  ~[main/:na]
>         at 
> org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:134)
>  ~[main/:na]
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[main/:na]
>         at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolE
> {code}
> We should not have an assertion if it can happen when the disk is full, we 
> should rather have a runtime exception.
> I also would like to understand exactly what triggered the assertion. 
> {{LifecycleTransaction}} can throw at the beginning of the commit method if 
> it cannot write the record to disk, in which case all we have to do is ensure 
> we update the records in memory after writing to disk (currently we update 
> them before). However, I am not sure this is what happened here, it looks 
> more like abort was called twice, which should never happen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-10538) Assertion failed in LogFile when disk is full

Reply via email to