[
https://issues.apache.org/jira/browse/CASSANDRA-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952941#comment-14952941
]
Benedict commented on CASSANDRA-10421:
--------------------------------------
bq. It doesn't however change the fact that we write more and have therefore
more chances of failing to do so
Right. This doesn't expose us to anything too terrible though, really, since
this should clearly be infrequent, or else there are bigger problems. If there
is a disk failure it is quite reasonable to mark the disk as unavailable and
start again (from scratch)\*.
bq. Unless you meant to not store the relative file path (the path of an
sstable relative to a txn file),
Yes, I now think it is probably best to store the full path (or, if we wanted
to be helpful, the path relative to the common ancestor of all data disks).
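For concreteness, a rough sketch of the second option (made-up class and method
names, not the actual patch): relativise each component path against the common
ancestor of the configured data disks, falling back to the full path when there
is no useful ancestor.
{code:java}
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public final class LogRecordPathSketch
{
    // Longest common ancestor of all configured data directories, or null if none exists.
    static Path commonAncestor(List<Path> dataDirs)
    {
        Path common = dataDirs.get(0).toAbsolutePath().normalize();
        for (Path dir : dataDirs)
        {
            Path p = dir.toAbsolutePath().normalize();
            while (common != null && !p.startsWith(common))
                common = common.getParent();
        }
        return common;
    }

    // The path we would write into the txn log record: relative to the common ancestor
    // when one exists, otherwise the full absolute path.
    static String recordPath(Path sstableComponent, List<Path> dataDirs)
    {
        Path ancestor = commonAncestor(dataDirs);
        Path abs = sstableComponent.toAbsolutePath().normalize();
        return ancestor != null ? ancestor.relativize(abs).toString() : abs.toString();
    }

    public static void main(String[] args)
    {
        List<Path> dataDirs = Arrays.asList(Paths.get("/disk1/data"), Paths.get("/disk2/data"));
        // Prints "disk1/data/ks/tbl/ma-1-big-Data.db": the record can be resolved against
        // the common ancestor ("/") regardless of which data disk the component lives on.
        System.out.println(recordPath(Paths.get("/disk1/data/ks/tbl/ma-1-big-Data.db"), dataDirs));
    }
}
{code}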
bq. To be clear, we would roll back the transaction if we failed to write a
record to any file except for the last record, where we could tolerate failures
provided we have written to at least one file?
That's how I would do it, yes, as it has the most semantic consistency. With
one addendum: on restart, I would treat *any* incomplete (or missing) final
record the same as our current incomplete-final-record behaviour, i.e. we
enter paranoid mode and make sure all of the files are present. If they
aren't, we leave them all, just in case. This means we leave some extra
duplicate data on disk in some cases, when a restart follows a temporary disk
problem, but that's a fine tradeoff for simplicity and safety in my book:
these scenarios should be rare, and they are clearly hardware issues, where
paranoia is probably best. This does technically break the semantic-consistency
argument on restart, but I think acceptably.
\* It may be that we could detect disk failure here and also retain more data,
but it is highly unlikely this complexity would pay dividends given the window
of effect, and the per-disk vnode allocations coming in the near future.
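To make the record-appending rules above concrete, here is a rough sketch and
nothing more - it is not the LogTransaction code, and LogReplica plus every
method name below are placeholders for whatever per-directory log file
abstraction we end up with:
{code:java}
import java.io.IOException;
import java.util.List;

final class TxnLogSketch
{
    // Placeholder for a txn log file living in one data directory.
    interface LogReplica
    {
        void append(String record) throws IOException;
        boolean hasCompleteFinalRecord();
    }

    private final List<LogReplica> replicas; // one per data directory

    TxnLogSketch(List<LogReplica> replicas) { this.replicas = replicas; }

    // Intermediate records must reach every replica; any failure propagates and the
    // caller rolls back the transaction.
    void append(String record) throws IOException
    {
        for (LogReplica replica : replicas)
            replica.append(record);
    }

    // The final (commit/abort) record tolerates per-disk failures, provided it was
    // written to at least one replica.
    void appendFinal(String record) throws IOException
    {
        int written = 0;
        IOException lastFailure = null;
        for (LogReplica replica : replicas)
        {
            try
            {
                replica.append(record);
                written++;
            }
            catch (IOException e)
            {
                lastFailure = e;
            }
        }
        if (written == 0)
            throw lastFailure != null ? lastFailure : new IOException("no log replicas");
    }

    // On restart: if any replica is missing its final record (or it is incomplete), fall
    // back to the existing behaviour - go paranoid, verify every file referenced by the
    // log is present, and if anything is missing leave it all in place.
    boolean requiresParanoidCheck()
    {
        for (LogReplica replica : replicas)
            if (!replica.hasCompleteFinalRecord())
                return true;
        return false;
    }
}
{code}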
> Potential issue with LogTransaction as it only checks in a single directory
> for files
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-10421
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10421
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Stefania
> Priority: Blocker
> Fix For: 3.0.0 rc2
>
>
> When creating a new LogTransaction we try to create the new logfile in the
> same directory as the one we are writing to, but as we use
> {{[directories.getDirectoryForNewSSTables()|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogTransaction.java#L125]}}
> this might end up in "any" of the configured data directories. If it does,
> we will not be able to clean up leftovers, since we only check for files in
> the directory where the logfile was created:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java#L163
> cc [~Stefania]
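For anyone skimming the quoted description, a rough illustration of the problem
(made-up names, not the actual LogRecord code): leftovers are searched for only
in the folder holding the txn log file, so components living on a different
data disk are never found; the fix needs to consider every configured data
directory.
{code:java}
import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LeftoverScanSketch
{
    // Roughly what happens today: only the folder containing the txn log file is searched,
    // so leftovers written to another data directory are missed.
    static List<File> scanLogFileFolder(File txnLogFolder, FilenameFilter filter)
    {
        File[] found = txnLogFolder.listFiles(filter);
        return found == null ? new ArrayList<>() : new ArrayList<>(Arrays.asList(found));
    }

    // What the fix needs: search every configured data directory for matching components.
    static List<File> scanAllDataDirectories(List<File> dataDirectories, FilenameFilter filter)
    {
        List<File> result = new ArrayList<>();
        for (File dir : dataDirectories)
        {
            File[] found = dir.listFiles(filter);
            if (found != null)
                result.addAll(Arrays.asList(found));
        }
        return result;
    }
}
{code}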