[ 
https://issues.apache.org/jira/browse/CASSANDRA-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-14383:
---------------------------------------
    Description: 
We can't catch fsync errors and continue, so we shouldn't have code in C* that 
does that. There was a Postgres bug where fsync returned an error and the FS 
lost data, but subsequent fsyncs succeeded.

The [LastErrorException code in 
NativeLibrary.trySync|https://github.com/apache/cassandra/commit/be313935e54be450d9aaabda7965a2f266e922c9#diff-4258621cdf765f0fea6770db5d40038fR307]
 looks a little janky. What's up with that? When would trySync be something we 
would merely try? If trying is good enough, why do it at all, considering that 
trying is the default behavior of a series of unsynced filesystem operations?
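
To make the contrast with a "try" style sync concrete, here is a minimal sketch of a sync helper that never swallows a failure. The class and method names (StrictSync, syncOrThrow) are hypothetical and are not taken from NativeLibrary; the sketch uses the standard FileChannel.force call rather than the native fsync path, and rethrowing as UncheckedIOException is only one possible way to surface the error.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StrictSync {
    /** Hypothetical helper: force the channel to disk and propagate any failure. */
    static void syncOrThrow(FileChannel channel) {
        try {
            channel.force(true); // flush file content and metadata
        } catch (IOException e) {
            // No log-and-continue here: after a failed sync the durability
            // of earlier writes is unknown, so the caller has to see the error.
            throw new UncheckedIOException("fsync failed; durability is unknown", e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("strict-sync", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
            syncOrThrow(ch); // succeeds on a healthy filesystem
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

What the caller does with the propagated error (stop the node, go read-only, and so on) is a separate policy decision that this sketch leaves open.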

-Also, when we fsync an FD it's not just fsyncing that file; the FS is 
potentially fsyncing other data, and the error code we get could be related to 
that other data, so we can't safely ignore it. The filesystem could be 
internally inconsistent as well. This happens because FS journaling may 
force the FS to flush other data to preserve the ordering requirements 
of journaled metadata.- I'm actually not 100% sure when/if this is the case.

If we ignore fsync errors, it needs to be for whitelisted reasons such as a bad 
FD.
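
The whitelist idea could look roughly like the sketch below. Everything in it is an assumption for illustration: the class name FsyncErrorPolicy, the Action enum, and the choice to key on raw errno values (EBADF is 9 and EIO is 5 on Linux). The only ignorable reason named in this issue is a bad FD, so that is the only entry.

```java
import java.util.Set;

public class FsyncErrorPolicy {
    /** errno for "bad file descriptor"; the value 9 assumes Linux. */
    static final int EBADF = 9;

    /** Hypothetical whitelist: the only example given in the issue is a bad FD. */
    static final Set<Integer> IGNORABLE_ERRNOS = Set.of(EBADF);

    enum Action { IGNORE, FAIL }

    /** Anything not explicitly whitelisted is treated as a failure. */
    static Action classify(int errno) {
        return IGNORABLE_ERRNOS.contains(errno) ? Action.IGNORE : Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(classify(EBADF)); // prints IGNORE
        System.out.println(classify(5));     // EIO on Linux, prints FAIL
    }
}
```

The point of the default-to-FAIL shape is that an unknown or new error code is never silently ignored.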

I know we have FSErrorHandler and it makes sense for reads, but I'm not sold on 
it being the right answer for writes. To my knowledge, we don't retry flushing 
a memtable or writing to the commit log. We could go read-only, and I need to 
check whether that is what we do in practice.


> If fsync fails it's always an issue and continuing execution is suspect
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-14383
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14383
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>            Priority: Major
>             Fix For: 2.1.x, 3.0.x, 3.11.x, 4.0.x


