[ 
https://issues.apache.org/jira/browse/CASSANDRA-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharvanath Pathak updated CASSANDRA-10534:
------------------------------------------
    Description: 
I was seeing SSTable corruption due to a CompressionInfo.db file of size 0; 
this happened multiple times in our testing with hard node reboots. After some 
investigation it seems that this file is not being fsynced, which can 
potentially lead to data corruption. I am working with version 2.1.9.

I checked for fsync calls using strace, and found them happening for all but 
the following components: CompressionInfo, TOC.txt and digest.sha1. A missing 
fsync seems tolerable for all of these except CompressionInfo. A quick look 
through the code also did not reveal any fsync calls. Moreover, I suspect that 
commit 4e95953f29d89a441dfe06d3f0393ed7dd8586df 
(https://github.com/apache/cassandra/commit/4e95953f29d89a441dfe06d3f0393ed7dd8586df#diff-b7e48a1398e39a936c11d0397d5d1966R344)
caused the regression, since it removed the 
{noformat}
 getChannel().force(true);
{noformat}
call from CompressionMetadata.Writer.close.
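
For illustration, here is a minimal, self-contained sketch of the pattern the 
removed call provided: forcing a file's contents and metadata to disk before 
the channel is closed. This is not the Cassandra code; the class DurableWrite, 
the method writeAndSync and the temp file name are hypothetical and only show 
how FileChannel.force(true) is typically used.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DurableWrite {
    // Hypothetical helper (not from Cassandra): writes all bytes to the file
    // and asks the OS to flush them to the storage device before closing.
    public static void writeAndSync(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            // A single write() may be partial, so loop until the buffer is drained.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // force(true) requests that both file content and file metadata
            // be written to the device. Without it, a hard reboot shortly
            // after close() may leave an empty or truncated file.
            channel.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed demo file name; any writable path would do.
        Path tmp = Files.createTempFile("compression-info-demo", ".db");
        writeAndSync(tmp, "offsets".getBytes(StandardCharsets.UTF_8));
        System.out.println("bytes on disk: " + Files.size(tmp));
        Files.delete(tmp);
    }
}
```

Closing a channel or stream alone only hands the data to the operating system; 
it may still sit in the page cache when the node loses power, which would be 
consistent with the zero-length file described above.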


> CompressionInfo not being fsynced on close
> ------------------------------------------
>
>                 Key: CASSANDRA-10534
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10534
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Sharvanath Pathak
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
