[ 
https://issues.apache.org/jira/browse/CASSANDRA-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-20692:
---------------------------------------
    Description: 
[~maxwellguo] spotted this a few days ago.

The commit log is not safe as currently written with Direct IO. It writes to
the file, but doesn't sync the metadata on flush. That means the commit log may
claim it has flushed (and made durable) the data while the filesystem journal
has not been flushed, so the recorded file length could be wrong and the file
could be truncated on restart.
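
To illustrate what "sync the metadata on flush" means, here is a minimal sketch using only standard JDK calls. It is not the Cassandra commit log code: the class {{FlushWithMetadata}} and method {{appendAndSync}} are hypothetical names, and it uses ordinary buffered IO rather than Direct IO. The point it shows is that {{FileChannel.force(true)}} asks for both content and metadata (including the file length) to reach storage before the flush is reported as done.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Cassandra.
final class FlushWithMetadata {
    /** Appends data to the segment and returns the file length after the sync. */
    static long appendAndSync(Path segment, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.position(ch.size());            // append at the current end of file
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);                 // a single write may be partial
            }
            // force(true) requests that content AND metadata be written to
            // storage; force(false) is only required to cover content.
            ch.force(true);
            return ch.size();
        }
    }
}
```

Only after {{force(true)}} returns would it be reasonable to report the appended bytes as durable.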

Additionally, Direct IO doesn't actually make data durable on disk (it emits no
write barriers); it just bypasses the page cache. [It doesn't even guarantee the
disk transaction is
complete.|https://man7.org/linux/man-pages/man2/open.2.html#:~:text=The%20O_DIRECT%20flag,for%20further%20discussion.]
If the disk cache is volatile, both metadata and data can be lost.

It can probably be fixed fairly trivially by opening the file with {{O_DSYNC}}.
The commit log writes up to the entire segment when it flushes, so there is no
need to add buffering to avoid too many small writes.
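
A rough sketch of the proposed fix, assuming the JDK's {{StandardOpenOption.DSYNC}} is the option used to request synchronous data writes (on Linux this is expected to correspond to {{O_DSYNC}}). The class {{DsyncSegmentWriter}} and method {{writeSegment}} are hypothetical names, not Cassandra code. The Direct IO option is deliberately left out because it needs block-aligned buffers, which would obscure the point.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Cassandra.
final class DsyncSegmentWriter {
    /** Writes one large buffer to the segment and returns the resulting file length. */
    static long writeSegment(Path segment, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.DSYNC)) {   // each write is synchronous for file content
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);                 // one large write, so no extra buffering layer
            }
            return ch.size();
        }
    }
}
```

Because every write is synchronous under this option, the approach only makes sense when writes are large, which matches the observation that the commit log writes up to a whole segment per flush.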

[~amit_pawar] [~jlewandowski] [~blambov] 

  was:
[~maxwellguo] spotted this a few days ago.

The commit log is not safe as currently written with Direct IO. It writes to
the file, but doesn't sync the metadata on flush. That means the commit log may
claim it has flushed (and made durable) the data while the filesystem journal
has not been flushed, so the recorded file length could be wrong and the file
could be truncated on restart.

Additionally, Direct IO doesn't actually make data durable on disk (it emits no
write barriers); it just bypasses the page cache. It doesn't even guarantee the
disk transaction is complete. If the disk cache is volatile, both metadata and
data can be lost.

It can probably be fixed fairly trivially by opening the file with {{O_DSYNC}}.
The commit log writes up to the entire segment when it flushes, so there is no
need to add buffering to avoid too many small writes.

[~amit_pawar] [~jlewandowski] [~blambov] 


> Direct IO commit log does not flush data safely
> -----------------------------------------------
>
>                 Key: CASSANDRA-20692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20692
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Ariel Weisberg
>            Assignee: guo Maxwell
>            Priority: Urgent
>             Fix For: 5.0.x, 5.x
>
>
> [~maxwellguo] spotted this a few days ago.
> The commit log is not safe as currently written with Direct IO. It writes to
> the file, but doesn't sync the metadata on flush. That means the commit log
> may claim it has flushed (and made durable) the data while the filesystem
> journal has not been flushed, so the recorded file length could be wrong and
> the file could be truncated on restart.
> Additionally, Direct IO doesn't actually make data durable on disk (it emits
> no write barriers); it just bypasses the page cache. [It doesn't even
> guarantee the disk transaction is
> complete.|https://man7.org/linux/man-pages/man2/open.2.html#:~:text=The%20O_DIRECT%20flag,for%20further%20discussion.]
> If the disk cache is volatile, both metadata and data can be lost.
> It can probably be fixed fairly trivially by opening the file with
> {{O_DSYNC}}. The commit log writes up to the entire segment when it flushes,
> so there is no need to add buffering to avoid too many small writes.
> [~amit_pawar] [~jlewandowski] [~blambov] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
