[ 
https://issues.apache.org/jira/browse/CASSANDRA-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517742#comment-17517742
 ] 

Elliott Sims commented on CASSANDRA-17473:
------------------------------------------

Commented on the thread, but also adding it here in case it's useful:
I've found that GNU tar interprets ctime changes as file changes.  That 
includes inode metadata changes like the hardlink count, which would be 
expected to change in Cassandra if the original sstable is compacted away or if 
additional overlapping snapshots are created.  There's more detail in a very 
old discussion here:  
[https://lists.gnu.org/archive/html/bug-tar/2007-08/msg00013.html].  I ended 
up working around it by using bsdtar instead, which does not treat ctime 
changes as file changes.
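
The ctime effect can be reproduced outside Cassandra. Below is a minimal 
sketch, assuming Linux with GNU coreutils (stat -c is GNU-specific); the file 
names are made up for illustration and are not real sstable paths. It 
hard-links a file the way a snapshot does, removes the original name the way a 
compaction would, and compares the metadata of the remaining link.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils stat. File names are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

live="$workdir/live-Data.db"       # stands in for the live sstable
snap="$workdir/snapshot-Data.db"   # stands in for the snapshot hard link

printf 'sstable bytes\n' > "$live"
ln "$live" "$snap"                 # a snapshot is a hard link to the same inode

# %h = hard link count, %y = mtime, %z = ctime (both human-readable)
links_before=$(stat -c %h "$snap")
mtime_before=$(stat -c %y "$snap")
ctime_before=$(stat -c %z "$snap")
sum_before=$(cksum < "$snap")

sleep 1                            # make sure the clock has visibly advanced
rm "$live"                         # "compaction" removes the original name

links_after=$(stat -c %h "$snap")
mtime_after=$(stat -c %y "$snap")
ctime_after=$(stat -c %z "$snap")
sum_after=$(cksum < "$snap")

echo "links: $links_before -> $links_after"
echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
echo "cksum: $sum_before -> $sum_after"
```

The link count and ctime should differ between the two readings while the 
mtime and checksum stay the same, which is the situation a ctime-based check 
would report as a changed file even though the data is untouched.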

My guess is that this is what's happening here.  Turning off compaction and not 
taking any further snapshots while the tar is running would probably also 
work around it.

> sstables changing in snapshots
> ------------------------------
>
>                 Key: CASSANDRA-17473
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17473
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: James Brown
>            Priority: Normal
>
> We use cassandra snapshots and tar to make full backups of our cassandra 
> clusters. Sometimes, tar fails with a message like
> {{tar: 
> data/addresses/addresses-eb0196100b7d11ec852b1541747d640a/snapshots/backup20220318183708/nb-167-big-Data.db:
>  file changed as we read it}}
> This is kind of strange, since we're reading from a snapshot.
> The (very simplified) relevant snippet looks roughly like
> {code:java}
> nice nodetool "${JMX_ARGS[@]}" snapshot -t "$TAG" "${KEYSPACES[@]}"
> tar --hard-dereference -czpf data///snapshots/"$TAG"/{code}
> This happens maybe 1% of the time when taking backups.
> There are no concurrent snapshots going on, but there are concurrent 
> compactions and repairs, of course. If it matters, this cluster _is_ running 
> incremental repairs.
> This is on Cassandra 4.0.3.
> It seems wrong to me that an sstable could ever be written to while it's in a 
> snapshot.



