Re: Snapshot SSTable modified??

2018-05-30 Thread Max C.
Oh, thanks Elliott for the explanation!  I had no idea about that little tidbit 
concerning ctime.   Now it all makes sense!

- Max




Re: Snapshot SSTable modified??

2018-05-28 Thread Elliott Sims
Unix timestamps are a bit odd.  "mtime/Modify" tracks changes to the file's
contents, "ctime/Change" (sometimes mistakenly called "create") tracks
changes to the file's metadata, and a link count change is a metadata
change.  Flagging that seems like an odd decision on the part of GNU tar,
but presumably there's a good reason for it.

When the original sstable is compacted away, it's removed and therefore the
link count on the snapshot file is decremented.  The file's contents
haven't changed so mtime is identical, but ctime does get updated.  BSDtar
doesn't seem to interpret link count changes as a file change, so it's
pretty effective as a workaround.
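
The effect described above can be reproduced without Cassandra. This is a
minimal sketch, assuming GNU coreutils `stat -c` (as on RHEL); the file names
are made up, and a plain `rm` stands in for compaction removing the live
sstable.

```shell
#!/bin/sh
# Sketch: removing one hard link changes ctime and link count, not mtime.
# Assumes GNU coreutils `stat -c`; all file names here are hypothetical.

# Print "<link count> <mtime> <ctime>" for a file (times in epoch seconds).
link_stats() {
    stat -c '%h %Y %Z' "$1"
}

demo_dir=$(mktemp -d)
echo "sstable bytes" > "$demo_dir/original-Data.db"
# A snapshot file is a hard link to the same inode as the live sstable.
ln "$demo_dir/original-Data.db" "$demo_dir/snapshot-Data.db"
before=$(link_stats "$demo_dir/snapshot-Data.db")

sleep 1   # make the ctime change visible at one-second resolution
# Compaction deleting the live sstable is modelled by a plain rm.
rm "$demo_dir/original-Data.db"
after=$(link_stats "$demo_dir/snapshot-Data.db")

echo "before: $before"
echo "after:  $after"
rm -rf "$demo_dir"
```

The link count should drop from 2 to 1 and the third field (ctime) should
advance, while the second field (mtime) stays the same.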





Re: Snapshot SSTable modified??

2018-05-25 Thread Max C
I looked at the source code for GNU tar, and it looks for a change in the 
create time or (more likely) a change in the size.

This seems very strange to me — I would think that creating a snapshot would 
cause a flush and then once the SSTables are written, hardlinks would be 
created and the SSTables wouldn't be written to after that.

Our solution is to wait 5 minutes and retry the tar if an error occurs.  This 
isn't ideal - but it's the best I could come up with.  :-/
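
A wait-and-retry wrapper along these lines might look like the sketch below.
The function name `retry_cmd` and its arguments are illustrative assumptions,
not the script actually used here.

```shell
#!/bin/sh
# Sketch of "wait and retry": run a command, retrying after a delay on failure.
# retry_cmd and its argument order are hypothetical, chosen for illustration.

# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_cmd() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Run the command; stop as soon as it succeeds.
        "$@" && return 0
        status=$?
        # Give up once the attempt budget is spent, keeping the exit status.
        if [ "$attempt" -ge "$max_attempts" ]; then
            return "$status"
        fi
        echo "attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For the backup above this would be called as something like
`retry_cmd 3 300 tar -czf "$ARCHIVE" schema.cql "$SNAPSHOT_DIR"`, where
`ARCHIVE` and `SNAPSHOT_DIR` are placeholders.  GNU tar documents exit status
1 for files that changed while being archived and 2 for fatal errors, so a
stricter version could retry only on status 1.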

Thanks Jeff & others for your responses.

- Max




Re: Snapshot SSTable modified??

2018-05-25 Thread Elliott Sims
I've run across this problem before - it seems like GNU tar interprets
changes in the link count as changes to the file, so if the file gets
compacted mid-backup it freaks out even if the file contents are
unchanged.  I worked around it by just using bsdtar instead.
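
As a rough sketch of that swap, a backup script could prefer bsdtar when it is
installed and fall back to tar otherwise.  The helper name `pick_archiver` and
the file names are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: prefer bsdtar when it is installed, otherwise fall back to tar.
# pick_archiver and the file names below are hypothetical.

pick_archiver() {
    if command -v bsdtar >/dev/null 2>&1; then
        echo bsdtar
    else
        echo tar
    fi
}

ARCHIVER=$(pick_archiver)
echo "using archiver: $ARCHIVER"

# Both tools accept the same -c/-t/-z/-f/-C options used here.
work_dir=$(mktemp -d)
echo "placeholder" > "$work_dir/mb-1-big-Data.db"
"$ARCHIVER" -czf "$work_dir/backup.tgz" -C "$work_dir" mb-1-big-Data.db
listing=$("$ARCHIVER" -tzf "$work_dir/backup.tgz")
echo "archive contains: $listing"
rm -rf "$work_dir"
```

Note that the fallback to GNU tar keeps the original warning behaviour, so a
script relying on this workaround may prefer to fail when bsdtar is missing.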



Re: Snapshot SSTable modified??

2018-05-24 Thread Nitan Kainth
Jeff,

Shouldn't a snapshot get a consistent state of the sstables? A -tmp file
shouldn't impact the backup operation, right?


Regards,
Nitan K.
Cassandra and Oracle Architect/SME
Datastax Certified Cassandra expert
Oracle 10g Certified



Re: Snapshot SSTable modified??

2018-05-23 Thread Jeff Jirsa
In versions before 3.0, sstables were written with a -tmp filename and
copied/moved to the final filename when complete. This changed in 3.0 - we
write into the file with the final name, and have a journal/log to let us
know when it's done/final/live.

Therefore, you can no longer just watch for a -Data.db file to be created
and uploaded - you have to watch the log to make sure it's not being
written.
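
One way to watch for that log is sketched below.  It assumes the 3.0
transaction logs live in the table directory with names like
`mb_txn_compaction_<id>.log`; that naming is an assumption to verify against
your version, and `has_pending_txn` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: treat a table directory as "busy" while a transaction log exists.
# Assumption: logs are named like mb_txn_compaction_<id>.log; verify this on
# your Cassandra version before relying on it.

# Returns 0 if DIR contains at least one *_txn_*.log file, 1 otherwise.
has_pending_txn() {
    for f in "$1"/*_txn_*.log; do
        # An unmatched glob stays literal, so check that the file exists.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

A watcher script could then skip or delay uploading `-Data.db` files from a
directory while `has_pending_txn "$TABLE_DIR"` succeeds, with `TABLE_DIR`
being a placeholder for the table's data directory.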




Snapshot SSTable modified??

2018-05-23 Thread Max C.
Hi Everyone,

We’ve noticed a few times in the last few weeks that when we’re doing backups, 
tar has complained with messages like this:

tar: /var/lib/cassandra/data/mars/test_instances_by_test_id-6a9440a04cc111e8878675f1041d7e1c/snapshots/backup_20180523_024502/mb-63-big-Data.db: file changed as we read it

Any idea what might be causing this?

We’re running Cassandra 3.0.8 on RHEL 7.  Here’s rough pseudocode of our backup 
process:


SNAPSHOT_NAME=backup_YYYYMMDD_HHMMSS
nodetool snapshot -t $SNAPSHOT_NAME

for each keyspace
- dump schema to "schema.cql"
- tar -czf /file_server/backup_${HOSTNAME}_${KEYSPACE}_MMDD_HHMMSS.tgz \
      schema.cql /var/lib/cassandra/data/$KEYSPACE/*/snapshots/$SNAPSHOT_NAME

nodetool clearsnapshot -t $SNAPSHOT_NAME

Thanks.

- Max
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org