Re: Restore Snapshot

2017-06-28 Thread kurt greaves
Hm, I did recall seeing a ticket for this particular use case, which is
certainly useful; I just didn't think it had been implemented yet. Turns
out it's been in since 2.0.7, so you should be receiving writes with
join_ring=false. If you confirm you aren't receiving writes, then we have an
issue. https://issues.apache.org/jira/browse/CASSANDRA-6961
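
(A rough way to verify that behaviour on a test node, as a sketch only: the
keyspace/table name my_ks.my_table is a placeholder, and where the JVM option
is set depends on your packaging, e.g. appended in cassandra-env.sh.)

# Start the node outside the ring:
JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"

# Drive some writes through another node in the cluster, then on this node:
nodetool cfstats my_ks.my_table   # the Write Count should be increasing
nodetool flush my_ks my_table     # flushed sstables should appear on disk

# Once satisfied (or once repair is done), join the ring:
nodetool join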


Re: Restore Snapshot

2017-06-28 Thread Anuj Wadehra
Thanks Kurt. 

I think the main scenario which MUST be addressed by snapshots is backup/restore, 
so that a node can be restored in minimal time and the lengthy procedure of 
bootstrapping with join_ring=false followed by a full repair can be avoided. The 
plain "restore snapshot + repair" scenario seems to be broken; the situation is 
less critical when you use join_ring=false.
Changing the consistency level to ALL is not an optimal solution or workaround 
because it may impact performance. Moreover, it is an unreasonable and unstated 
assumption that Cassandra users can dynamically change the CL and then revert it 
after the repair.
The ideal restore process should be (sketched as shell commands below):
1. Restore the snapshot.
2. Start the node with join_ring=false.
3. Cassandra should ACCEPT writes in this phase, just like bootstrap with 
join_ring=false.
4. Repair the node.
5. Join the node.
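
(A shell-level sketch of that sequence; the keyspace, table and snapshot names
are placeholders, and data paths depend on your version and install.)

KS=my_ks; CF=my_table; SNAP=my_snapshot
DATA=/var/lib/cassandra/data/$KS/$CF

# 1. Restore the snapshot (node stopped): hard-link the snapshot sstables back.
ln $DATA/snapshots/$SNAP/* $DATA/

# 2. Start the node outside the ring, e.g. via cassandra-env.sh:
#      JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"
# 3. Per CASSANDRA-6961 the node should keep accepting writes for its token
#    ranges while in this state.

# 4. Repair the node before it serves reads:
nodetool repair $KS

# 5. Join the ring (and remove the join_ring option for future restarts):
nodetool join
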
Point 3 seems to be missing from the current implementation of join_ring. At 
step 5, when the node joins the ring, it will NOT cause inconsistent reads for 
the data written after the snapshot was taken and before it was restored, since 
the repair has made that data consistent on all nodes. BUT the node has now 
missed the updates made while the repair was running. So the full repair didn't 
sync the entire data set: it fixed the old inconsistencies and prevented 
inconsistent reads for them, yet led to NEW inconsistencies. You need another 
full repair on the node :(
I will conduct a test to be 100% sure that join_ring is not accepting writes, 
and if I get the same results I will create a JIRA.
We are updating the file system on the nodes, one node at a time, to avoid 
downtime. A snapshot cuts down on excessive streaming and the lengthy procedure 
(bootstrap + repair), so we were evaluating snapshot restore as an option.

Thanks,
Anuj


Re: Restore Snapshot

2017-06-28 Thread kurt greaves
There are many scenarios where it can be useful, but to address what seems
to be your main concern: you could simply restore and then only read at ALL
until your repair completes.

If you use snapshot restore with commitlog archiving you're in a better
state, but granted the case you described can still occur. To some extent,
if you have to restore a snapshot you will have to perform some kind of
repair. It's not really possible to restore to an older point and expect
strong consistency.
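
(For reference, commitlog archiving is configured in
conf/commitlog_archiving.properties; an illustrative sketch, with a placeholder
backup path and restore point:)

archive_command=/bin/ln %path /backup/commitlog/%name
restore_command=cp -f %from %to
restore_directories=/backup/commitlog
# Replay the archived segments only up to just before the data went bad:
restore_point_in_time=2017:06:27 23:59:00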

Snapshots are also useful for creating a clone of a cluster/node.

But really, why are you only restoring a snapshot on one node? If you lost
all the data, it would be much simpler to just replace the node.
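
(The "replace the node" path is normally done with the replace_address startup
option; a minimal sketch with a placeholder IP, set before the replacement
node's first start:)

JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.12"
# On older versions the property is -Dcassandra.replace_address=<dead node IP>.
# The replacement node then streams the dead node's data and takes over its tokens.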
​


RE: RE Restore snapshot

2012-08-14 Thread mdione.ext
From: mdione@orange.com [mailto:mdione@orange.com]
   In particular, I'm thinking of a restore like this:
 
 * the app does something stupid.
 * (if possible) I stop writes to the KS or CF.

  In fact, given that I'm about to restore the KS/CF to an old state, I can 
safely do this:

 * drop and recreate the schema.
 * remove the db files with the wrong data.
 * put the db files from the snapshot in place.
 * nodetool refresh
 * profit!

  This works perfectly.
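
(A per-node sketch of that sequence, using the paths and keyspace/CF names from
the transcript further down this thread; the schema step is only noted as a
comment because the exact commands depend on the client and version.)

DATA=/var/opt/hosting/db/cassandra/data/one_cf/cf_1
SNAP=$DATA/snapshots/1344609231532

# 1. Drop and recreate the keyspace/CF from its saved definition.
# 2. Remove the sstables holding the bad data:
rm -v $DATA/*.db
# 3. Hard-link the snapshot sstables back into the live directory:
ln -v $SNAP/* $DATA/
# 4. Pick up the restored sstables without a restart:
nodetool refresh one_cf cf_1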



RE: RE Restore snapshot

2012-08-13 Thread mdione.ext
 From: Sylvain Lebresne [mailto:sylv...@datastax.com]
  2) copy the snapshot sstable in the right place and call the JMX
  method loadNewSSTables() (in the column family MBean, which means you
  need to do that per-CF).
 
   How does this affect the contents of the CommitLogs? I mean, I
 imagine using this for rolling back errors on the app side, like
 some/many/all objects deleted. How will updates/deletes saved in
 CommitLogs impact the data restored this way? Also: while I'm restoring
 a KS or CF, is it possible to cut write requests but not reads? I hope
 I'm explaining myself clearly...

  In particular, I'm thinking of a restore like this:

* the app does something stupid.
* (if possible) I stop writes to the KS or CF.
* remove the db files with the wrong data.
* I put the db files from the restore.
* nodetool refresh
* profit!

  But on Friday I did it slightly differently (I was tired and confused): 
I only did the refresh on one node and then a repair on the same node, and 
it failed like this:

* state just after the app did something stupid:

root@pnscassandra04:~# tree /var/opt/hosting/db/cassandra/data/one_cf/cf_1/
/var/opt/hosting/db/cassandra/data/one_cf/cf_1/
|-- one_cf-cf_1-hd-13-CompressionInfo.db
|-- one_cf-cf_1-hd-13-Data.db
|-- one_cf-cf_1-hd-13-Filter.db
|-- one_cf-cf_1-hd-13-Index.db
|-- one_cf-cf_1-hd-13-Statistics.db
|-- one_cf-cf_1-hd-14-CompressionInfo.db
|-- one_cf-cf_1-hd-14-Data.db
|-- one_cf-cf_1-hd-14-Filter.db
|-- one_cf-cf_1-hd-14-Index.db
|-- one_cf-cf_1-hd-14-Statistics.db
|-- one_cf-cf_1-hd-15-CompressionInfo.db
|-- one_cf-cf_1-hd-15-Data.db
|-- one_cf-cf_1-hd-15-Filter.db
|-- one_cf-cf_1-hd-15-Index.db
|-- one_cf-cf_1-hd-15-Statistics.db
`-- snapshots
    `-- 1344609231532
        |-- one_cf-cf_1-hd-10-CompressionInfo.db
        |-- one_cf-cf_1-hd-10-Data.db
        |-- one_cf-cf_1-hd-10-Filter.db
        |-- one_cf-cf_1-hd-10-Index.db
        |-- one_cf-cf_1-hd-10-Statistics.db
        |-- one_cf-cf_1-hd-11-CompressionInfo.db
        |-- one_cf-cf_1-hd-11-Data.db
        |-- one_cf-cf_1-hd-11-Filter.db
        |-- one_cf-cf_1-hd-11-Index.db
        |-- one_cf-cf_1-hd-11-Statistics.db
        |-- one_cf-cf_1-hd-12-CompressionInfo.db
        |-- one_cf-cf_1-hd-12-Data.db
        |-- one_cf-cf_1-hd-12-Filter.db
        |-- one_cf-cf_1-hd-12-Index.db
        |-- one_cf-cf_1-hd-12-Statistics.db
        |-- one_cf-cf_1-hd-9-CompressionInfo.db
        |-- one_cf-cf_1-hd-9-Data.db
        |-- one_cf-cf_1-hd-9-Filter.db
        |-- one_cf-cf_1-hd-9-Index.db
        `-- one_cf-cf_1-hd-9-Statistics.db

* I remove the 'wrong' databases

root@pnscassandra04:~# rm -v /var/opt/hosting/db/cassandra/data/one_cf/cf_1/*.db
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-CompressionInfo.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Data.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Filter.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Index.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Statistics.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-14-CompressionInfo.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-14-Data.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-14-Filter.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-14-Index.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-14-Statistics.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-15-CompressionInfo.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-15-Data.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-15-Filter.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-15-Index.db'
removed `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-15-Statistics.db'

* restore

root@pnscassandra04:~# for i in /var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532/*; do ln -v $i /var/opt/hosting/db/cassandra/data/one_cf/cf_1/; done
`/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-10-CompressionInfo.db' => `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532/one_cf-cf_1-hd-10-CompressionInfo.db'
`/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-10-Data.db' => `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532/one_cf-cf_1-hd-10-Data.db'
`/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-10-Filter.db' => `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532/one_cf-cf_1-hd-10-Filter.db'
`/var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-10-Index.db' => `/var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532/one_cf-cf_1-hd-10-Index.db'

RE: RE Restore snapshot

2012-08-13 Thread mdione.ext
From: mdione@orange.com [mailto:mdione@orange.com]
  From: Sylvain Lebresne [mailto:sylv...@datastax.com]
   2) copy the snapshot sstable in the right place and call the JMX
   method loadNewSSTables() (in the column family MBean, which means you
   need to do that per-CF).
 
  How does this affect the contents of the CommitLogs? I mean, I
  imagine using this for rolling back errors on the app side, like
  some/many/all objects deleted. How will updates/deletes saved in
  CommitLogs impact the data restored this way? Also: while I'm
  restoring a KS or CF, is it possible to cut write requests but not
  reads? I hope I'm explaining myself clearly...
 
   In particular, I'm thinking of a restore like this:
 
 * the app does something stupid.
 * (if possible) I stop writes to the KS or CF.
 * remove the db files with the wrong data.
 * I put the db files from the restore.
 * nodetool refresh
 * profit!
 [...]
 * and here's the error at repair time:
 
 root@pnscassandra04:~# nodetool repair one_cf cf_1
 
 INFO [RMI TCP Connection(56)-10.234.169.244] 2012-08-10 17:03:12,354 StorageService.java (line 1925) Starting repair command #3, repairing 5 ranges.
 INFO [AntiEntropySessions:5] 2012-08-10 17:03:12,358 AntiEntropyService.java (line 666) [repair #811be330-e2fc-11e1--b64817724cbd] new session: will sync pnscassandra04.vdev.bas.s1.p.fti.net/10.234.169.244, /10.234.169.245, /10.234.169.241, /10.234.169.242, /10.234.169.243 on range (34028236692093846346337460743176821145,68056473384187692692674921486353642291] for one_cf.[cf_1]
 INFO [AntiEntropySessions:5] 2012-08-10 17:03:12,359 AntiEntropyService.java (line 871) [repair #811be330-e2fc-11e1--b64817724cbd] requesting merkle trees for cf_1 (to [/10.234.169.245, /10.234.169.241, /10.234.169.242, /10.234.169.243, pnscassandra04.vdev.bas.s1.p.fti.net/10.234.169.244])
 ERROR [ValidationExecutor:2] 2012-08-10 17:03:12,361 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ValidationExecutor:2,1,main]
 java.io.IOError: java.io.FileNotFoundException: /var/opt/hosting/db/cassandra/data/one_cf/cf_1/one_cf-cf_1-hd-13-Data.db (No such file or directory) [...]
 
   I understand that this procedure (refresh+repair on one node) makes
 absolutely no sense, but the error leads me to think that after the
 refresh, the 'wrong' databases are still taken into account. Is the
 whole procedure bad from the start, and should I use SSTableLoader
 instead?

  OK, I just did the refresh, without the repair, on all the nodes at the 
same time, and the error is still there. Should I drop the CF or KS before 
doing this kind of restore?

--
Marcos Dione
SysAdmin
Astek Sud-Est
pour FT/TGPF/OPF/PORTAIL/DOP/HEBEX @ Marco Polo
04 97 12 62 45 - mdione@orange.com




RE: RE Restore snapshot

2012-08-09 Thread mdione.ext
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
 2) copy the snapshot sstable in the right place and call the JMX method
 loadNewSSTables() (in the column family MBean, which means you need to
 do that per-CF).

How does this affect the contents of the CommitLogs? I mean, 
I imagine using this for rolling back errors on the app side, like 
some/many/all objects deleted. How will updates/deletes saved in 
CommitLogs impact the data restored this way? Also: while I'm 
restoring a KS or CF, is it possible to cut write requests but 
not reads? I hope I'm explaining myself clearly...

--
Marcos Dione
SysAdmin
Astek Sud-Est
pour FT/TGPF/OPF/PORTAIL/DOP/HEBEX @ Marco Polo
04 97 12 62 45 - mdione@orange.com




Re: RE Restore snapshot

2012-08-07 Thread Jonathan Ellis
Yes.

On Thu, Aug 2, 2012 at 5:33 AM, Radim Kolar h...@filez.com wrote:

 1) I assume that I have to call the loadNewSSTables() on each node?

 Is this the same as nodetool refresh?



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
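
(In other words, the restore can be driven entirely from the command line; a
minimal example using the keyspace/CF names from Marcos' transcript elsewhere
in this thread:)

# After copying or hard-linking the snapshot sstables back into the data
# directory, load them without a restart (same effect as the
# loadNewSSTables() MBean call), on each node:
nodetool refresh one_cf cf_1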


Re: RE Restore snapshot

2012-08-03 Thread Tyler Hobbs
On Thu, Aug 2, 2012 at 6:14 AM, Romain HARDOUIN
romain.hardo...@urssaf.fr wrote:


 Then http://www.datastax.com/docs/1.1/operations/backup_restore should
 mention it  :-)


I opened a ticket with our docs team to cover that. Thanks!

-- 
Tyler Hobbs
DataStax http://datastax.com/


RE Restore snapshot

2012-08-02 Thread Romain HARDOUIN
No, it's not possible.

Desimpel, Ignace ignace.desim...@nuance.com wrote on 01/08/2012 
14:58:49:

 Hi,
 
 Is it possible to restore a snapshot of a keyspace on a live 
 cassandra cluster (I mean without restarting)? 
 

Re: RE Restore snapshot

2012-08-02 Thread Sylvain Lebresne
Actually that's wrong, it is perfectly possible to restore a snapshot
on a live Cassandra cluster.
There are basically two solutions:
1) use the sstableloader (http://www.datastax.com/dev/blog/bulk-loading)
2) copy the snapshot sstables to the right place and call the JMX
method loadNewSSTables() (in the column family MBean, which means you
need to do that per-CF).
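
(As an illustration of option 1, a minimal sstableloader invocation; the staging
path and contact host are placeholders, and older 1.x builds take slightly
different arguments. The sstables must sit in a directory laid out as
<keyspace>/<column family>.)

SNAPDIR=/var/opt/hosting/db/cassandra/data/one_cf/cf_1/snapshots/1344609231532  # example path
mkdir -p /tmp/restore/one_cf/cf_1
cp "$SNAPDIR"/* /tmp/restore/one_cf/cf_1/

# Stream the staged sstables into the live cluster via any reachable node:
sstableloader -d 10.0.0.11 /tmp/restore/one_cf/cf_1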

--
Sylvain

On Thu, Aug 2, 2012 at 9:16 AM, Romain HARDOUIN
romain.hardo...@urssaf.fr wrote:

 No, it's not possible.

 Desimpel, Ignace ignace.desim...@nuance.com wrote on 01/08/2012
 14:58:49:

 Hi,

 Is it possible to restore a snapshot of a keyspace on a live
 cassandra cluster (I mean without restarting)?



RE: RE Restore snapshot

2012-08-02 Thread Desimpel, Ignace
Great! I will use the hardlinks to 'restore' the data files on each node (super 
fast)!
I have some related questions:

1) I assume that I have to call the loadNewSSTables() on each node?

2) To be on the safe side, I guess I'd better drop the existing keyspace and then 
recreate it using the definition at the time of the snapshot. But is it allowed 
to copy the 'old' data files after that, with respect to the new internal ids 
versus the ids maintained (if any) in the data files?

3) From a quick look at the code (I took 1.0.5): is it possible that Table.open 
also calls initCaches on the CFs, but loadNewSSTables does not?

4) As a solution to 3): I'm working with embedded Cassandra servers, so I 
think it would be possible for me to do the following:
  * Drop KS x if present
  * Create KS x from the old definition
  * On each node:
  *** Table.clear(x)
  *** Delete any remaining files in the directory x
  *** Restore the data files from the snapshot for KS x
  *** Table.open(x);

-Original Message-
From: Sylvain Lebresne [mailto:sylv...@datastax.com] 
Sent: Thursday, 2 August 2012 11:46
To: user@cassandra.apache.org
Subject: Re: RE Restore snapshot

Actually that's wrong, it is perfectly possible to restore a snapshot on a live 
Cassandra cluster.
There are basically two solutions:
1) use the sstableloader (http://www.datastax.com/dev/blog/bulk-loading)
2) copy the snapshot sstables to the right place and call the JMX method 
loadNewSSTables() (in the column family MBean, which means you need to do that 
per-CF).

--
Sylvain

On Thu, Aug 2, 2012 at 9:16 AM, Romain HARDOUIN romain.hardo...@urssaf.fr 
wrote:

 No, it's not possible.

 Desimpel, Ignace ignace.desim...@nuance.com wrote on 01/08/2012
 14:58:49:

 Hi,

 Is it possible to restore a snapshot of a keyspace on a live 
 cassandra cluster (I mean without restarting)?



Re: RE Restore snapshot

2012-08-02 Thread Radim Kolar



1) I assume that I have to call the loadNewSSTables() on each node?

Is this the same as nodetool refresh?


Re: RE Restore snapshot

2012-08-02 Thread Romain HARDOUIN
Then http://www.datastax.com/docs/1.1/operations/backup_restore should 
mention it  :-)

Sylvain Lebresne sylv...@datastax.com wrote on 02/08/2012 11:45:46:

 Actually that's wrong, it is perfectly possible to restore a snapshot
 on a live Cassandra cluster.
 There are basically two solutions:
 1) use the sstableloader (http://www.datastax.com/dev/blog/bulk-loading)
 2) copy the snapshot sstables to the right place and call the JMX
 method loadNewSSTables() (in the column family MBean, which means you
 need to do that per-CF).
 
 --
 Sylvain