Re: Manually deleting sstables

2014-08-21 Thread Robert Wille
 
 2) Are there any other recommended procedures for this?

0) stop writes to columnfamily
1) TRUNCATE columnfamily;
2) nodetool clearsnapshot # on the snapshot that results
3) DROP columnfamily;
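
Spelled out as a minimal sketch (keyspace ks and column family cf are
placeholders, and clearsnapshot has to be run on every node):

    -- in cqlsh; with auto_snapshot enabled this leaves a snapshot behind
    TRUNCATE ks.cf;

    # from a shell on each node, clear the snapshots for that keyspace
    nodetool clearsnapshot ks

    -- back in cqlsh
    DROP TABLE ks.cf;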

My two cents here is that this process is extremely difficult to automate,
making testing that involves dropping column families very difficult.

Robert





Re: Manually deleting sstables

2014-08-21 Thread Tyler Hobbs
You can always set auto_snapshot: false in cassandra.yaml for testing
environments.
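
That is a one-line change in cassandra.yaml (shown as a sketch; the node has
to be restarted to pick it up):

    # cassandra.yaml: skip the automatic snapshot normally taken on TRUNCATE or DROP
    auto_snapshot: false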


On Thu, Aug 21, 2014 at 9:57 AM, Robert Wille rwi...@fold3.com wrote:


 2)  Are there any other recommended procedures for this?

 0) stop writes to columnfamily
 1) TRUNCATE columnfamily;
 2) nodetool clearsnapshot # on the snapshot that results
 3) DROP columnfamily;

 My two cents here is that this process is extremely difficult to automate,
 making testing that involves dropping column families very difficult.

 Robert




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Latest 2.1 and Datastax driver questions...

2014-08-21 Thread Tyler Hobbs
On Wed, Aug 20, 2014 at 3:44 PM, Tony Anecito adanec...@yahoo.com wrote:


 Will there be a datastax CQL driver available then?


The Python and C# drivers already have 2.1.0 versions available.  The Java
driver has a 2.1-rc release, and should have a 2.1.0 final release soon.


-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Latest 2.1 and Datastax driver questions...

2014-08-21 Thread Tony Anecito
Thanks Tyler that is good to know.

-Tony



On Thursday, August 21, 2014 10:20 AM, Tyler Hobbs ty...@datastax.com wrote:
 




On Wed, Aug 20, 2014 at 3:44 PM, Tony Anecito adanec...@yahoo.com wrote:



Will there be a datastax CQL driver available then?

The Python and C# drivers already have 2.1.0 versions available.  The Java 
driver has a 2.1-rc release, and should have a 2.1.0 final release soon.



-- 
Tyler Hobbs
DataStax

Re: Latest 2.1 and Datastax driver questions...

2014-08-21 Thread Robert Coli
On Wed, Aug 20, 2014 at 7:19 PM, Benedict Elliott Smith 
belliottsm...@datastax.com wrote:

 The "run versions >= x.x.6" rule is IMO an out-of-date trope. Not only does 2.1
 have more than twice as many engineers working full time on it, and five
 times the QA engineers (which is cumulative with prior QA), but we are also
 seeing many more users in the wild running release candidates and providing
 valuable feedback before release. This is evidenced by the fact that there have
 been six release candidates, instead of just two for 2.0.


While I agree that Datastax has recently been dedicating meaningful resources to
the QA side of Cassandra, I do not feel that it is controversial to say
that the QA of Cassandra as it relates to production operability has
historically been lacking. I feel much the same way about Cassandra QA that
Gandhi apocryphally felt about Western Civilization... I think it would be
a good idea.

Based on user reports and associated JIRA, the 2.0 series has been the
least stable series of Cassandra since 0.8 or so. It's wonderful that a
pre-release version has had slightly more testing than previous versions,
but IMO the proof is in the pudding. When there is an actual released
series of Cassandra in which I can recommend running a version under x.y.6
[1], I will be sure to stop linking that trope.

=Rob
[1] (x.y.~8 for 2.0...)


RE: Delete By Partition Key Implementation

2014-08-21 Thread Modha, Digant
If you delete an entire row, do the records in the row still count toward the
tombstone threshold that triggers TombstoneOverwhelmingException?  It seems like they still do.

From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: Saturday, August 09, 2014 3:21 AM
To: user@cassandra.apache.org
Subject: Re: Delete By Partition Key Implementation


Thanks Graham for the hints.

I've dug into the source code and found these 2 classes:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/DeletionInfo.java

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/RangeTombstoneList.java

 They are quite self-explanatory.
A deletion of an entire row is a single row tombstone, and yes, there are also
range tombstones for marking the deletion of a range of columns.
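
To make that concrete, a small CQL sketch (ks.events, with partition key id and
clustering column ts, is just a made-up example table):

    -- whole-partition delete: a single partition-level (row key) tombstone
    DELETE FROM ks.events WHERE id = 42;

    -- whole CQL row delete: a range tombstone covering that clustering prefix
    DELETE FROM ks.events WHERE id = 42 AND ts = '2014-08-21 00:00:00';

    -- single-column delete: an individual cell tombstone
    DELETE payload FROM ks.events WHERE id = 42 AND ts = '2014-08-21 00:00:00';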

On Aug 8, 2014, at 2:17 PM, Kevin Burton bur...@spinn3r.com wrote:


This is a good question. I'd love to find out the answer. Seems like a
tombstone with prefixes for the keys would work well.

Also, can't any key prefixes work in theory?

On Thu, Aug 7, 2014 at 8:33 AM, DuyHai Doan doanduy...@gmail.com wrote:
Hello all

 Usually, when using DELETE in CQL3 on some fields, C* creates tombstone
columns for those fields.

 Now, if I delete a whole PARTITION (delete from MyTable where
partitionKey=...), what will C* do? Will it create as many tombstones as there
are physical columns in this partition, or will it just mark the partition as
deleted (row key deletion marker)?

 On a side note, if I insert a bunch of physical columns in one partition with
the SAME ttl value, after a while they will all appear as expired. Would C* need
to scan the whole partition on disk to see which columns to expire, or could it
see that the whole partition is expired thanks to metadata / partition key cache
kept in memory? I was thinking about the estimated histograms for TTL, but I
don't know in detail how they work.

 Regards

 Duy Hai  DOAN




--

Founder/CEO Spinn3r.com (http://spinn3r.com/)
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile: https://plus.google.com/102718274791889610666/posts





stalled nodetool repair?

2014-08-21 Thread Kevin Burton
How do I watch the progress of nodetool repair?

Looks like the folklore from the list says to just use

nodetool compactionstats
nodetool netstats

… but the repair seems locked/stalled and neither of these is showing any
progress.

granted, this is a lot of data, but it would be nice to at least see some
progress.

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com


Re: stalled nodetool repair?

2014-08-21 Thread Aiman Parvaiz
If nodetool compactionstats says there are no Validation compactions
running (and the compaction queue is empty) and netstats says there is
nothing streaming, there is a good chance the repair is finished or dead.
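
A rough sketch of that check from the shell (output formats differ a bit
between versions, and the log path below is the default for package installs):

    # Merkle tree building shows up as 'Validation' compaction tasks
    nodetool compactionstats

    # active repair streaming sessions show up here
    nodetool netstats

    # the system log is usually the most reliable progress/completion indicator
    grep -i repair /var/log/cassandra/system.log | tail -20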

Source:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-it-safe-to-stop-a-read-repair-and-any-suggestion-on-speeding-up-repairs-td6607367.html

You might find this helpful.

Thanks


On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com wrote:

 How do I watch the progress of nodetool repair.

 Looks like the folklore from the list says to just use

 nodetool compactionstats
 nodetool netstats

 … but the repair seems locked/stalled and neither of these are showing any
 progress..

 granted , this is a lot of data, but it would be nice to at least see some
 progress.

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com




Re: stalled nodetool repair?

2014-08-21 Thread Robert Coli
On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com wrote:

 How do I watch the progress of nodetool repair.


This is a very longstanding operational problem in Cassandra. Repair barely
works and is opaque, yet one is expected to run it once a week in the
default configuration.

An unreasonably-hostile-in-tone-but-otherwise-accurate description of the
status quo before the rewrite of streaming in 2.0:

https://issues.apache.org/jira/browse/CASSANDRA-5396

A proposal to change the default for gc_grace_seconds to 34 days, so that
this fragile and heavyweight operation only has to be done once a month:

https://issues.apache.org/jira/browse/CASSANDRA-5850
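
In the meantime, gc_grace_seconds can already be raised per table; a minimal
cqlsh sketch (ks.cf is a placeholder, and 34 days is 2937600 seconds):

    -- repair then only has to complete once within each 34-day window
    ALTER TABLE ks.cf WITH gc_grace_seconds = 2937600;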


 granted , this is a lot of data, but it would be nice to at least see some
 progress.


Here's the rewrite of streaming, where progress indication improves
dramatically over the prior status quo:

https://issues.apache.org/jira/browse/CASSANDRA-5286

And here are two open tickets on making repair less opaque (thx yukim@#cassandra):

https://issues.apache.org/jira/browse/CASSANDRA-5483
https://issues.apache.org/jira/browse/CASSANDRA-5839

=Rob


Re: stalled nodetool repair?

2014-08-21 Thread Ben Bromhead
https://github.com/mstump/cassandra_range_repair

Also very useful. 
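
For context, the script just walks the ring and issues subrange repairs; a
hand-rolled sketch with plain nodetool (keyspace name and token values below
are made up):

    # repair only this node's primary range, one keyspace at a time
    nodetool repair -pr my_keyspace

    # or repair an explicit token subrange, which is what the script automates
    nodetool repair -st -9223372036854775808 -et -4611686018427387904 my_keyspace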

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359




On 22/08/2014, at 6:12 AM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com wrote:
 How do I watch the progress of nodetool repair.
 
 This is a very longstanding operational problem in Cassandra. Repair barely 
 works and is opaque, yet one is expected to run it once a week in the default 
 configuration.
 
 An unreasonably-hostile-in-tone-but-otherwise-accurate description of the 
 status quo before the re-write of streaming in 2.0 :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5396
 
 A proposal to change the default for gc_grace_seconds to 34 days, so that 
 this fragile and heavyweight operation only has to be done once a month :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5850
  
 granted , this is a lot of data, but it would be nice to at least see some 
 progress.
 
 Here's the rewrite of streaming, where progress indication improves 
 dramatically over the prior status quo :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5286
 
 And here's two open tickets on making repair less opaque (thx 
 yukim@#cassandra) :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5483
 https://issues.apache.org/jira/browse/CASSANDRA-5839
 
 =Rob
 
 



Re: stalled nodetool repair?

2014-08-21 Thread DuyHai Doan
Thanks Ben for the link. Still, this script does not work with vnodes, which
excludes a wide range of C* configs.


On Thu, Aug 21, 2014 at 5:51 PM, Ben Bromhead b...@instaclustr.com wrote:

 https://github.com/mstump/cassandra_range_repair

 Also very useful.

 Ben Bromhead
 Instaclustr | www.instaclustr.com | @instaclustr
 http://twitter.com/instaclustr | +61 415 936 359




 On 22/08/2014, at 6:12 AM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com wrote:

 How do I watch the progress of nodetool repair.


 This is a very longstanding operational problem in Cassandra. Repair
 barely works and is opaque, yet one is expected to run it once a week in
 the default configuration.

 An unreasonably-hostile-in-tone-but-otherwise-accurate description of the
 status quo before the re-write of streaming in 2.0 :

 https://issues.apache.org/jira/browse/CASSANDRA-5396

 A proposal to change the default for gc_grace_seconds to 34 days, so that
 this fragile and heavyweight operation only has to be done once a month :

 https://issues.apache.org/jira/browse/CASSANDRA-5850


 granted , this is a lot of data, but it would be nice to at least see
 some progress.


 Here's the rewrite of streaming, where progress indication improves
 dramatically over the prior status quo :

 https://issues.apache.org/jira/browse/CASSANDRA-5286

 And here's two open tickets on making repair less opaque (thx 
 yukim@#cassandra)
 :

 https://issues.apache.org/jira/browse/CASSANDRA-5483
 https://issues.apache.org/jira/browse/CASSANDRA-5839

 =Rob






Re: stalled nodetool repair?

2014-08-21 Thread Ben Bromhead
Ah sorry, that is the original repo; see
https://github.com/BrianGallew/cassandra_range_repair for the updated version
of the script, which has vnode support.

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 22 Aug 2014, at 2:19 pm, DuyHai Doan doanduy...@gmail.com wrote:

 Thanks Ben for the link. Still this script does not work with vnodes, which 
 exclude a wide range of C* config
 
 
 On Thu, Aug 21, 2014 at 5:51 PM, Ben Bromhead b...@instaclustr.com wrote:
 https://github.com/mstump/cassandra_range_repair
 
 Also very useful. 
 
 Ben Bromhead
 Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359
 
 
 
 
 On 22/08/2014, at 6:12 AM, Robert Coli rc...@eventbrite.com wrote:
 
 On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com wrote:
 How do I watch the progress of nodetool repair.
 
 This is a very longstanding operational problem in Cassandra. Repair barely 
 works and is opaque, yet one is expected to run it once a week in the 
 default configuration.
 
 An unreasonably-hostile-in-tone-but-otherwise-accurate description of the 
 status quo before the re-write of streaming in 2.0 :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5396
 
 A proposal to change the default for gc_grace_seconds to 34 days, so that 
 this fragile and heavyweight operation only has to be done once a month :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5850
  
 granted , this is a lot of data, but it would be nice to at least see some 
 progress.
 
 Here's the rewrite of streaming, where progress indication improves 
 dramatically over the prior status quo :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5286
 
 And here's two open tickets on making repair less opaque (thx 
 yukim@#cassandra) :
 
 https://issues.apache.org/jira/browse/CASSANDRA-5483
 https://issues.apache.org/jira/browse/CASSANDRA-5839
 
 =Rob
 
 
 
 



Re: stalled nodetool repair?

2014-08-21 Thread DuyHai Doan
Great! Many thanks


On Thu, Aug 21, 2014 at 9:35 PM, Ben Bromhead b...@instaclustr.com wrote:

 Ah sorry that is the original repo, see
 https://github.com/BrianGallew/cassandra_range_repair for the updated
 version of the script with vnode support

 Ben Bromhead
 Instaclustr | www.instaclustr.com | @instaclustr
 http://twitter.com/instaclustr | +61 415 936 359

 On 22 Aug 2014, at 2:19 pm, DuyHai Doan doanduy...@gmail.com wrote:

 Thanks Ben for the link. Still this script does not work with vnodes,
 which exclude a wide range of C* config


 On Thu, Aug 21, 2014 at 5:51 PM, Ben Bromhead b...@instaclustr.com wrote:

 https://github.com/mstump/cassandra_range_repair

 Also very useful.

  Ben Bromhead
 Instaclustr | www.instaclustr.com | @instaclustr
 http://twitter.com/instaclustr | +61 415 936 359




 On 22/08/2014, at 6:12 AM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Aug 21, 2014 at 12:32 PM, Kevin Burton bur...@spinn3r.com
 wrote:

 How do I watch the progress of nodetool repair.


 This is a very longstanding operational problem in Cassandra. Repair
 barely works and is opaque, yet one is expected to run it once a week in
 the default configuration.

 An unreasonably-hostile-in-tone-but-otherwise-accurate description of the
 status quo before the re-write of streaming in 2.0 :

 https://issues.apache.org/jira/browse/CASSANDRA-5396

 A proposal to change the default for gc_grace_seconds to 34 days, so that
 this fragile and heavyweight operation only has to be done once a month :

 https://issues.apache.org/jira/browse/CASSANDRA-5850


 granted , this is a lot of data, but it would be nice to at least see
 some progress.


 Here's the rewrite of streaming, where progress indication improves
 dramatically over the prior status quo :

 https://issues.apache.org/jira/browse/CASSANDRA-5286

 And here's two open tickets on making repair less opaque (thx 
 yukim@#cassandra)
 :

 https://issues.apache.org/jira/browse/CASSANDRA-5483
 https://issues.apache.org/jira/browse/CASSANDRA-5839

 =Rob