High Bloom filter false ratio

2016-02-17 Thread Anishek Agarwal
Hello,

We have a table with a composite partition key of humongous cardinality; it
is a combination of (long, long). On the table we have
bloom_filter_fp_chance=0.01.

On doing "nodetool cfstats" on the 5 nodes we have in the cluster we are
seeing  "Bloom filter false ratio:" in the range of 0.7 -0.9.

I thought that over time the bloom filter would adjust to the key space
cardinality. We have been running the cluster for a long time now, but have
added significant traffic since January this year, which would not lead to
writes in the DB but would lead to a high volume of reads to check whether
any values exist.

Are there any settings that can be changed to achieve a better ratio?
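
For reference, bloom_filter_fp_chance is a per-table setting and existing
SSTables keep their old filters until they are rewritten. A rough sketch of
changing it and rebuilding the filters (the keyspace/table names and the new
value below are placeholders only):

    cqlsh -e "ALTER TABLE ks.t WITH bloom_filter_fp_chance = 0.001;"
    # filters are only rebuilt when SSTables are rewritten
    nodetool upgradesstables -a ks t
    # then check the ratio again
    nodetool cfstats ks.t | grep "Bloom filter"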

Thanks
Anishek


Re: Cassandra nodes reduce disks per node

2016-02-17 Thread Anishek Agarwal
Hey Branton,

Please do let us know if you face any problems doing this.

Thanks
anishek

On Thu, Feb 18, 2016 at 3:33 AM, Branton Davis 
wrote:

> We're about to do the same thing.  It shouldn't be necessary to shut down
> the entire cluster, right?
>
> On Wed, Feb 17, 2016 at 12:45 PM, Robert Coli 
> wrote:
>
>>
>>
>> On Tue, Feb 16, 2016 at 11:29 PM, Anishek Agarwal 
>> wrote:
>>>
>>> To accomplish this, can I just copy the data from disk1 to disk2 within
>>> the relevant cassandra home location folders, change the cassandra.yaml
>>> configuration and restart the node? Before starting, I will shut down the
>>> cluster.
>>>
>>
>> Yes.
>>
>> =Rob
>>
>>
>
>


Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-17 Thread Mike Heffner
Jaydeep,

No, we don't use any light weight transactions.

Mike

On Wed, Feb 17, 2016 at 6:44 PM, Jaydeep Chovatia <
chovatia.jayd...@gmail.com> wrote:

> Are you guys using light weight transactions in your write path?
>
> On Thu, Feb 11, 2016 at 12:36 AM, Fabrice Facorat <
> fabrice.faco...@gmail.com> wrote:
>
>> Are your commitlog and data on the same disk? If yes, you should put
>> commitlogs on a separate disk which doesn't have a lot of IO.
>>
>> Other IO may have a great impact on your commitlog writing and
>> may even block it.
>>
>> An example of the impact IO may have, even for async writes:
>>
>> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
>>
>> 2016-02-11 0:31 GMT+01:00 Mike Heffner :
>> > Jeff,
>> >
>> > We have both commitlog and data on a 4TB EBS with 10k IOPS.
>> >
>> > Mike
>> >
>> > On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa wrote:
>> >>
>> >> What disk size are you using?
>> >>
>> >>
>> >>
>> >> From: Mike Heffner
>> >> Reply-To: "user@cassandra.apache.org"
>> >> Date: Wednesday, February 10, 2016 at 2:24 PM
>> >> To: "user@cassandra.apache.org"
>> >> Cc: Peter Norton
>> >> Subject: Re: Debugging write timeouts on Cassandra 2.2.5
>> >>
>> >> Paulo,
>> >>
>> >> Thanks for the suggestion, we ran some tests against CMS and saw the
>> same
>> >> timeouts. On that note though, we are going to try doubling the
>> instance
>> >> sizes and testing with double the heap (even though current usage is
>> low).
>> >>
>> >> Mike
>> >>
>> >> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta wrote:
>> >>>
>> >>> Are you using the same GC settings as the staging 2.0 cluster? If not,
>> >>> could you try using the default GC settings (CMS) and see if that
>> changes
>> >>> anything? This is just a wild guess, but there were reports before of
>> >>> G1-caused instabilities with small heap sizes (< 16GB - see
>> CASSANDRA-10403
>> >>> for more context). Please ignore if you already tried reverting back
>> to CMS.
>> >>>
>> >>> 2016-02-10 16:51 GMT-03:00 Mike Heffner :
>> 
>>  Hi all,
>> 
>>  We've recently embarked on a project to update our Cassandra
>>  infrastructure running on EC2. We are long time users of 2.0.x and
>> are
>>  testing out a move to version 2.2.5 running on VPC with EBS. Our
>> test setup
>>  is a 3 node, RF=3 cluster supporting a small write load (mirror of
>> our
>>  staging load).
>> 
>>  We are writing at QUORUM and while p95's look good compared to our
>>  staging 2.0.x cluster, we are seeing frequent write operations that
>> time out
>>  at the max write_request_timeout_in_ms (10 seconds). CPU across the
>> cluster
>>  is < 10% and EBS write load is < 100 IOPS. Cassandra is running with
>> the
>>  Oracle JDK 8u60 and we're using G1GC and any GC pauses are less than
>> 500ms.
>> 
>>  We run on c4.2xl instances with GP2 EBS attached storage for data and
>>  commitlog directories. The nodes are using EC2 enhanced networking
>> and have
>>  the latest Intel network driver module. We are running on HVM
>> instances
>>  using Ubuntu 14.04.2.
>> 
>>  Our schema is 5 tables, all with COMPACT STORAGE. Each table is
>> similar
>>  to the definition here:
>>  https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a
>> 
>>  This is our cassandra.yaml:
>> 
>> https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml
>> 
>>  Like I mentioned we use 8u60 with G1GC and have used many of the GC
>>  settings in Al Tobey's tuning guide. This is our upstart config with
>> JVM and
>>  other CPU settings:
>> https://gist.github.com/mheffner/dc44613620b25c4fa46d
>> 
>>  We've used several of the sysctl settings from Al's guide as well:
>>  https://gist.github.com/mheffner/ea40d58f58a517028152
>> 
>>  Our client application is able to write using either Thrift batches
>>  using the Astyanax driver or CQL async INSERTs using the Datastax Java
>> driver.
>> 
>>  For testing against Thrift (our legacy infra uses this) we write
>> batches
>>  of anywhere from 6 to 1500 rows at a time. Our p99 for batch
>> execution is
>>  around 45ms but our maximum (p100) sits less than 150ms except when
>> it
>>  periodically spikes to the full 10 seconds.
>> 
>>  Testing the same write path using CQL writes instead demonstrates
>>  similar behavior. Low p99s except for periodic full timeouts. We
>> enabled
>>  tracing for several operations but were unable to get a trace that
>> completed
>>  successfully -- Cassandra started logging many messages as:
>> 
>>  INFO  [ScheduledTasks:1] - MessagingService.java:946 - _TRACE
>> messages
>>  were dropped in last 5000 ms: 52499 for internal timeout and 0 for
>> cross
>>  node timeout
>> 
>>  

Re: Forming a cluster of embedded Cassandra instances

2016-02-17 Thread Binil Thomas
Thanks for sharing your experience! I also found a similar solution in
TitanDB[1], but that also seems to be intended for development use. The
consensus here seems to be that one should not embed Cassandra into another
JVM.

> For production, we have to support single node clusters (not
> embedded though), and it has been challenging for pretty much
> all the reasons you find people saying not to do so.

What challenges did you face with single-node Cassandra deployment?

[1]:
https://github.com/thinkaurelius/titan/blob/titan10/titan-cassandra/src/main/java/com/thinkaurelius/titan/diskstorage/cassandra/utils/CassandraDaemonWrapper.java

On Sun, Feb 14, 2016 at 11:05 AM, John Sanda  wrote:

> The project I work on day to day uses an embedded instance of Cassandra,
> but it is intended primarily for development. We embed Cassandra in a
> WildFly (i.e., JBoss) server. It is packaged and deployed as an EAR. I
> personally do not do this. I use and recommend ccm for development. If
> you do use WildFly, there is also wildfly-cassandra, which deploys
> Cassandra as a custom WildFly extension. In other words, it is deployed in
> WildFly like other subsystems such as EJB, web, etc., not like an
> application. There isn't a whole lot of active development on it, but it
> could be another option.
>
> For production, we have to support single node clusters (not embedded
> though), and it has been challenging for pretty much all the reasons you
> find people saying not to do so.
>
> As for failure detection and cluster membership changes, are you using the
> Datastax driver? You can register an event listener with the driver to
> receive notifications for those things.
>
> On Sat, Feb 13, 2016 at 6:33 PM, Jonathan Haddad 
> wrote:
>
>> +1 to what jack said. Don't mess with embedded till you understand the
>> basics of the db. You're not making your system any less complex, I'd say
>> you're most likely going to shoot yourself in the foot.
>> On Sat, Feb 13, 2016 at 2:22 PM Jack Krupansky 
>> wrote:
>>
>>> HA requires an odd number of replicas - 3, 5, 7 - so that split-brain
>>> can be avoided. Two nodes would not support HA. You need to be able to
>>> reach a quorum, which is defined as n/2+1 where n is the number of
>>> replicas. IOW, you cannot update the data if a quorum cannot be reached.
>>> The data on any given node needs to be replicated on at least two other
>>> nodes.
>>>
>>> Embedded Cassandra is only for extremely sophisticated developers - not
>>> those who are new to Cassandra, with a "superficial understanding".
>>>
>>> As a general proposition, you should not be running application code on
>>> Cassandra nodes.
>>>
>>> That said, if any of the senior Cassandra developers wish to personally
>>> support your efforts towards embedded clusters, they are certainly free to
>>> do so. We'll see if any of them step forward.
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Sat, Feb 13, 2016 at 3:47 PM, Binil Thomas <
>>> binil.thomas.pub...@gmail.com> wrote:
>>>
 Hi all,

 TL;DR: I have a very superficial understanding of Cassandra and am
 currently evaluating it for a project.

 * Can Cassandra be embedded into another JVM application?
 * Can such embedded instances form a cluster?
 * Can the application use the failure detection and cluster
 membership dissemination infrastructure of embedded Cassandra?

 

 I am in the process of re-packaging a SaaS system written in Java to be
 deployed on-premise by customers. The SaaS system currently uses AWS
 DynamoDB. The data storage needs for this application are modest, but I
 would like to keep the deployment complexity to a minimum. Here are three
 different usecases the on-premise system should support:

 1. single-node deployments with minimal complexity
 2. two-node HA deployments; the data and processing needs dictated by
 the load on the system are well under what a single node can do, but the
 second node is there to satisfy the HA requirement as a hot standby
 3. a multi-node clustered deployment, where higher operational
 complexity is justified

 I am considering Cassandra for these usecases.

 For usecase #1, I hope to embed Cassandra into the same JVM as my
 application. I read on the web that CassandraDaemon can be used this way.
 Is that accurate? What other applications embed Cassandra this way? I
 *think* JetBrains Upsource does, but do you know other ones? (Incidentally,
 my Java application embeds Jetty webserver also).

 For usecase #2, I am hoping that I can deploy two instances of this
 ensemble and have the embedded Cassandra instances form a cluster. If I
 configure every write to be replicated on both nodes synchronously, then it
 

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-17 Thread Jaydeep Chovatia
Are you guys using light weight transactions in your write path?

On Thu, Feb 11, 2016 at 12:36 AM, Fabrice Facorat  wrote:

> Are your commitlog and data on the same disk? If yes, you should put
> commitlogs on a separate disk which doesn't have a lot of IO.
>
> Other IO may have a great impact on your commitlog writing and
> may even block it.
>
> An example of the impact IO may have, even for async writes:
>
> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
>
> 2016-02-11 0:31 GMT+01:00 Mike Heffner :
> > Jeff,
> >
> > We have both commitlog and data on a 4TB EBS with 10k IOPS.
> >
> > Mike
> >
> > On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa 
> > wrote:
> >>
> >> What disk size are you using?
> >>
> >>
> >>
> >> From: Mike Heffner
> >> Reply-To: "user@cassandra.apache.org"
> >> Date: Wednesday, February 10, 2016 at 2:24 PM
> >> To: "user@cassandra.apache.org"
> >> Cc: Peter Norton
> >> Subject: Re: Debugging write timeouts on Cassandra 2.2.5
> >>
> >> Paulo,
> >>
> >> Thanks for the suggestion, we ran some tests against CMS and saw the
> same
> >> timeouts. On that note though, we are going to try doubling the instance
> >> sizes and testing with double the heap (even though current usage is
> low).
> >>
> >> Mike
> >>
> >> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta 
> >> wrote:
> >>>
> >>> Are you using the same GC settings as the staging 2.0 cluster? If not,
> >>> could you try using the default GC settings (CMS) and see if that
> changes
> >>> anything? This is just a wild guess, but there were reports before of
> >>> G1-caused instabilities with small heap sizes (< 16GB - see
> CASSANDRA-10403
> >>> for more context). Please ignore if you already tried reverting back
> to CMS.
> >>>
> >>> 2016-02-10 16:51 GMT-03:00 Mike Heffner :
> 
>  Hi all,
> 
>  We've recently embarked on a project to update our Cassandra
>  infrastructure running on EC2. We are long time users of 2.0.x and are
>  testing out a move to version 2.2.5 running on VPC with EBS. Our test
> setup
>  is a 3 node, RF=3 cluster supporting a small write load (mirror of our
>  staging load).
> 
>  We are writing at QUORUM and while p95's look good compared to our
>  staging 2.0.x cluster, we are seeing frequent write operations that
> time out
>  at the max write_request_timeout_in_ms (10 seconds). CPU across the
> cluster
>  is < 10% and EBS write load is < 100 IOPS. Cassandra is running with
> the
>  Oracle JDK 8u60 and we're using G1GC and any GC pauses are less than
> 500ms.
> 
>  We run on c4.2xl instances with GP2 EBS attached storage for data and
>  commitlog directories. The nodes are using EC2 enhanced networking
> and have
>  the latest Intel network driver module. We are running on HVM
> instances
>  using Ubuntu 14.04.2.
> 
>  Our schema is 5 tables, all with COMPACT STORAGE. Each table is
> similar
>  to the definition here:
>  https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a
> 
>  This is our cassandra.yaml:
> 
> https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml
> 
>  Like I mentioned we use 8u60 with G1GC and have used many of the GC
>  settings in Al Tobey's tuning guide. This is our upstart config with
> JVM and
>  other CPU settings:
> https://gist.github.com/mheffner/dc44613620b25c4fa46d
> 
>  We've used several of the sysctl settings from Al's guide as well:
>  https://gist.github.com/mheffner/ea40d58f58a517028152
> 
>  Our client application is able to write using either Thrift batches
>  using the Astyanax driver or CQL async INSERTs using the Datastax Java
> driver.
> 
>  For testing against Thrift (our legacy infra uses this) we write
> batches
>  of anywhere from 6 to 1500 rows at a time. Our p99 for batch
> execution is
>  around 45ms but our maximum (p100) sits less than 150ms except when it
>  periodically spikes to the full 10 seconds.
> 
>  Testing the same write path using CQL writes instead demonstrates
>  similar behavior. Low p99s except for periodic full timeouts. We
> enabled
>  tracing for several operations but were unable to get a trace that
> completed
>  successfully -- Cassandra started logging many messages as:
> 
>  INFO  [ScheduledTasks:1] - MessagingService.java:946 - _TRACE messages
>  were dropped in last 5000 ms: 52499 for internal timeout and 0 for
> cross
>  node timeout
> 
>  And all the traces contained rows with a "null" source_elapsed row:
> 
> https://gist.githubusercontent.com/mheffner/1d68a70449bd6688a010/raw/0327d7d3d94c3a93af02b64212e3b7e7d8f2911b/trace.out
> 
> 
>  We've exhausted as many configuration option permutations that we can
>  think of. This 

Re: Cassandra nodes reduce disks per node

2016-02-17 Thread Ben Bromhead
you can do this in a "rolling" fashion (one node at a time).
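
A rough per-node sketch of that rolling procedure (service names and paths
below are assumptions; adapt to your own layout):

    nodetool drain                    # flush memtables, stop accepting writes
    sudo service cassandra stop
    # copy everything from the disk being retired into the remaining data dir
    rsync -avh /data/disk2/cassandra/data/ /data/disk1/cassandra/data/
    # edit cassandra.yaml so data_file_directories lists only /data/disk1/...
    sudo service cassandra start
    nodetool status                   # wait for UN before moving to the next node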

On Wed, 17 Feb 2016 at 14:03 Branton Davis 
wrote:

> We're about to do the same thing.  It shouldn't be necessary to shut down
> the entire cluster, right?
>
> On Wed, Feb 17, 2016 at 12:45 PM, Robert Coli 
> wrote:
>
>>
>>
>> On Tue, Feb 16, 2016 at 11:29 PM, Anishek Agarwal 
>> wrote:
>>>
>>> To accomplish this, can I just copy the data from disk1 to disk2 within
>>> the relevant cassandra home location folders, change the cassandra.yaml
>>> configuration and restart the node? Before starting, I will shut down the
>>> cluster.
>>>
>>
>> Yes.
>>
>> =Rob
>>
>>
>
> --
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Do I have to use repair -inc with the option -par forcely?

2016-02-17 Thread Jean Carlo
Hi,

Thanks @alain for your reply. Yes, we have 2.1.12. We are definitely
facing CASSANDRA-10422.
However, I cannot run incremental repairs without adding -par.


@carlos if what you say is correct, that would be really nice, because the
process to migrate is quite tedious if you have many nodes.
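
For reference, the invocations being discussed look roughly like this (the
keyspace name is a placeholder; whether -inc without -par behaves correctly
on 2.1.12 is exactly the open question here):

    # incremental + parallel, what is currently being run
    nodetool repair -par -inc my_keyspace
    # incremental without -par, which the DataStax post quoted below says is supported
    nodetool repair -inc my_keyspace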


Regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay

On Tue, Feb 16, 2016 at 4:39 PM, Carlos Rolo  wrote:

> +1 on what Alain said, but I do think that if you are high enough on a
> 2.1.x version (will look later) you don't need to follow the
> documentation. It is outdated. Run a full repair, then you can start
> incremental repairs, since the SSTables will have the metadata on them
> about the last repair.
>
> Wait for someone to confirm this, or confirm that the docs are correct.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
> *
> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
> www.pythian.com
>
> On Tue, Feb 16, 2016 at 1:45 PM, Alain RODRIGUEZ 
> wrote:
>
>> Hi,
>>
>> I am testing repairs with -inc -par and I can see that on all my nodes
>>> the number of sstables explodes from 5 to 5k.
>>
>>
>> This looks like a known issue, see
>> https://issues.apache.org/jira/browse/CASSANDRA-10422
>> Make sure your version is higher than 2.1.12, 2.2.4, 3.0.1, 3.1 to avoid
>> this (and you are indeed facing CASSANDRA-10422).
>> I am not sure you are facing this though, as you don't seem to be using
>> subranges (nodetool repair -st  and -et options)
>>
>> *Is there any way to run incremental repairs but not -par?*
>>>
>>> I know that it is not possible to run sequential repair with incremental
>>> repair at the same time.
>>>
>>
>>
>> From http://www.datastax.com/dev/blog/more-efficient-repairs
>> "Incremental repairs can be opted into via the -inc option to nodetool
>> repair. This is compatible with both sequential and parallel (-par)
>> repair, e.g., bin/nodetool -par -inc  ."
>> So you should be able to remove -par. Not sure this will solve your issue
>> though.
>>
>>
>> Did you respect this process to migrate to incremental repairs?
>>
>> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesMigration.html#opsRepairNodesMigration__ol_dxj_gp5_2s
>>
>> C*heers,
>> -
>> Alain Rodriguez
>> France
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>>
>>
>> 2016-02-10 17:45 GMT+01:00 Jean Carlo :
>>
>>> Hi guys; The question is on the subject.
>>>
>>> I am testing repairs with -inc -par and I can see that on all my
>>> nodes the number of sstables explodes from 5 to 5k.
>>>
>>> I cannot permit this behavior on my cluster in production.
>>>
>>> *Is there any way to run incremental repairs but not -par?*
>>>
>>> I know that it is not possible to run sequential repair with incremental
>>> repair at the same time.
>>>
>>> Best regards
>>>
>>> Jean Carlo
>>>
>>> "The best way to predict the future is to invent it" Alan Kay
>>>
>>
>>
>
> --
>
>
>
>


Re: Re : decommissioned nodes shows up in "nodetool describecluster" as UNREACHABLE in 2.1.12 version

2016-02-17 Thread Alain RODRIGUEZ
Hi,

nodetool gossipinfo shows the decommissioned nodes as "LEFT"


I believe this is the expected behavior; we keep a trace of leaving nodes
for a few days, so this shouldn't be an issue for you.

nodetool describecluster shows the decommissioned nodes as UNREACHABLE.
>

This is a weird behaviour I haven't seen for a while. You might want to dig
into this some more.

Restarting the entire cluster,  everytime a node is decommissioned does not
> seem right
>

Meanwhile, if you are sure the node is out and streams have ended, I guess
it could be ok to use a JMX client (MX4J, JConsole...) and then use the JMX
method Gossiper.unsafeAssassinateEndpoints(ip_address) to assassinate the
gone node from any of the remaining nodes.

How to -->
http://tumblr.doki-pen.org/post/22654515359/assassinating-cassandra-nodes
(3 years old post, I partially read it, but I think it might still be
relevant)
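
As one way to drive that from the command line, a sketch using jmxterm (the
jar name and JMX port 7199 are assumptions, and the exact operation name can
vary by version, so list it first with jmxterm's "info" command; lines after
the first are typed at jmxterm's $> prompt):

    java -jar jmxterm-1.0-uber.jar -l <node_ip>:7199
    $> bean org.apache.cassandra.net:type=Gossiper
    $> info
    $> run unsafeAssassinateEndpoint <ip_of_gone_node>
    $> quit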

Has anybody experienced similar behaviour


FTR, 3 years old similar issue I faced -->
http://grokbase.com/t/cassandra/user/127knx7nn0/unreachable-node-not-in-nodetool-ring

FWIW, for people using C* >= 3.x, this is exposed through nodetool -->
https://docs.datastax.com/en/cassandra/3.x/cassandra/tools/toolsAssassinate.html

Keep in mind that something called 'unsafe' and 'assassinate' at the same
time is not something you want to use in a regular decommissioning process,
as it drops the node with no file transfer; you basically totally lose a
node (unless the node is already out, which seems to be your case, so it
should be safe to use here). I have only used it to fix gossip status in the
past, or at some point when forcing a removenode was not working, followed
by full repairs on the remaining nodes.

C*heers,
-
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-02-16 20:08 GMT+01:00 sai krishnam raju potturi :

> hi;
> we have a 12 node cluster across 2 datacenters. We are currently using
> cassandra 2.1.12 version.
>
> SNITCH : GossipingPropertyFileSnitch
>
> When we decommissioned a few nodes in a particular datacenter, we observed
> the following:
>
> nodetool status shows only the live nodes in the cluster.
>
> nodetool describecluster shows the decommissioned nodes as UNREACHABLE.
>
> nodetool gossipinfo shows the decommissioned nodes as "LEFT"
>
>
> When the live nodes were restarted, "nodetool describecluster" shows only
> the live nodes, which is expected.
>
> Purging the gossip info too did not help.
>
> INFO  17:27:07 InetAddress /X.X.X.X is now DOWN
> INFO  17:27:07 Removing tokens [125897680671740685543105407593050165202,
> 140213388002871593911508364312533329916,
>  98576967436431350637134234839492449485] for /X.X.X.X
> INFO  17:27:07 InetAddress /X.X.X.X is now DOWN
> INFO  17:27:07 Removing tokens [6977666116265389022494863106850615,
> 111270759969411259938117902792984586225,
> 138611464975439236357814418845450428175] for /X.X.X.X
>
> Has anybody experienced similar behaviour? Restarting the entire cluster
> every time a node is decommissioned does not seem right. Thanks in advance
> for the help.
>
>
> thanks
> Sai
>
>
>


Re: Cassandra nodes reduce disks per node

2016-02-17 Thread Robert Coli
On Tue, Feb 16, 2016 at 11:29 PM, Anishek Agarwal  wrote:
>
> To accomplish this, can I just copy the data from disk1 to disk2 within
> the relevant cassandra home location folders, change the cassandra.yaml
> configuration and restart the node? Before starting, I will shut down the
> cluster.
>

Yes.

=Rob


Re: Re : decommissioned nodes shows up in "nodetool describecluster" as UNREACHABLE in 2.1.12 version

2016-02-17 Thread sai krishnam raju potturi
thanks Rajesh. What we have observed is that the decommissioned nodes show
up as "UNREACHABLE" in the "nodetool describecluster" command. Their status
shows up as "LEFT" in "nodetool gossipinfo". This is observed in the 2.1.12
version.

Decommissioned nodes did not show up in the "nodetool describecluster" and
"nodetool gossipinfo" in 2.0.14 version that we use in another cluster.


thanks
Sai

On Tue, Feb 16, 2016 at 2:08 PM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:

> hi;
> we have a 12 node cluster across 2 datacenters. We are currently using
> cassandra 2.1.12 version.
>
> SNITCH : GossipingPropertyFileSnitch
>
> When we decommissioned a few nodes in a particular datacenter, we observed
> the following:
>
> nodetool status shows only the live nodes in the cluster.
>
> nodetool describecluster shows the decommissioned nodes as UNREACHABLE.
>
> nodetool gossipinfo shows the decommissioned nodes as "LEFT"
>
>
> When the live nodes were restarted, "nodetool describecluster" shows only
> the live nodes, which is expected.
>
> Purging the gossip info too did not help.
>
> INFO  17:27:07 InetAddress /X.X.X.X is now DOWN
> INFO  17:27:07 Removing tokens [125897680671740685543105407593050165202,
> 140213388002871593911508364312533329916,
>  98576967436431350637134234839492449485] for /X.X.X.X
> INFO  17:27:07 InetAddress /X.X.X.X is now DOWN
> INFO  17:27:07 Removing tokens [6977666116265389022494863106850615,
> 111270759969411259938117902792984586225,
> 138611464975439236357814418845450428175] for /X.X.X.X
>
> Has anybody experienced similar behaviour? Restarting the entire cluster
> every time a node is decommissioned does not seem right. Thanks in advance
> for the help.
>
>
> thanks
> Sai
>
>
>


Re: Cassandra nodes reduce disks per node

2016-02-17 Thread Anishek Agarwal
An additional note: we are using Cassandra 2.0.15, have 5 nodes in the
cluster, and are going to expand to 8 nodes.

On Wed, Feb 17, 2016 at 12:59 PM, Anishek Agarwal  wrote:

> Hello,
>
> We started with two 800GB SSDs on each Cassandra node based on our initial
> estimations of read/write rate. As we started onboarding additional
> traffic, we found that CPU is becoming a bottleneck and we are not able to
> run the NICE jobs like compaction very well. We have started expanding the
> cluster, and this would lead to less data per node. It looks like at this
> point, once we expand the cluster, the current 2 x 800 GB SSDs will be too
> much and it might be better to have just one SSD.
>
> To accomplish this, can I just copy the data from disk1 to disk2 within
> the relevant cassandra home location folders, change the cassandra.yaml
> configuration and restart the node? Before starting, I will shut down the
> cluster.
>
> Thanks
> anishek
>


Cassandra nodes reduce disks per node

2016-02-17 Thread Anishek Agarwal
Hello,

We started with two 800GB SSDs on each Cassandra node based on our initial
estimations of read/write rate. As we started onboarding additional
traffic, we found that CPU is becoming a bottleneck and we are not able to
run the NICE jobs like compaction very well. We have started expanding the
cluster, and this would lead to less data per node. It looks like at this
point, once we expand the cluster, the current 2 x 800 GB SSDs will be too
much and it might be better to have just one SSD.

To accomplish this, can I just copy the data from disk1 to disk2 within the
relevant cassandra home location folders, change the cassandra.yaml
configuration and restart the node? Before starting, I will shut down the
cluster.
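
For reference, the cassandra.yaml change being described is just shrinking
data_file_directories once all SSTables have been copied onto the remaining
disk with the node stopped (paths below are placeholders):

    # before, two data disks
    data_file_directories:
        - /data/disk1/cassandra/data
        - /data/disk2/cassandra/data

    # after, single disk
    data_file_directories:
        - /data/disk1/cassandra/data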

Thanks
anishek


Re: Sudden disk usage

2016-02-17 Thread Ben Bromhead
+1 to checking for snapshots. Cassandra by default will automatically
snapshot tables before destructive actions like drop or truncate.
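
A quick way to check and clean them up (the keyspace name is a placeholder;
listsnapshots exists on recent 2.1+ nodetool, so verify it on your version):

    nodetool listsnapshots
    # drop all snapshots for one keyspace
    nodetool clearsnapshot my_keyspace
    # or drop a specific snapshot tag
    nodetool clearsnapshot -t <tag>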

Some general advice regarding cleanup. Cleanup will result in a temporary
increase in both disk I/O load and disk space usage (especially with STCS).
It should only be used as part of a planned increase in capacity when you
still have plenty of disk space left on your existing nodes.

If you are running Cassandra in the cloud (AWS, Azure etc) you can add an
EBS volume, copy your sstables to it then bind mount it to the troubled CF
directory. This will give you some emergency disk space to let compaction
and cleanup do its thing safely.
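
A rough sketch of that emergency-space trick (device, filesystem and paths
are assumptions, and it assumes Cassandra is stopped on the node while the
files are moved):

    sudo mkfs.ext4 /dev/xvdf
    sudo mkdir /mnt/emergency
    sudo mount /dev/xvdf /mnt/emergency
    sudo mkdir /mnt/emergency/my_cf
    # copy the troubled CF's sstables, verify, then remove the originals to free space
    sudo rsync -avh /var/lib/cassandra/data/my_ks/my_cf/ /mnt/emergency/my_cf/
    sudo rm -rf /var/lib/cassandra/data/my_ks/my_cf/*
    # bind mount the copy back over the original CF directory, then restart cassandra
    sudo mount --bind /mnt/emergency/my_cf /var/lib/cassandra/data/my_ks/my_cf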

On Tue, 16 Feb 2016 at 10:57 Robert Coli  wrote:

> On Sat, Feb 13, 2016 at 4:30 PM, Branton Davis wrote:
>
>> We use SizeTieredCompaction.  The nodes were about 67% full and we were
>> planning on adding new nodes (doubling the cluster to 6) soon.
>>
>
> Be sure to add those new nodes one at a time.
>
> Have you checked for, and cleared, old snapshots? Snapshots are
> automatically taken at various times and have the unusual property of
> growing larger over time. This is because they are hard links of data files
> and do not take up disk space of their own until the files they link to are
> compacted into new files.
>
> =Rob
>
>
-- 
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer