I request 1-2 TB of disk per node, depending on how large the data is estimated
to be (for larger data, 2 TB). I have some dense nodes (4+ TB of disk
available). They are harder to manage for repairs, bootstrapping, compaction,
etc., because it takes so long to stream the data. For the
Is this Java 8 with the G1 garbage collector or CMS? With Java 7 and CMS,
garbage collection can cause delays like you are seeing. I haven’t seen that
problem with G1, but garbage collection is where I would start looking.
Sean Durity
From: Ted Pearson [mailto:t...@tedpearson.com]
Sent:
A couple thoughts (for after you up/downgrade to one version for all nodes):
- 16 GB of total RAM on a node is the minimum I would use; 32 GB would be
much better
- With a lower amount of memory, I think I would keep memtables on-heap
in order to keep a tighter rein on how much they
Version number may help.
Sean Durity
From: Anshu Vajpayee [mailto:anshu.vajpa...@gmail.com]
Sent: Tuesday, January 03, 2017 10:09 AM
To: user@cassandra.apache.org
Subject: Re: Growing Hints
Is anyone aware of this issue?
Hints are still growing although gossip and repair were successful. Gossip
Down and Durity - Cassandra Admin Discussion
Now, you are running several Cassandra clusters (or leaning heavily that way).
How do you deploy them, monitor them, and do various other administrative
tasks? Come and join in a discussion and let's learn from each other.
Sean Durity, our
A few of the many companies that rely on Cassandra are mentioned here:
http://cassandra.apache.org
Apple, Netflix, Weather Channel, etc.
(Not nearly as good as the Planet Cassandra list that DataStax used to
maintain. Boo for the Apache/DataStax squabble!)
DataStax has a list of many case
Perhaps you can set the default TTL when you create the table instead of on
every insert:
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/tabProp.html
(see default_time_to_live property)
Then no need for UDF.
Sean Durity
lord of the (C*) rings (Lead Cassandra Admin)
Big DATA Team
MTC
Thanks for the updates on later versions. My experience on authentication was
mostly 1.1 – 2.0. I am glad that it is improving a bit. However, it does seem
that it is still wise to start rings with authentication on to avoid this
activation procedure.
Sean Durity
From: li...@beobal.com
Which Cassandra version? Most of my authentication-from-non-authentication
experience is from Cassandra 1.1 – 2.0. After that, I just enable from the
beginning.
Sean Durity – Lead Cassandra Admin
Big DATA Team
MTC 2250
For support, create a
I answered a similar question here:
https://groups.google.com/forum/#!topic/nosql-databases/lLBebUCjD8Y
Sean Durity – Lead Cassandra Admin
From: Felipe Esteves [mailto:felipe.este...@b2wdigital.com]
Sent: Tuesday, June 07, 2016 12:07 PM
To: user@cassandra.apache.org
Subject: Change
Eric is right on.
Let me share my experience. I have found that dense nodes over 4 TB are a pain
to manage (for rebuilds, repair, compaction, etc.) with size-tiered compaction
and basically a single table schema. However, 1 TB nodes that yield only about
500 GB of usable space can create rings
Just wanted to chime in that this is a very well-written and explained answer.
Nice job, Jeff!
Sean Durity
From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Wednesday, May 18, 2016 11:41 PM
To: user@cassandra.apache.org
Subject: Re: Replication lag between data center
Cassandra isn’t a
https://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configLogArchive_t.html
Sean Durity
From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
Sent: Wednesday, April 27, 2016 10:44 PM
To: user@cassandra.apache.org
Subject: RE: Inconsistent Reads after Restoring Snapshot
No. We are not
What about the commitlogs? Are you saving those off anywhere in between the
snapshot and the crash?
Sean Durity
From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
Sent: Monday, April 25, 2016 10:26 PM
To: User
Subject: Inconsistent Reads after Restoring Snapshot
Hi,
We have 2.0.14. We use
Do the clients already send the credentials? That is the first thing to address.
Setting up a cluster for authentication (and authorization) requires a restart
with the properties turned on in cassandra.yaml. However, the actual keyspace
(system_auth) and tables are not created until the last
Cassandra is not good for table scan type queries (which count(*) typically
is). While there are some attempts to do that (as noted below), this is a path
I avoid.
Sean Durity
From: Max C [mailto:mc_cassan...@core43.com]
Sent: Saturday, April 09, 2016 6:19 PM
To: user@cassandra.apache.org
What do you mean by “automatically starts syncing?” Are you seeing streaming of
existing data or just the addition of new, incoming data? Do you have repairs
running as part of your automated maintenance, perhaps?
I would expect that new, incoming data would be replicated to the new DC after
I think this one is better…
You might take a look at this previous conversation on queue-type applications
and Cassandra. Generally this is an anti-pattern for a distributed system like
Cassandra.
I am not sure the status logger output helps determine the problem. However,
the dropped mutations and the status logger output are what I see when there is
too high of a load on one or more Cassandra nodes. It could be long GC pauses,
something reading too much data (a large row or a
Is this from the 1.1 line, perhaps? In my experience it could be very flappy
for no particular reason we could discover. 1.1 is a pretty dusty version.
Upgrading to 2.1 or later would be a good idea. If you have to upgrade in
place without down time, you will need to go through many
Are you using any of the Tuning Policies
(https://docs.datastax.com/en/developer/java-driver/2.0/common/drivers/reference/tuningPolicies_c.html)?
It could be that you are hitting some peak load and the driver is not retrying
hosts once they are marked “down.”
Sean Durity – Lead Cassandra
As you have it, this is not a good model for Cassandra. Your partition key has
only 2 specific values. You would end up with only 2 partitions (perhaps owned
by just 2 nodes) that would quickly get huge (and slow). Also, secondary
indexes are generally a bad idea. You would either want to
What anti-pattern are you mocking me for exactly?
Sean Durity
From: daemeon reiydelle [mailto:daeme...@gmail.com]
Sent: Tuesday, February 23, 2016 11:21 AM
To: user@cassandra.apache.org
Subject: RE: Restart Cassandra automatically
Cassandra nodes do not go down "for no reason". They are not
Yes, I can see the potential problem in theory. However, we never do your #2.
Generally, we don’t have unused spare hardware. We just fix the host that is
down and run repairs. (Side note: while I have seen nodes fight it out over who
owns a particular token in earlier versions, it seems that
You didn’t mention version, but I saw this kind of thing very often in the 1.1
line. Often this is connected to network flakiness. Are these VMs? In the
cloud? Connected over a WAN? You mention that ping seems fine. Take a look at
the phi_convict_threshold in cassandra.yaml. You may need to
I see the sstablemetadata tool as far back as 1.2.19 (in tools/bin).
Sean Durity
From: Anishek Agarwal [mailto:anis...@gmail.com]
Sent: Tuesday, February 23, 2016 3:37 AM
To: user@cassandra.apache.org
Subject: Re: High Bloom filter false ratio
Looks like that sstablemetadata is available in 2.2
Call me naïve, but we do use an in-house built program for keeping nodes
started (based on a flag-check). The program is something that was written for
all kinds of daemon processes here, not Cassandra specifically. The basic idea
is that it runs a status check. If that fails, and the flag is
What client are you using?
It is possible that the client saw nodes down and has kept them marked that way
(without retrying). Depending on the client, you may have options to set in
RetryPolicy, FailoverPolicy, etc. A bounce of the client will probably fix the
problem for now.
Sean Durity
Can you wipe all the data directories, saved_cache, and commitlog and let the
node bootstrap again?
Sean Durity
From: Nimi Wariboko Jr [mailto:n...@channelmeter.com]
Sent: Monday, January 25, 2016 6:59 PM
To: cassandra-u...@apache.org
Subject: NullPointerException when trying to compact under
This is a very strange move considering how well DataStax has supported open
source Cassandra. I hope there is a reasonable and well-publicized explanation
for this apparent change in direction.
Sean Durity
From: George Sigletos [mailto:sigle...@textkernel.nl]
Sent: Tuesday, January 26, 2016
Thanks, I appreciate the correction to my understanding.
Sean Durity
From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Friday, January 22, 2016 1:04 PM
To: user@cassandra.apache.org
Subject: Re: Using TTL for data purge
"As I understand TTL, if there is a compaction of a cell (or row)
An upsert is a second insert. Cassandra’s sstables are immutable. There are no
real “overwrites” (of the data on disk). It is another record/row. Upon read,
it acts like an overwrite, because Cassandra will read both inserts and take
the last one in as the correct data. This strategy will work
If you know how long the records should last, TTL is a good way to go. Remember
that neither TTLs nor deletes are right-away purge strategies. Each results in a
special record called a tombstone that marks the data as deleted. After
compaction (that is after gc_grace_seconds for the table, default 10
You can write your own using the appropriate interface(s) (for authentication
and authorization). However, the interfaces have changed over the years, so it
is likely not a write-it-and-forget-it bit of code.
External security is one of the important features of DSE, though.
Sean Durity – Lead
It shouldn’t be called an aggregate. That is more like a user defined function.
If you are correct, the term “aggregate” will lead people to do “bad things” –
just like secondary indexes. I think the dev team needs a naming expert.
Sean Durity – Lead Cassandra Admin
From: Robert Stupp
An aggregate only within a partition? That is rather useless and shouldn’t be
called an aggregate.
I am hoping the functionality can be used to support at least “normal” types of
aggregates like count, sum, avg, etc.
Sean Durity – Lead Cassandra Admin
From: Jonathan Haddad
Is there a JIRA for the discussion of dropping PropertyFileSnitch? That is all
that we use, and it is much easier to package and distribute than
GossipingPropertyFileSnitch. I would vote against dropping the more useful
PropertyFileSnitch.
Sean Durity – Lead Cassandra Admin
From: Paulo Motta
Thank you! I made my comments in the JIRA.
Sean Durity – Lead Cassandra Admin
From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com]
Sent: Monday, December 14, 2015 3:25 PM
To: user@cassandra.apache.org
Subject: Re: Issues on upgrading from 2.2.3 to 3.0
I highly recommend DataStax support, although we have not done Windows. They
are in a growth phase, so I expect there will be growing pains with support.
However, they have some top notch folks in place.
Sean Durity – Lead Cassandra Admin
From: Troy Collinsworth
I ended up writing some of my own utilities and aliases to make output more
useful for me (and reduce some typing, too). Resolving host names was a big one
for me, too. IP addresses are almost useless. Uptime in seconds is useless.
The –r in nodetool is a nice addition, but I like the short
How many nodes are you planning to add? How many replicas do you want? In
general, there shouldn't be a problem adding nodes and then altering the
keyspace to change replication. You will want to run repairs to stream the data
to the new replicas. You shouldn't need downtime or data migration
I took (and passed) the Admin exam at the Cassandra Summit. Tushar does hit the
main points in his article. One of the conditions of the test is to not reveal
details on the test, so we shouldn’t reveal anything very substantial. I will
say that the exam was about the same difficulty level as
You CAN run mixed versions in your cluster for a short time when you are
upgrading. However, you will most likely not be able to run repairs or any
other streaming operations (adding nodes, adding a DC, etc.) while in this
state.
The advice below is correct. Upgrade your current ring to the
Yes, you should run upgradesstables on each node. If the sstable structure has
changed, you will need this completed before you can do streaming operations
like repairs or adding nodes.
As for running in parallel, that should be fine. It is a “within the node”
operation that pounds I/O (but is
This sounds like something you do on the client side BEFORE you insert. Or are
you wanting to limit the size of the list coming out to the client?
Sean Durity
Lead Cassandra Admin, Big Data Team
From: yuankui [mailto:kui.y...@fraudmetrix.cn]
Sent: Thursday, August 13, 2015 9:06 AM
To:
It is a bit hard to follow. Perhaps you could include your proposed schema
(annotated with your size predictions) to spur more discussion. To me, it
sounds a bit convoluted. Why is a batch so big (up to 100 million rows)? Is a
row in the primary only associated with one batch?
Sean Durity -
That’s ok for a single node, but to answer the question, “how big is my table
across the cluster?” it would be much better if the cluster could provide an
answer.
Sean Durity
From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Monday, June 29, 2015 8:15 AM
To: user
Subject: Re: How to
It seems to me that running repair on any given node may also induce repairs to
related replica nodes. For example, if I run repair on node A and node B has
some replicas, data might stream from A to B (assuming A has newer/more data).
Now, that does NOT mean that node B will be fully repaired.
I will note here that the limitations on ad-hoc querying (and aggregates) make
it much more difficult to deal with data quality problems, QA testing, and
similar efforts, especially where people are used to a more relational, ad-hoc
model. We have often had to extract data from Cassandra to
Right. I have had very few problems running mixed versions for normal
operations (as long as the versions are “close”). During upgrades, I turn off
repairs. Adding/replacing nodes is very infrequent for me, so not much of a
consideration. We upgrade as quickly as we can, however, to protect
With 3.0, what happens to existing Thrift-based tables (with dynamic column
names, etc.)?
Sean Durity
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Wednesday, June 10, 2015 10:30 AM
To: user
Subject: Cassandra 2.2, 3.0, and beyond
As you know, we've split our post-2.1 release into two
In my experience, you don’t want to do streaming operations (repairs or
bootstraps) with mixed Cassandra versions. Upgrade the ring to the new version,
and then add nodes (or add the nodes at the current version, and then upgrade).
Sean Durity
From: Aiman Parvaiz [mailto:ai...@flipagram.com]
We use nodetool repair -pr on each node through the week. Basically I have a
cron job that checks a schedule on each host to see “should I run repair
today?” If yes, it kicks off repair.
Sean Durity
From: Tiwari, Tarun [mailto:tarun.tiw...@kronos.com]
Sent: Monday, May 25, 2015 12:41 AM
To:
We run 2 nodes (from 2 different rings) on the same physical host. One is for a
random ring; the other is byte-ordered to support some alphabetic range queries.
Each instance has its own binary install, data directory and ports. One
limitation - with one install of OpsCenter agent, it can only
I have run plenty of 1.2.x Cassandra versions on the Oracle JVM 1.7. I have
used both 1.7.0_40 and 1.7.0_72 with no issues. I also have DSE 3.2.7 running on
1.7.0_72 in PR with no issues.
Sean Durity – Cassandra Admin, Big Data Team
To engage the team, create a
I think you have over-simplified it just a bit here, though I may be wrong.
In order to get a tombstone on a TTL row or column, some kind of read has to
occur. The tombstones don’t magically appear (remember, a tombstone is a
special kind of insert). So, I think it takes at least two
Just to add to this. It seems that reads done for authentication and
authorization (using the built-in security classes) are included in the read
request counts.
Sean Durity – Cassandra Admin, Big Data Team
To engage the team, create a
I do two types of node monitoring. On each host, we have a process monitor
looking for the cassandra process. If it goes down, it will get restarted (if a
flag is set appropriately).
Secondly, from a remote host, I have an hourly check of all nodes where I
essentially log in to each node and
Yes, run upgradesstables on all nodes - unless you already force major
compactions on all tables. I run them on a few nodes at a time to minimize
impact to performance. The upgrade is not complete until upgradesstables
completes on all nodes. Then you are safe to resume any streaming operations
Yes, for over 2 years.
As for #2 - you would keep all CFs in both DCs. But, maybe only do RF=2 in OLTP
and 3 in reporting. Not sure of all your requirements. Writes are fast and
cheap in Cassandra, so I wouldn’t be concerned with “extra” writes in the OLTP
DC.
Sean Durity – Cassandra Admin,
We run two cassandra nodes on the same host for a use case that requires a
random ordered ring and a byte ordered ring. It is technically feasible.
However, it makes administration of the rings a bit tougher (different ports
for one, etc.). OpsCenter agents can only connect to one of the rings
This is not for the long haul, but just to accomplish an OS upgrade across the
cluster without taking an outage.
Sean Durity
From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Monday, March 02, 2015 1:15 PM
To: user@cassandra.apache.org
Subject: Re: Running Cassandra on mixed OS
I
In my experience, you do not want to stay on 1.1 very long. 1.0.8 was very
stable. 1.1 can get bad in a hurry. 1.2 (with many things moved off-heap) is
very much better.
Sean Durity – Cassandra Admin, Big Data Team
From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Monday, March 02, 2015
Cassandra 1.2.19
We would like to turn on Cassandra's internal security (PasswordAuthenticator
and CassandraAuthorizer) on the ring (away from AllowAll). (Clients are already
passing credentials in their connections.) However, I know all nodes have to be
switched to those before the basic
What snitch are you using? You may need to do some work on your topology file
(or rackdc) to make sure you have the topology you want. Also, it is possible
you may need to restart OpsCenter agents and/or your browser to see the nodes
represented properly in OpsCenter.
Sean Durity – Cassandra
Full table scans are not the best use case for Cassandra. Without some kind of
pagination, the node taking the request (the coordinator node) will try to
assemble the data from all nodes to return to the client. With a dataset of any
decent size, it will overwhelm the single node.
Pagination
SimpleSnitch is not rack aware. You would want to choose seed nodes and then
not change them. Seed nodes apparently don’t bootstrap. All nodes need the same
seeds in the yaml file. Here is more info: