Is there any mention of this limitation anywhere in the Cassandra
documentation? I don't see it mentioned in the 'Anti-patterns in Cassandra'
section of the DataStax 2.0 documentation or anywhere else.
When starting out with Cassandra as a store for a multi-tenant application
it seems very
Hello All,
We are on 1.2.18 (running on Ubuntu 12.04) and we recently tried to add a
second DC on our demo environment, just before trying it on live. The
existing DC1 has two nodes, which hold approximately 10G of data (RF=2). In
order to add the second DC, DC2, we followed this procedure:
On
Hi, All,
I want to run 'update keyspace with strategy_options={dc1:3, dc2:3}' from
cassandra-cli to update the strategy options of some keyspace
in a multi-DC environment.
When the command returns successfully, does it mean that the strategy options
have been updated successfully, or do I need to
Try the show keyspaces command and look for Options under each keyspace.
Thanks
Rahul
On Tue, Aug 5, 2014 at 2:01 PM, Lu, Boying boying...@emc.com wrote:
Hi, All,
I want to run ‘update keyspace with strategy_options={dc1:3, dc2:3}’ from
cassandra-cli to update the strategy options of
Thanks. Yes, I can use the ‘show keyspace’ command to check and see that the
strategy has changed.
But what I want to know is whether the ‘update keyspace with strategy_options …’
command is
a ‘sync’ operation or an ‘async’ operation.
From: Rahul Menon [mailto:ra...@apigee.com]
Sent: August 5, 2014 16:38
Changing the strategy options, and in particular the replication factor,
does not perform any data replication by itself. You need to run a repair
to ensure data is replicated following the new replication.
On Tue, Aug 5, 2014 at 10:52 AM, Lu, Boying boying...@emc.com wrote:
Thanks. yes. I can
Hi Phil,
In theory, the max number of column families would be in the low number of
hundreds. In practice the limit is related to the amount of heap you have, as
each column family will consume 1 MB of heap due to arena allocation.
To segregate customer data, you could:
- Use customer specific
Yes.
Sorry for not saying it clearly.
What I want to know is ‘has the strategy changed?’ after the ‘update keyspace
with strategy_options…’ command returns successfully,
not whether the data has changed.
e.g. say I run the command ‘update keyspace with strategy_options={dc1:3,
dc2:3}’, when this command
On Tue, Aug 5, 2014 at 11:40 AM, Lu, Boying boying...@emc.com wrote:
What I want to know is ‘has the *strategy* changed?’ after the ‘update
keyspace with strategy_options…’ command returns successfully
Like all schema changes, not necessarily on all nodes. You will have to
check for schema
Try running describe cluster from Cassandra-CLI to see if all nodes have the
same schema version.
Rahul Neelakantan
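The check described above can be scripted once the schema versions have been collected. A minimal sketch in Python, assuming the versions are already available as a mapping from schema version ID to the nodes reporting it. The function name, the version IDs, and the addresses are hypothetical, and the "UNREACHABLE" key is an assumption about how unreachable nodes are labeled:

```python
from typing import Dict, List


def schema_in_agreement(versions: Dict[str, List[str]]) -> bool:
    """Return True when every node reports the same schema version.

    `versions` maps a schema version ID to the nodes reporting it.
    A non-empty "UNREACHABLE" entry (assumed label) counts as
    disagreement, because those nodes may still hold the old schema.
    """
    if versions.get("UNREACHABLE"):
        return False
    live = {v: nodes for v, nodes in versions.items()
            if v != "UNREACHABLE" and nodes}
    return len(live) == 1


# Hypothetical data: two nodes agree, one still reports an older version.
settled = {"version-a": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}
pending = {"version-a": ["10.0.0.1", "10.0.0.2"], "version-b": ["10.0.0.3"]}
print(schema_in_agreement(settled), schema_in_agreement(pending))
```

A caller could poll this in a loop after issuing the schema change and proceed only once it returns True.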
On Aug 5, 2014, at 6:13 AM, Sylvain Lebresne sylv...@datastax.com wrote:
On Tue, Aug 5, 2014 at 11:40 AM, Lu, Boying boying...@emc.com wrote:
What I want to know is “are the
Hi all,
I want to add a data-center to an existing single data-center cluster.
First I have to make the existing cluster multi data-center compatible.
The existing cluster is a 12 node cluster with:
- Replication factor = 3
- Placement strategy = SimpleStrategy
- Endpoint snitch = SimpleSnitch
Yes, you must run a full repair for the reasons stated in the yaml file.
Mark
On Tue, Aug 5, 2014 at 11:52 AM, Rene Kochen rene.koc...@schange.com
wrote:
Hi all,
I want to add a data-center to an existing single data-center cluster.
First I have to make the existing cluster multi
What I understand is that SimpleStrategy determines the endpoints for
replicas by traversing the ring clockwise.
NetworkTopologyStrategy determines the replicas by traversing the ring
clockwise and taking into account the racks and DC locations.
Since the file used by PropertyFileSnitch puts
Hi Mark,
Mark Reddy wrote
To segregate customer data, you could:
- Use customer specific column families under a single keyspace
- Use a keyspace per customer
These effectively amount to the same thing and they both fall foul of the
limit on the number of column families, so they do not scale.
Hi,
I'm having an issue with ALLOW FILTERING with Cassandra 2.0.8. See a
minimal example here:
https://gist.github.com/JensRantil/ec43622c26acb56e5bc9
I expect the second last to fail, but the last query to return a single
row. In particular I expect the last SELECT to first select using the
Multi-tenancy remains a challenge for most technologies. Yes, you can do
what you suggest, but you need to exercise great care, and test and
provision your cluster carefully. It's not a free resource that
scales wildly in all directions with no forethought or care.
It is
- Use a keyspace per customer
These effectively amount to the same thing and they both fall foul of the
limit on the number of column families, so they do not scale.
But then you can scale by moving some of the customers to a new cluster
easily. If you keep everything in a single keyspace or - worse
Hi,
we experienced a strange problem after an intermittent network failure, when
the affected node did not reconnect to the rest of the cluster but did
allow users to authenticate (which was not possible during the actual
network outage, see below). The cluster consists of 1 node in each of 3
If you look at the VisualVM metadata, it'll show that what's returned is
java.lang.Object, which is different from Meters or Counters.
Looking at the source for metrics-core, it seems that this is a feature
of Gauges because unlike Meters or Counters, Gauges can be of various types
-- long, double, etc.
Thanks Patricia for your response!
On the new node, I just see a lot of the following:
INFO [FlushWriter:75] 2014-08-05 09:53:04,394 Memtable.java (line 400)
Writing Memtable
INFO [CompactionExecutor:3] 2014-08-05 09:53:11,132 CompactionTask.java
(line 262) Compacted 12 sstables to
so basically
Thanks a lot.
So the ‘strategy’ change may not be seen by all nodes when the ‘update
keyspace …’ command returns, and I can use ‘describe cluster’ to check whether
the change has taken effect on all nodes, right?
From: Rahul Neelakantan [mailto:ra...@rahul.be]
Sent: August 5, 2014 18:46
To:
Yes num_tokens is set to 256. initial_token is blank on all nodes including
the new one.
On Tue, Aug 5, 2014 at 10:03 AM, Mark Reddy mark.re...@boxever.com wrote:
My understanding was that if initial_token is left empty on the new node,
it just contacts the heaviest node and bisects its token
Also not sure if this is relevant but just noticed the nodetool tpstats
output:
Pool Name      Active  Pending  Completed  Blocked  All time blocked
FlushWriter         0        0       1136        0               512
Looks like about 50% of flushes are
Yes num_tokens is set to 256. initial_token is blank on all nodes
including the new one.
Ok so you have num_tokens set to 256 for all nodes with initial_token
commented out, this means you are using vnodes and the new node will
automatically grab a list of tokens to take over responsibility
This is incorrect. Network Topology w/ Vnodes will be fine, assuming
you've got RF= # of racks. For each token, replicas are chosen based
on the strategy. Essentially, you could have a wild imbalance in
token ownership, but it wouldn't matter because the replicas would be
distributed across the
* When I say wild imbalance, I do not mean all tokens on 1 node in the
cluster, I really should have said slightly imbalanced
On Tue, Aug 5, 2014 at 8:43 AM, Jonathan Haddad j...@jonhaddad.com wrote:
This is incorrect. Network Topology w/ Vnodes will be fine, assuming
you've got RF= # of
First, thanks for your answer.
This is incorrect. Network Topology w/ Vnodes will be fine, assuming you've
got RF= # of racks.
IMHO, it's not a good enough condition.
Let's use an example with RF=2
N1/rack_1 N2/rack_1 N3/rack_1 N4/rack_2
Here, you have RF= # of racks
And due to
If your nodes are not actually evenly distributed across physical racks for
redundancy, don't use multiple racks.
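To make the imbalance in the four-node example concrete, here is a minimal sketch in Python of rack-aware replica selection. It is a simplified model under stated assumptions (one token per node, no vnodes, a single DC), not Cassandra's actual NetworkTopologyStrategy code, and the function name `nts_replicas` is hypothetical:

```python
from collections import Counter
from typing import Dict, List


def nts_replicas(ring: List[str], racks: Dict[str, str],
                 start: int, rf: int) -> List[str]:
    """Pick `rf` replicas walking clockwise from `start`.

    Nodes in racks not yet used are preferred; nodes in an already
    used rack are skipped and only used if racks run out.
    """
    order = [ring[(start + i) % len(ring)] for i in range(len(ring))]
    chosen: List[str] = []
    skipped: List[str] = []
    seen_racks = set()
    for node in order:
        if len(chosen) == rf:
            break
        if racks[node] not in seen_racks:
            chosen.append(node)
            seen_racks.add(racks[node])
        else:
            skipped.append(node)
    # Fewer racks than rf: fall back to skipped nodes in ring order.
    chosen.extend(skipped[: rf - len(chosen)])
    return chosen


# The layout from the example: three nodes in rack_1, one in rack_2, RF=2.
ring = ["N1", "N2", "N3", "N4"]
racks = {"N1": "rack_1", "N2": "rack_1", "N3": "rack_1", "N4": "rack_2"}

load = Counter()
for start in range(len(ring)):
    load.update(nts_replicas(ring, racks, start, rf=2))

print(dict(load))
```

In this model N4, the only node in rack_2, ends up holding a replica of all four ranges, while N2 and N3 each hold one.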
On Tue, Aug 5, 2014 at 10:57 AM, DE VITO Dominique
dominique.dev...@thalesgroup.com wrote:
First, thanks for your answer.
This is incorrect. Network Topology w/ Vnodes will be
Yes, if you have only 1 machine in a rack then your cluster will be
imbalanced. You're going to be able to dream up all sorts of weird
failure cases when you choose a scenario like RF=2 and a totally
imbalanced network arch.
Vnodes attempt to solve the problem of imbalanced rings by choosing so
many
So the ‘strategy’ change may not be seen by all nodes when the ‘update
keyspace …’ command returns, and I can use ‘describe cluster’ to check whether
the change has taken effect on all nodes, right?
Correct, the change may take time to propagate to all nodes. As Rahul said
you can check describe
nodetool status:
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load     Tokens  Owns (effective)  Host ID                               Rack
UN  10.10.20.27  1.89 TB  256     25.4%             76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
UN
Jonathan wrote:
Yes, if you have only 1 machine in a rack then your cluster will be
imbalanced. You're going to be able to dream up all sorts of weird failure
cases when you choose a scenario like RF=2 and a totally imbalanced network arch.
Vnodes attempt to solve the problem of imbalanced
Also Mark to your comment on my tpstats output, below is my iostat output,
and the iowait is at 4.59%, which means no IO pressure, but we are still
seeing the bad flush performance. Should we try increasing the flush
writers?
Linux 2.6.32-358.el6.x86_64 (ny4lpcas13.fusionts.corp) 08/05/2014
Hi all,
Allow me to rephrase a question I asked last week. I am performing some
queries with ALLOW FILTERING and getting consistent read timeouts like the
following:
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra
timeout during read query at consistency ONE (1 responses
On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly clint.ke...@gmail.com wrote:
Allow me to rephrase a question I asked last week. I am performing some
queries with ALLOW FILTERING and getting consistent read timeouts like the
following:
ALLOW FILTERING should be renamed PROBABLY TIMEOUT in order
Hi Ruchir,
The large number of blocked flushes and the number of pending
compactions would still indicate IO contention. Can you post the output of
'iostat -x 5 5'?
If you do in fact have spare IO, there are several configuration options
you can tune such as increasing the number of flush
On Tue, Aug 5, 2014 at 5:48 AM, Jiri Horky ho...@avast.com wrote:
What puzzles me is the fact that the authentication apparently started
to work after the network recovered but the exchange of data did not.
I would like to understand what could have caused the problems and how to
avoid them in
How much did you reduce *read_request_timeout_in_ms* on your local machine?
A read query is more likely to time out on a cluster than on a single machine,
because the Cassandra server must run the read operation on more servers (so
you have network traffic).
2014-08-05 14:54 GMT-03:00 Robert Coli
On Tue, Aug 5, 2014 at 3:52 AM, Rene Kochen rene.koc...@schange.com wrote:
Do I have to run full repairs after this change? Because the yaml file
states: IF YOU CHANGE THE SNITCH AFTER DATA IS INSERTED INTO THE CLUSTER,
YOU MUST RUN A FULL REPAIR, SINCE THE SNITCH AFFECTS WHERE REPLICAS ARE
The discussion about racks and NTS is also mentioned in this recent article:
planetcassandra.org/multi-data-center-replication-in-nosql-databases/
The last section may be of interest to you.
On 5 August 2014 at 18:14, DE VITO Dominique dominique.dev...@thalesgroup.com
wrote:
Jonathan wrote:
You need to create an index on attribute *c.*
2014-08-05 9:24 GMT-03:00 Jens Rantil jens.ran...@tink.se:
Hi,
I'm having an issue with ALLOW FILTERING with Cassandra 2.0.8. See a
minimal example here:
https://gist.github.com/JensRantil/ec43622c26acb56e5bc9
I expect the second last to
Have you looked at nodetool?
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsNodetool_r.html
2014-08-04 16:43 GMT-03:00 Kevin Burton bur...@spinn3r.com:
Is it possible to take older tables, which are immutable, and move them
from SSD to HDD?
We lower the SLA on older
On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos
vasileiosvlac...@gmail.com wrote:
The problem is that the nodetool seems to be stuck, and nodetool netstats
on node1 of DC2 appears to be stuck at 10% streaming a 5G file from node2
at DC1. This doesn't tally with nodetool netstats when
Hi Kevin,
This is something we do plan to support, but don't right now. You can see
the discussion around this and related issues here
https://issues.apache.org/jira/browse/CASSANDRA-5863 (although it may
seem unrelated at first glance).
On Mon, Aug 4, 2014 at 8:43 PM, Kevin Burton
Hi Rob,
Thanks for your feedback. I understand that use of ALLOW FILTERING is
not a best practice. In this case, however, I am building a tool on
top of Cassandra that allows users to sometimes do things that are
less than optimal. When they try to do expensive queries like this,
I'd rather
Hi Vasilis,
To add to what Rob said:
I believe you might be able to tune the phi detector threshold to help this
operation complete. Hopefully someone with direct experience of the same will
chime in.
I have been through this operation where streams break due to a node
falsely being marked
Ah FWIW I was able to reproduce the problem by reducing
range_request_timeout_in_ms. This is great since I want to increase
the timeout for batch jobs where we scan a large set of rows, but
leave the timeout for single-row queries alone.
Best regards,
Clint
On Tue, Aug 5, 2014 at 11:42 AM,
Right now, we have 6 flush writers and compaction_throughput_mb_per_sec is
set to 0, which I believe disables throttling.
Also, here is the iostat -x 5 5 output:
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 10.00
Also, right now the top command shows that we are at 500-700% CPU, and we
have 23 total processors, which means we have a lot of idle CPU left over,
so throwing more threads at compaction and flush should alleviate the
problem?
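The headroom implied by those numbers can be worked out directly. A small sketch in Python, assuming top is reporting per-process CPU in its usual mode where 100% equals one fully busy core; the function name is hypothetical:

```python
def cpu_utilization(top_percent: float, cores: int) -> float:
    """Fraction of total CPU capacity in use.

    `top_percent` is top's %CPU for the process, where 100 means one
    fully busy core (assumed reporting mode); `cores` is the core count.
    """
    return top_percent / (100.0 * cores)


# The range quoted above: 500-700% on a machine with 23 processors.
low = cpu_utilization(500, 23)
high = cpu_utilization(700, 23)
print(f"{low:.0%} to {high:.0%} of total CPU in use")
```

Under that assumption the process is using roughly 22% to 30% of the machine's total CPU capacity.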
On Tue, Aug 5, 2014 at 2:57 PM, Ruchir Jha ruchir@gmail.com
OK, ticket 7696 [1] created.
Jiri Horky
https://issues.apache.org/jira/browse/CASSANDRA-7696
On 08/05/2014 07:57 PM, Robert Coli wrote:
On Tue, Aug 5, 2014 at 5:48 AM, Jiri Horky ho...@avast.com wrote:
What puzzles me is the fact that the authentication
On Tue, Aug 5, 2014 at 11:53 AM, Clint Kelly clint.ke...@gmail.com wrote:
Ah FWIW I was able to reproduce the problem by reducing
range_request_timeout_in_ms. This is great since I want to increase
the timeout for batch jobs where we scan a large set of rows, but
leave the timeout for
As long as you correctly configure the new snitch so that the replica sets
do not change, no, you do not need to repair.
Is the following correct:
The replica sets do not change if you modify the snitch from SimpleSnitch
to NetworkTopologyStrategy and the topology file puts all nodes in the same
On Tue, Aug 5, 2014 at 2:27 PM, Rene Kochen rene.koc...@schange.com wrote:
As long as you correctly configure the new snitch so that the replica
sets do not change, no, you do not need to repair.
Is the following correct:
The replica sets do not change if you modify the snitch from
I think the rack placement of these 12 nodes will become important. As the
12 nodes are placed with SimpleSnitch, which is not rack aware, it would be
good to initially retain them in a single rack in the property file snitch
as well. Node repair is a safe option. If you need to change the rack
placement, my
Hi everyone,
For some integration tests, we start up a CassandraDaemon in a
separate process (using the Java 7 ProcessBuilder API). All of my
integration tests run beautifully on my laptop, but one of them fails
on our Jenkins cluster.
The failing integration test does around 10k writes to
If there is an OOM, it will be in the logs.
On Aug 5, 2014 8:17 PM, Clint Kelly clint.ke...@gmail.com wrote:
Hi everyone,
For some integration tests, we start up a CassandraDaemon in a
separate process (using the Java 7 ProcessBuilder API). All of my
integration tests run beautifully on my
Hi Kevin,
Thanks for your reply. That is what I assumed, but some of the posts
I read on Stack Overflow (e.g., the one that I referenced in my mail)
suggested otherwise. I was just curious if others had experienced OOM
problems that weren't logged or if there were other common culprits.
Best