We run a fairly small production Cassandra 2.2.4 cluster with 5 nodes on
Rackspace VMs (4 cores, 4GB RAM, SSD backed), and whilst these nodes are on the
small side, day to day the cluster has kept up with our workload fine.
We currently use SizeTieredCompactionStrategy and want to move to the
Thanks for responding!
My natural partition key is a customer id. Our customers have widely
varying amounts of data. Since the vast majority of them have data that's
small enough to fit in a single partition, I'd like to avoid imposing
unnecessary overhead on the 99% just to avoid issues with the
>
>
> In this case, 99% of my data could fit in a single 50 MB partition. But if
> I use the standard approach, I have to split my partitions into 50 pieces
> to accommodate the largest data. That means that to query the 700 rows for
> my median case, I have to read 50 partitions instead of one.
>
Hello,
thanks to everyone for the fast replies!
Unfortunately, since yesterday afternoon I have been assigned to a more
urgent task, so I will implement the solutions you proposed in my spare
time and I will let you know the outcomes asap (hopefully in a few weeks).
Thanks a lot again!
Best
Bear in mind that you won't be able to merely "tune" your schema - you will
need to completely redesign your data model. Step one is to look at all of
the queries you need to perform and get a handle on what flat, denormalized
data model they will need to execute performantly in a NoSQL database.
Jim, I don't quite get why you think you would need to query 50 partitions
to return merely hundreds or thousands of rows. Please elaborate. I mean,
sure, for that extreme 100th percentile, yes, you would query a lot of
partitions, but for the 90th percentile it would be just one. Even the 99th
Hi,
As per my understanding, a Cassandra version n is implicitly declared EOL when
two major versions have been released after it, i.e. when version n + 2 is
released.
I think the EOL policy must be revisited in the interest of the expanding Cassandra
user base.
Concerns with current EOL
As to why I think it's cluster-wide, here's what the documentation says:
https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html
compaction_throughput_mb_per_sec
Hi Jack,
Thanks for your response. My answers inline...
On Tue, Jan 5, 2016 at 11:52 AM, Jack Krupansky
wrote:
> Jim, I don't quite get why you think you would need to query 50 partitions
> to return merely hundreds or thousands of rows. Please elaborate. I mean,
>
Hi Nate,
Yes, I've been thinking about treating customers as either small or big,
where "small" ones have a single partition and big ones have 50 (or
whatever number I need to keep sizes reasonable). There's still the problem
of how to handle a small customer who becomes too big, but that will
On Tue, Jan 5, 2016 at 6:50 AM, Ken Hancock wrote:
> As to why I think it's cluster-wide, here's what the documentation says:
>
Do you see "system" used in place of "cluster" anywhere else in the docs?
I think you are correct that the docs should standardize on
Will do. I searched the doc for additional usage of the term "system"
commitlog_segment_size_in_mb refers to "every table in the system"
concurrent_writes talks about CPU cores "in your system"
That's it for "system" other than the compaction_throughput_mb_per_sec
which refers to "across the
Hi,
I’ve been benchmarking Cassandra to get an idea of how the performance scales with more data on a single machine. I just wanted to get some feedback on whether these are the numbers I should expect.
The benchmarks are quite simple: I measure the latency and throughput for two kinds of
I understand, Ravi, we have our application layers well defined. The major
changes will be in database access layers and entities will be changed.
Schema will be modified to tune the efficiency of the data store chosen.
We have been using mongo as a cache for a long time now, but as it's a
Security is a very wide concept. What exactly do you want to achieve?
From: Ajay Garg [mailto:ajaygargn...@gmail.com]
Sent: Wednesday, January 06, 2016 11:27 AM
To: user@cassandra.apache.org
Subject: Basic query in setting up secure inter-dc cluster
Hi All.
We have a 2*2 cluster deployed, but
Hi, when I try to connect to a Cassandra 3.0 cluster in OpsCenter, I get
an error in the OpsCenter log; see below:
''Control connection failed to connect, shutting down Cluster: ('Unable to
connect to any servers', {u'54.187.25.239': ProtocolError("Unexpected
response during Connection setup:
Hi All.
We have a 2*2 cluster deployed, but no security as of now.
As a first stage, we wish to implement inter-dc security.
Is it possible to enable security one machine at a time?
For example, let's say the machines are DC1M1, DC1M2, DC2M1, DC2M2.
If I make the changes JUST IN DC2M2 and
You could keep a "num_buckets" value associated with the client's account,
which can be adjusted accordingly as usage increases.
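
A minimal sketch of that idea, assuming the partition key becomes
(customer_id, bucket) and the bucket is derived from the object UUID used as
the clustering key elsewhere in this thread. The names bucket_for and
partition_keys are hypothetical, and CRC32 is just one stable hash picked for
illustration:

```python
import uuid
import zlib
from typing import List, Tuple


def bucket_for(object_id: uuid.UUID, num_buckets: int) -> int:
    """Map an object UUID to a bucket number in [0, num_buckets)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    # crc32 gives the same value in every process, unlike Python's hash().
    return zlib.crc32(object_id.bytes) % num_buckets


def partition_keys(customer_id: str, num_buckets: int) -> List[Tuple[str, int]]:
    """List every (customer_id, bucket) partition to read for one customer."""
    return [(customer_id, bucket) for bucket in range(num_buckets)]


# A small customer (num_buckets = 1) reads one partition; a big one reads 50.
small = partition_keys("customer-a", 1)
big = partition_keys("customer-b", 50)
```

One caveat with this sketch: rows written under an old num_buckets stay in
the bucket they were hashed to, so raising the value for a growing customer
means either rewriting that customer's rows or reading with both the old and
new values during the transition.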
On Tue, Jan 5, 2016 at 2:17 PM Jim Ancona wrote:
> On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin <
> clintlmar...@coolfiretechnologies.com>
On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin <
clintlmar...@coolfiretechnologies.com> wrote:
> What sort of data is your clustering key composed of? That might help some
> in determining a way to achieve what you're looking for.
>
Just a UUID that acts as an object identifier.
>
> Clint
> On Jan
I forwarded a comment to the docs team.
It appears that they picked the language up from the cassandra.yaml file
itself. Looking at use of system in that file, it seems that it usually
means the node, the box running the node.
-- Jack Krupansky
On Tue, Jan 5, 2016 at 9:50 AM, Ken Hancock
We run a small Cassandra 2.2.0 cluster, with 5 nodes, on bare-metal servers
and we are going to replace those nodes with other nodes. I planned to add
all the new nodes first, one-by-one, and later remove the old ones,
one-by-one.
However, the first new node gets stuck when joining the cluster. I
Hi All,
I'm planning to shift from a SQL database to a columnar NoSQL database; we
have narrowed our choices to Cassandra and HBase. I would really
appreciate it if someone with decent experience of both could give me an honest
comparison on the parameters below (links to neutral benchmarks/blogs also
I did something like this in Perl. What you want to know is whether the
server responds to CQL; once it does, it is ready to use.
The Bash equivalent of what I did would be to use:
if cqlsh < /dev/null; then
    ...
fi
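
The same check (run cqlsh with empty input and look at the exit status) could
be wrapped in a retry loop. This is a Python sketch rather than the original
Perl; the function names, attempt count, delay, and 30-second timeout are all
assumptions made for illustration:

```python
import subprocess
import time
from typing import Callable


def cqlsh_responds() -> bool:
    """Return True if `cqlsh` exits with status 0 when given empty input."""
    try:
        result = subprocess.run(
            ["cqlsh"],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,  # assumed upper bound for a single attempt
        )
    except (OSError, subprocess.TimeoutExpired):
        # cqlsh is not installed, could not be started, or hung.
        return False
    return result.returncode == 0


def wait_until_ready(
    check: Callable[[], bool] = cqlsh_responds,
    attempts: int = 30,
    delay_seconds: float = 2.0,
) -> bool:
    """Call `check` until it succeeds or `attempts` tries have been made."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling wait_until_ready() with no arguments polls the local node; passing a
different check function keeps the loop usable for other readiness probes.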
Stephen
On 4 January 2016 at 15:56, Giovanni Usai
wrote:
> Hello
Please ignore.
On 5 January 2016 at 11:48, Herbert Fischer
wrote:
> We run a small Cassandra 2.2.0 cluster, with 5 nodes, on barebone servers
> and we are going to replace those nodes with other nodes. I planned to add
> all the new nodes first, one-by-one, and
We run a small Cassandra 2.2.0 cluster, with 5 nodes, on barebone servers
and we are going to replace those nodes with other nodes. I planned to add
all the new nodes first, one-by-one, and later remove the old ones,
one-by-one.
However, the first new node gets stuck when joining the cluster. I
Thanks Jack for the detailed advice.
Yes, it is a Java application.
We already have a denormalized view of our data in place, which we use for
storing it in MongoDB as a cache; however, we will get our hands dirty before
implementation. We would like to have a single DB view. And replace MongoDB
&
Sorry to nitpick, but Cassandra is not a columnar database. If you're
looking for columnar because you have an analytics need, Cassandra is not
what you want. If you've just made the same mistake that 99% of people
make, well, now you know. Cassandra historically has been referred to as a
DataStax has documented quite a few customers/case studies:
http://www.datastax.com/resources/casestudies
Materialized Views should be considered if you can go straight to 3.0, but
you can always build the same synthesized views yourself in your app, which is
the current standard best practice anyway.
Thanks for pointing out the typo Jonathan. Our use case is of Column
Family. :)
On Wed, Jan 6, 2016 at 2:38 AM, Jonathan Haddad wrote:
> Sorry to nitpick, but Cassandra is not a columnar database. If you're
> looking for columnar because you have an analytics need,
On Tue, Jan 5, 2016 at 3:01 AM, Herbert Fischer <
herbert.fisc...@crossengage.io> wrote:
> We run a small Cassandra 2.2.0 cluster, with 5 nodes, on bare-metal
> servers and we are going to replace those nodes with other nodes. I planned
> to add all the new nodes first, one-by-one, and later
What sort of data is your clustering key composed of? That might help some
in determining a way to achieve what you're looking for.
Clint
On Jan 5, 2016 2:28 PM, "Jim Ancona" wrote:
> Hi Nate,
>
> Yes, I've been thinking about treating customers as either small or big,
>
You are moving from a SQL database to C*? I hope you are aware of the
differences between a NoSQL store like C* and an RDBMS. To keep it short, the app
has to change significantly.
Please read the documentation on the differences between NoSQL and RDBMS.
thanks.
On Tue, Jan 5, 2016 at 6:20 AM, Bhuvan Rawal