Vincent: currently, big partitions will give you performance problems over
time, even if you're using paging and slicing by clustering keys. Please read
the JIRAs that Alex linked to; they provide in-depth explanations of why,
from some of the best Cassandra operators in the world :)
On Fri, Oct
Well I only asked that because I wanted to make sure that we're not
doing it wrong, because that's actually how we query stuff, we always
provide a cluster key or a range of cluster keys.
But yes, I understand that compactions may suffer and/or there may be
hidden bottlenecks because of big parti
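For reference, the access pattern being described (always restricting by a clustering key or a range of clustering keys) can be sketched in CQL. The schema and names below are hypothetical, just to illustrate the shape of a slice read:

```sql
-- Hypothetical time-series table: one partition per (sensor_id, day),
-- rows inside the partition ordered by the "ts" clustering column.
CREATE TABLE events (
    sensor_id text,
    day       text,
    ts        timestamp,
    value     double,
    PRIMARY KEY ((sensor_id, day), ts)
);

-- Slice read: the clustering column is restricted, so only part of the
-- partition is returned to the client -- but the partition itself can
-- still grow unbounded on disk, which is what hurts compaction, repair
-- and streaming regardless of how narrowly you read.
SELECT ts, value FROM events
WHERE sensor_id = 's1' AND day = '2016-10-28'
  AND ts >= '2016-10-28 00:00:00' AND ts < '2016-10-28 06:00:00';
```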
On Fri, Oct 28, 2016 at 11:21 AM, Vincent Rischmann
wrote:
Doesn't paging help with this ? Also if we select a range via the
cluster key we're never really selecting the full partition. Or is
that wrong ?
On Fri, Oct 28, 2016, at 05:00 PM, Edward Capriolo wrote:
Big partitions are an anti-pattern; here is why:
First, Cassandra is not an analytic datastore. Sure, it has some UDFs and
aggregate UDFs, but the true purpose of the data store is to satisfy point
reads. Operations have strict timeouts:
# How long the coordinator should wait for read operations to
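The cassandra.yaml fragment being quoted is cut off above; for context, the 2.1-era read-timeout settings look roughly like this (these are the stock defaults of that era, shown for illustration, not a recommendation):

```yaml
# How long the coordinator should wait for read operations to complete
read_request_timeout_in_ms: 5000
# How long the coordinator should wait for seq or index scans to complete
range_request_timeout_in_ms: 10000
```

The point being made is that a read touching a huge partition has to fit inside these per-request timeouts, which point-read-sized partitions do easily and multi-GB partitions may not.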
Hi Eric,
that would be https://issues.apache.org/jira/browse/CASSANDRA-9754 by
Michael Kjellman and https://issues.apache.org/jira/browse/CASSANDRA-11206 by
Robert Stupp.
If you haven't seen it yet, Robert's summit talk on big partitions is
totally worth it:
Video: https://www.youtube.com/watch?
On Thu, Oct 27, 2016 at 4:13 PM, Alexander Dejanovski
wrote:
> A few patches are pushing the limits of partition sizes so we may soon be
> more comfortable with big partitions.
You don't happen to have Jira links to these handy, do you?
--
Eric Evans
john.eric.ev...@gmail.com
from large partitions (it can also crash your server in some cases, so TEST IT
IN A LAB FIRST).
- Jeff
From: Alexander Dejanovski
Reply-To: "user@cassandra.apache.org"
Date: Thursday, October 27, 2016 at 2:13 PM
To: "user@cassandra.apache.org"
Subject: Re: Too
The "official" recommendation would be 100MB, but it's hard to give a
precise answer.
Keeping it under a GB seems like a good target.
A few patches are pushing the limits of partition sizes so we may soon be
more comfortable with big partitions.
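To check where a table stands against that ~100MB guideline, `nodetool cfstats` reports the largest compacted partition per table. A minimal sketch, parsing a captured sample of that output (the byte count below is a hypothetical number, not from this thread):

```shell
# "nodetool cfstats <keyspace>.<table>" prints a line like the sample
# below; here we parse a captured copy of it and compare against 100 MB.
sample='Compacted partition maximum bytes: 3545337633'
max_bytes=$(echo "$sample" | awk -F': ' '{print $2}')
threshold=$((100 * 1024 * 1024))   # 100 MB guideline
if [ "$max_bytes" -gt "$threshold" ]; then
  echo "over guideline: ${max_bytes} bytes"   # prints "over guideline: 3545337633 bytes"
fi
```

On a live cluster you would pipe the real `nodetool cfstats` output through the same grep/awk instead of the captured sample.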
Cheers
On Thu, Oct 27, 2016 at 21:28, Vincent Rischm
Yeah, that particular table is badly designed. I intend to fix it when
the roadmap allows us to do it :)
What is the recommended maximum partition size ?
Thanks for all the information.
On Thu, Oct 27, 2016, at 08:14 PM, Alexander Dejanovski wrote:
3.3GB is already too high, and it surely isn't good for compaction
performance. Still, I know changing a data model is no easy thing to do, but
you should try to do something here.
Anticompaction is a special type of compaction and if an sstable is being
anticompacted, then any attempt to
Ok, I think we'll give incremental repairs a try on a limited number of
CFs first and then if it goes well we'll progressively switch more CFs
to incremental.
I'm not sure I understand the problem with anticompaction and
validation running concurrently. As far as I can tell, right now when a
CF is
Oh right, that's what they advise :)
I'd say that you should skip the full repair phase in the migration
procedure, as that will obviously fail, and just mark all sstables as
repaired (i.e., skip steps 1, 2 and 6).
Anyway you can't do better, so take a leap of faith there.
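Marking all sstables as repaired, as suggested above, is done offline with the `sstablerepairedset` tool that ships in Cassandra 2.1's tools/bin. A hedged sketch with a hypothetical sstable path (the node should be stopped, or flushed and quiesced, before mutating sstable metadata):

```shell
# Hypothetical sstable path -- adjust to your data directory layout.
sstable=/var/lib/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-ka-1-Data.db
cmd="sstablerepairedset --really-set --is-repaired $sstable"
echo "$cmd"   # remove the echo to actually run it (with the node stopped!)
```

In practice you would loop this over every Data.db file of the tables you are migrating to incremental repair.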
Intensity is already very low and 100
Thanks for the response.
We do break up repairs between tables, we also tried our best to have no
overlap between repair runs. Each repair has 1 segments (purely
arbitrary number, seemed to help at the time). Some runs have an
intensity of 0.4, some have as low as 0.05.
Still, sometimes one p
Hi Vincent,
most people handle repair with:
- pain (by hand, running nodetool commands)
- cassandra_range_repair:
https://github.com/BrianGallew/cassandra_range_repair
- Spotify Reaper
- and the OpsCenter repair service, for DSE users
Reaper is a good option I think and you should stick to it. If it
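For the "by hand" option, a minimal sketch (keyspace and table names are hypothetical; `-pr` limits each run to the node's primary ranges, which is essentially what the wrapper tools schedule for you across the ring):

```shell
# Build and echo the per-table repair commands first; drop the echo to
# actually run them, one node at a time.
ks=my_keyspace
for table in events metrics; do
  echo "nodetool repair -pr $ks $table"
done
```

Breaking the work up per table (and per node, via -pr) keeps each repair session small, which is the same idea the tools above automate with scheduling and retries.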
Hi,
we have two Cassandra 2.1.15 clusters at work and are having some
trouble with repairs.
Each cluster has 9 nodes, and the amount of data is not gigantic, but
some column families have 300+ GB of data.
We tried to use `nodetool repair` for these tables but at the time we
tested it, it made the w