No, you can't do it in one step. Streaming between versions isn't supported.
On Thu, May 18, 2017 at 8:26 AM daemeon reiydelle
wrote:
> Yes, or decommission the old one and build anew after the new one is operational
>
> “All men dream, but not equally. Those who dream by night in the dusty
> recesses o
50% free is unnecessary. The only reason to keep that much free is if you
wanted to regularly run major compactions, which you shouldn't.
I'd aim for 75%. Bootstrap new nodes in when you get close to that
number. Ensure you don't have any sstables larger than your available
space and you'll be
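A quick way to sanity-check that: list the largest SSTable data files and compare them against free space. This is a sketch; `DATA_DIR` and the `*-Data.db` naming convention are assumptions to adjust for your layout.

```shell
# Sketch: show the five largest SSTable data files, then free space.
# DATA_DIR is an assumption; point it at your actual data directory.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
find "$DATA_DIR" -name '*-Data.db' -printf '%s %p\n' 2>/dev/null | sort -rn | head -5
df -h "$DATA_DIR" 2>/dev/null || true
```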
How many CPUs are you using for interrupts?
http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux
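For reference, a minimal way to inspect interrupt distribution on Linux. The IRQ number below is a placeholder taken from `/proc/interrupts`, and the affinity write is deliberately commented out.

```shell
# Show per-CPU interrupt counts for network IRQs; interface names vary
# by NIC driver, so the pattern here is a guess.
grep -E 'eth|ens|enp|mlx' /proc/interrupts || true
# Inspect the CPU affinity mask of one IRQ ("24" is a placeholder):
cat /proc/irq/24/smp_affinity 2>/dev/null || true
# echo 2 | sudo tee /proc/irq/24/smp_affinity   # pin IRQ 24 to CPU 1
```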
Have you tried making a flame graph to see where Cassandra is spending its
time? http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html
Are you tracking GC pauses
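If not, GC pause logging is cheap to enable. On the Java 8 JVMs of this era, flags along these lines (in jvm.options or cassandra-env.sh; the log path is an assumption) record every stop-the-world pause:

```
-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
```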
> Oh, so all the data is lost if the instance is shutdown or restarted (for
that instance)?
When you restart the OS, you're technically not shutting down the
instance. As long as the instance isn't stopped / terminated, your data is
fine. I ran my databases on ephemeral storage for years without
ount of IOPS (gp2)
> out of EBS at a reasonable rate, it gets more expensive than an I3.
>
>
>
> *From: *Jonathan Haddad
>
>
> *Date: *Tuesday, May 23, 2017 at 9:42 AM
> *To: *"Gopal, Dhruva" , Matija Gobec <
> matija0...@gmail.com>, Bhuvan Rawal
>
>
Why do you think keeping your data in the memtable is a what you need to do?
On Thu, May 25, 2017 at 7:16 AM Avi Kivity wrote:
> Then it doesn't have to (it still may, for other reasons).
>
> On 05/25/2017 05:11 PM, preetika tyagi wrote:
>
> What if the commit log is disabled?
>
> On May 25, 2017
8:06 AM Avi Kivity wrote:
> Not sure whether you're asking me or the original poster, but the more
> times data gets overwritten in a memtable, the less it has to be compacted
> later on (and even without overwrites, larger memtables result in less
> compaction).
>
> On 05/25/
g that.
> Given that we're using unlogged/same partition batches is it safe to raise
> the batch size warning limit? Actually cqlsh COPY FROM has very good
> throughput using a small batch size, but I can't get that same throughput
> in cassandra-stress or my C++ app wit
If you have a small amount of hot data, enable the row cache. The memtable
is not designed to be a cache. You will not see a massive performance
impact of writing one to disk. Sstables will be in your page cache, meaning
you won't be hitting disk very often.
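Enabling the row cache is per-table. A sketch, with a hypothetical keyspace/table; note that `row_cache_size_in_mb` in cassandra.yaml must also be set above zero:

```sql
-- Cache all keys plus the first 100 rows of each partition.
ALTER TABLE myks.hot_data
  WITH caching = {'keys': 'ALL', 'rows_per_partition': '100'};
```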
On Fri, May 26, 2017 at 7:41 AM Max C w
is your pattern.
On Fri, May 26, 2017 at 2:15 PM Jan Algermissen
wrote:
> Jonathan,
>
> On 26 May 2017, at 17:00, Jonathan Haddad wrote:
>
> > If you have a small amount of hot data, enable the row cache. The
> > memtable
> > is not designed to be a cache. You wil
This isn't an HDFS mailing list.
On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle
wrote:
> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
> node. Depends somewhat on whether there is a mix of more and less
> frequently accessed data. But even storing only hot data, never
he cluster as you can see
> over the weekend there was a massive latency spike, and it was fixed by a
> restart of all the nodes.
>
> On May 30 2017, at 2:18 pm, Jonathan Haddad wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue, May 30, 2017 at 2:14 PM dae
the
> dreamers of the day are dangerous men, for they may act their dreams with
> open eyes, to make it possible.” — T.E. Lawrence*
>
>
> On Tue, May 30, 2017 at 2:18 PM, Jonathan Haddad
> wrote:
>
>> This isn't an HDFS mailing list.
>>
>> On Tue,
I really wouldn't go by the tick tock blog post, considering tick tock is
dead.
I'm still not wild about putting any 3.0 or 3.x into production. 3.0
removed off heap memtables and there have been enough bugs in the storage
engine that I'm still wary. My hope is to see 3.11.x get enough bug fixes
Unfortunately this feature falls into a category of *incredibly useful*
features that have gotten a -1 over the years because they don't scale
the way we want them to. As far as basic aggregations go, it's remarkably
trivial to roll up 100K-1MM items using very little memory, so at first it
seems like
I can't recommend *anyone* use incremental repair, as there are some pretty
horrible bugs in it that can cause Merkle trees to wildly mismatch & result
in massive overstreaming. Check out
https://issues.apache.org/jira/browse/CASSANDRA-9143.
TL;DR: Do not use incremental repair before 4.0.
On Tue,
It would be a little weird to change the definition of QUORUM, which means
majority, to mean something other than majority for a single use case.
Sounds like you want to introduce a new CL, HALF.
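The distinction matters because a majority is what guarantees read/write overlap. A small sketch of the arithmetic; `half` here models the hypothetical HALF level being proposed, not a real Cassandra consistency level:

```python
def quorum(rf: int) -> int:
    # QUORUM is a strict majority of replicas: floor(RF / 2) + 1.
    return rf // 2 + 1

def half(rf: int) -> int:
    # The proposed HALF: ceil(RF / 2) replicas. For even RF this is not
    # a majority, so two HALF operations may touch disjoint replica sets.
    return (rf + 1) // 2

for rf in (3, 4, 5, 6):
    # quorum + quorum always exceeds RF, so reads see the latest write.
    print(rf, quorum(rf), half(rf), quorum(rf) * 2 > rf)
```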
On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu wrote:
> Justin, what I suggest is that for QUORUM consisten
Nobody is promoting ES as a primary datastore in this thread. Every
mention of it is to accompany C*.
On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan wrote:
> For all those promoting ES as a PRIMARY datastore, please read this before:
>
> https://discuss.elastic.co/t/elasticsearch-as-a-primary-d
meant as Jonathan said to use C* for primary key and as a
>> primary storage and ES as an indexed version of what you have in cassandra.
>>
>> 2017-06-12 19:19 GMT+02:00 DuyHai Doan :
>>
>>> Sorry, I misread some reply I had the impression that people recommend
>
ng. But for some reason the interrupts are all being handled on CPU 0
>>>> anyway.
>>>>
>>>> I see this in /var/log/dmesg on the machines:
>>>>
>>>>>
>>>>> Your BIOS has requested that x2apic be disabled.
>>>>&
Hey folks!
I'm proud to announce the 0.6.1 release of the Reaper project, the open
source repair management tool for Apache Cassandra.
This release improves the Cassandra backend significantly, making it a
first class citizen for storing repair schedules and managing repair
progress. It's no lon
The driver grabs all the cluster information from the nodes you provide
and connects automatically to the rest. You don't need (and
shouldn't use) a load balancer.
Jon
On Mon, Jun 19, 2017 at 12:28 PM Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:
> Just out of c
It sounds like you're suggesting adding new nodes in to replace existing
ones. You can't do that because it requires streaming between versions,
which isn't supported.
You need to take a node down, upgrade the C* version, then start it back
up.
Jon
On Mon, Jun 26, 2017 at 3:56 PM Nitan Kainth
Oops, I read that wrong, sorry. You want to upgrade the OS. Disregard my
email.
On Mon, Jun 26, 2017 at 4:04 PM Jonathan Haddad wrote:
> It sounds like you're suggesting adding new nodes in to replace existing
> ones. You can't do that because it requires streaming between v
While someone here might know the answer, I think you'll want to ask on the
DC/OS mailing list for best chance of getting a response. The Apache
Cassandra Project doesn't maintain the code you're asking about.
The DC/OS mailing list here:
https://groups.google.com/a/dcos.io/forum/#!forum/users
G
Query it at consistency ALL and let read repair do its thing.
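In cqlsh that looks like the following; the table and key are placeholders:

```
cqlsh> CONSISTENCY ALL;
cqlsh> SELECT * FROM myks.mytable WHERE pk = 'some-key';
```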
On Tue, Jun 27, 2017 at 11:48 AM Pranay akula
wrote:
> I have a CF with composite partition key, partition key consists of blob
> and text data types.
> The select query against this particular partition is timing out so to
> debug it
You should check if the same error exists in 3.11. If so, open up a Jira.
On Wed, Jul 5, 2017 at 10:38 AM Łukasz Biedrycki
wrote:
> Hey,
>
> I am using Cassandra 3.9.
>
> Recently I experienced a problem that prevents me to restart cassandra. I
> narrowed it down to SASI Index.
>
> Steps to repro
Cassandra uses the writetime to resolve the conflict. The highest timestamp
wins. There's no guarantee on the order the mutations arrive in.
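Last-write-wins can be sketched in a few lines; this is a simplification (real Cassandra also breaks exact timestamp ties by comparing cell values), but it shows that arrival order is irrelevant:

```python
def resolve(cells):
    """cells: iterable of (writetime_micros, value) for one column.
    The cell with the highest writetime wins, regardless of the order
    in which the mutations arrived."""
    return max(cells, key=lambda c: c[0])[1]

# Arrival order doesn't matter; only the timestamps do.
print(resolve([(1000, "old"), (3000, "new"), (2000, "mid")]))  # new
```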
On Thu, Jul 6, 2017 at 7:14 AM suraj pasuparthy
wrote:
> thanks Pranay,
> But is the order maintained across tables?
> As in the client in DC1 first writes rec
Awesome utility Avi! Thanks for sharing.
On Tue, Jul 11, 2017 at 10:57 AM Avi Kivity wrote:
> There is now a readme with some examples and a build file.
>
> On 07/11/2017 11:53 AM, Avi Kivity wrote:
>
> Yeah, posting a github link carries an implied undertaking to write a
> README file and make i
This looks like expected behavior to me. You aren't inserting a value for
b. Since there's no value, there's also no writetime.
On Tue, Jul 18, 2017 at 12:15 PM Nitan Kainth wrote:
> Hi,
>
> We see that null columns have writetime(column) populated for few columns
> and shows null for few othe
2017 at 12:24 PM Nitan Kainth wrote:
> Jonathan,
>
> Please notice last rows with partition key values (w,v and t). they were
> inserted same way and has write time values
>
> On Jul 18, 2017, at 2:22 PM, Jonathan Haddad wrote:
>
> This looks like expected behavior to me. Y
Using a different table to answer each query is the correct answer here
assuming there's a significant amount of data.
If you don't have that much data, maybe you should consider using a
database like Postgres which gives you query flexibility instead of
horizontal scalability.
On Sun, Jul 23, 201
The TTL is applied to the cells on insert. Changing it doesn't change the
TTL on data that was inserted previously.
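A sketch of the distinction, using a hypothetical table named after the one in the question:

```sql
-- The TTL is bound to each cell when it is written:
INSERT INTO myks.number_item (id, val) VALUES (1, 42) USING TTL 3600;
-- Changing the table default later only affects writes made afterwards,
-- and only those that don't specify their own TTL:
ALTER TABLE myks.number_item WITH default_time_to_live = 7200;
```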
On Sun, Oct 1, 2017 at 6:23 AM Gábor Auth wrote:
> Hi,
>
> The `alter table number_item with gc_grace_seconds = 3600;` is sets the
> grace seconds of tombstones of the future modif
Anthony’s suggestion of using replace_address_first_boot lets you avoid that
requirement, and it’s specifically why it was added in 2.2.
On Tue, Nov 14, 2017 at 1:02 AM Anshu Vajpayee
wrote:
> Thanks guys ,
>
> I thikn better to pass replace_address on command line rather than update
> the cassnd
It should work with DSE, but we don’t explicitly test it.
Mind testing it and posting your results? If you could include the DSE
version it would be great.
On Thu, Nov 16, 2017 at 11:57 PM Anshu Vajpayee
wrote:
> Thanks John for your efforts and nicley putting it on website & youtube .
>
> Just
I wouldn’t recommend using incremental repair at all at this time due to
some bugs that can cause massive overstreaming.
Our advice at TLP is to do subrange repair, and we maintain Reaper to help
with that: http://cassandra-reaper.io
Jon
On Wed, Nov 22, 2017 at 2:18 AM Akshit Jain wrote:
> Is t
Have you read through the docs for stress? You can have it use your own
queries and data model.
http://cassandra.apache.org/doc/latest/tools/cassandra_stress.html
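As a sketch, a "user" profile lets stress exercise your own schema. The schema and distributions here are hypothetical; see the docs above for the full profile format:

```yaml
# stress.yaml -- hypothetical profile for cassandra-stress user mode
keyspace: stresstest
table: events
table_definition: |
  CREATE TABLE events (
    id uuid,
    ts timestamp,
    payload text,
    PRIMARY KEY (id, ts)
  )
columnspec:
  - name: payload
    size: gaussian(50..500)
insert:
  partitions: fixed(1)
queries:
  read_partition:
    cql: SELECT * FROM events WHERE id = ? LIMIT 10
```

Then run it with something like `cassandra-stress user profile=stress.yaml ops(insert=3,read_partition=1)`.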
On Sun, Nov 26, 2017 at 1:02 AM Akshit Jain wrote:
> Hi,
> What is the best way to stress test the cassandra cluster with real life
>
Definitely upgrade to 3.11.1.
On Sun, Dec 17, 2017 at 8:54 PM Pradeep Chhetri
wrote:
> Hello Kurt,
>
> I realized it was because of RAM shortage which caused the issue. I bumped
> up the memory of the machine and node bootstrap started but this time i hit
> this bug of cassandra 3.9:
>
> https://
Changing the default TTL doesn’t change the TTL on the existing data, only
new data. It’s only set if you don’t supply one yourself.
On Wed, Jan 31, 2018 at 11:35 PM Bo Finnerup Madsen
wrote:
> Hi,
>
> We are running a small 9 node Cassandra v2.1.17 cluster. The cluster
> generally runs fine, but
I would also optimize for your worst case, which is hitting zero caches.
If you're using the default settings when creating a table, you're going to
get compression settings that are terrible for reads. If you've got memory
to spare, I suggest changing your chunk_length_in_kb to 4 and disabling
re
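The chunk size change looks like this (keyspace/table hypothetical). The trade-off is that smaller chunks mean more compression offset metadata held in memory, which is why spare memory is the prerequisite:

```sql
ALTER TABLE myks.mytable
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 4};
```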
That might be fine for a one off but is totally impractical at scale or
when using TWCS.
On Fri, Feb 9, 2018 at 8:39 AM DuyHai Doan wrote:
> Or use the new user-defined compaction option recently introduced,
> provided you can determine over which SSTables a partition is spread
>
> On Fri, Feb 9,
If you want consistent reads you have to use the CL that enforces it.
There’s no way around it.
On Fri, Feb 9, 2018 at 2:35 PM Mahdi Ben Hamida wrote:
> In this case, we only write using CAS (code guarantees that). We also
> never update, just insert if not exist. Once a hash exists, it never
> c
The easiest way to do this is replacing one node at a time by using rsync.
I don't know why it has to be more complicated than copying data to a new
machine and replacing it in the cluster. Bringing up a new DC with
snapshots is going to be a nightmare in comparison.
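A rough runbook for one node. Paths and host are placeholders, and this is a sketch rather than a tested procedure; the second pass after `nodetool drain` only transfers the delta, keeping the window short:

```shell
# NEW_HOST is the replacement machine; nothing runs until you set it.
if [ -n "${NEW_HOST:-}" ]; then
  # First pass while the old node is still serving traffic:
  rsync -aH /var/lib/cassandra/data/ "root@${NEW_HOST}:/var/lib/cassandra/data/"
  # Flush memtables and stop accepting writes, then copy only the delta:
  nodetool drain
  rsync -aH /var/lib/cassandra/data/ "root@${NEW_HOST}:/var/lib/cassandra/data/"
fi
```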
On Wed, Feb 21, 2018 at 8:16
If it's a new cluster, there's no need to disable auto_bootstrap. That
setting prevents the first node in the second DC from being a replica for
all the data in the first DC. If there's no data in the first DC, you can
skip a couple steps and just leave it on.
Leave it on, and enjoy your afterno
>
>
> Thank you for the answer. Do you know where to look to understand why this
> works. As i understood all the node then will chose ramdoms tokens. How can
> i assure the correctness of the ring?
>
>
>
> So as you said. Under the condition that there.is no data in the cluster.
&
I had to do something similar recently. Take a look at
org.apache.cassandra.cql3.QueryProcessor.parseStatement(). I've got some
sample code here [1] as well as a blog post [2] that explains how to access
the private variables, since there's no access provided. It wasn't really
designed to be use
There isn't a ton from that talk I'd consider "wrong" at this point, but
some of it is a little stale. I always start off looking at system
metrics. For a very thorough discussion on the matter check out Brendan
Gregg's USE [1] method. I did a blog post on my own about the talk [2]
that has scre
The docs are in tree, meaning they are versioned, and should be written for
the version they correspond to. Trunk docs should reflect the current state
of trunk, and shouldn’t have caveats for other versions.
On Mon, Mar 12, 2018 at 8:15 AM Kenneth Brotman
wrote:
> If we use DataStax’s example, w
Right now they can’t.
On Mon, Mar 12, 2018 at 9:03 AM Kenneth Brotman
wrote:
> I see how that makes sense Jon but how does a user then select the
> documentation for the version they are running on the Apache Cassandra web
> site?
>
>
>
> Kenneth Brotman
>
>
>
>
Yes, I agree, we should host versioned docs. I don't think anyone is
against it, it's a matter of someone having the time to do it.
On Tue, Mar 13, 2018 at 6:14 PM kurt greaves wrote:
> I’ve never heard of anyone shipping docs for multiple versions, I don’t
>> know why we’d do that. You can ge
Can you provide the code that you use to create the table? This feels like
code error rather than a database bug.
On Wed, Aug 13, 2014 at 1:26 PM, Kevin Burton wrote:
> 2.0.5… I'm upgrading to 2.0.9 now just to rule this out….
>
> I can give you the full CQL for the table, but I can't seem to
It sounds like your clocks are out of sync. Run ntpdate to fix your
clock & then make sure you're running ntpd on every machine.
On Mon, Aug 25, 2014 at 1:25 PM, Sávio S. Teles de Oliveira
wrote:
> We're using cassandra 2.0.9 with datastax java cassandra driver 2.0.0 in a
> cluster of eight node
This is actually a more correct response than mine, I made a few
assumptions that may or may not be true.
On Mon, Aug 25, 2014 at 1:31 PM, Robert Coli wrote:
> On Mon, Aug 25, 2014 at 1:25 PM, Sávio S. Teles de Oliveira
> wrote:
>>
>> We're using cassandra 2.0.9 with datastax java cassandra driv
I believe shuffle has been removed recently. I do not recommend using
it for any reason.
If you really want to go vnodes, your only sane option is to add a new
DC that uses vnodes and switch to it.
The downside in the 2.0.x branch to using vnodes is that repairs take
N times as long, where N is
abled DC, as it does not work
at all.
On Mon, Sep 8, 2014 at 2:01 PM, Tim Heckman wrote:
> On Mon, Sep 8, 2014 at 1:45 PM, Jonathan Haddad wrote:
>> I believe shuffle has been removed recently. I do not recommend using
>> it for any reason.
>
> We're still using the 1
Multi-dc is available in every version of Cassandra.
On Wed, Sep 10, 2014 at 9:21 AM, Oleg Ruchovets wrote:
> Thank you very much for the links.
> Just to be sure: is this capability available for COMMUNITY ADDITION?
>
> Thanks
> Oleg.
>
> On Wed, Sep 10, 2014 at 11:49 PM, Alain RODRIGUEZ
> wr
Make sure your clocks are synced. If they aren't, the writetime that
determines the most recent value will be incorrect.
On Wed, Sep 17, 2014 at 11:58 AM, Robert Coli wrote:
> On Wed, Sep 17, 2014 at 11:55 AM, Sávio S. Teles de Oliveira
> wrote:
>>
>> I'm using the Cassandra 2.0.9 with JAVA dat
Keep in mind secondary indexes in cassandra are not there to improve
performance, or even really be used in a serious user facing manner.
Build and maintain your own view of the data, it'll be much faster.
On Thu, Sep 18, 2014 at 6:33 PM, Jay Patel wrote:
> Hi there,
>
> We are seeing extreme
Depending on how you query (one or quorum) you might be able to do 1 rack at a
time (or az or whatever you've got) assuming your snitch is set up right
> On Sep 19, 2014, at 11:30 AM, Kevin Burton wrote:
>
> This is great feedback…
>
> I think it could actually be even easier than this…
>
>
You'll need to provide a bit of information. To start, a query trace
would be helpful.
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/tracing_r.html
(self promo) You may want to read over my blog post regarding
diagnosing problems in production. I've covered diagnosing slo
Are you using Cassandra 2.0 & vnodes? If so, repair takes forever.
This problem is addressed in 2.1.
On Fri, Sep 26, 2014 at 9:52 AM, Gene Robichaux
wrote:
> I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and 4 in
> another.
>
>
>
> Running a repair on a large column family
mailto:brice.duth...@gmail.com]
> Sent: Friday, September 26, 2014 12:47 PM
> To: user@cassandra.apache.org
> Subject: Re: Repair taking long time
>
>
>
> Unfortunately DSE 4.5.0 is still on 2.0.x
>
>
> -- Brice
>
>
>
> On Fri, Sep 26, 2014 at 7:40 PM, Jonathan Haddad wrot
ion..no support (yet!) :(
>
> Gene Robichaux
> Manager, Database Operations
> Match.com
> 8300 Douglas Avenue I Suite 800 I Dallas, TX 75225
> Phone: 214-576-3273
>
> -Original Message-
> From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behal
First, did you run a query trace?
I recommend Al Tobey's pcstat util to determine if your files are in
the buffer cache: https://github.com/tobert/pcstat
On Wed, Oct 22, 2014 at 4:34 AM, Thomas Whiteway
wrote:
> Hi,
>
>
>
> I’m working on an application using a Cassandra (2.1.0) cluster where
No. Consider a scenario where you supply a timestamp a week in the future,
flush it to sstable, and then do a write, with the current timestamp. The
record in disk will have a timestamp greater than the one in the memtable.
On Wed, Oct 22, 2014 at 9:18 AM, Donald Smith <
donald.sm...@audiencesci
If the issue is related to I/O, you're going to want to determine if
you're saturated. Take a look at `iostat -dmx 1`, you'll see avgqu-sz
(queue size) and svctm (service time). The higher those numbers
are, the more overwhelmed your disk is.
On Sun, Oct 26, 2014 at 12:01 PM, DuyHai Doan wro
For cqlengine we do quite a bit of write then read to ensure data was
written correctly, across 1.2, 2.0, and 2.1. For what it's worth,
I've never seen this issue come up. On a single node, Cassandra only
acks the write after it's been written into the memtable. So, you'd
expect to see the most
Personally I've found that using query timing + log aggregation on the
client side is more effective than trying to mess with tracing probability
in order to find a single query which has recently become a problem. I
recommend wrapping your session with something that can automatically log
the sta
In production?
On Mon Nov 10 2014 at 6:06:41 AM Spencer Brown wrote:
> I'm using /McFrazier/PhpBinaryCql/
>
>
> On Mon, Nov 10, 2014 at 1:48 AM, Akshay Ballarpure <
> akshay.ballarp...@tcs.com> wrote:
>
>> Hello,
>> I am working on PHP cassandra integration, please let me know which
>> library i
With Cassandra you're going to want to model tables to meet the
requirements of your queries instead of like a relational database where
you build tables in 3NF then optimize after.
For your optimized select query, your table (with caveat, see below) could
start out as:
create table words (
yea
Performance will be the same. There's no performance benefit to using
multiple keyspaces.
On Thu Nov 13 2014 at 8:42:40 AM Li, George
wrote:
> Hi,
> we use Cassandra to store some association type of data. For example,
> store user to course (course registrations) association and user to school
sets of data with (potentially) two separate read
> patterns, don't put them in the same table.
>
> On Thu, Nov 13, 2014 at 11:08 AM, Jonathan Haddad
> wrote:
>
>> Performance will be the same. There's no performance benefit to using
>> multiple keyspaces.
If he deletes all the data with RF=1, won't he have data loss?
On Mon Nov 17 2014 at 5:14:23 PM Michael Shuler
wrote:
> On 11/17/2014 02:04 PM, Alain Vandendorpe wrote:
> > Hey all,
> >
> > For legacy reasons we're living with Cassandra 2.0.10 in an RF=1 setup.
> > This is being moved away from
I don't think DateTiered will help here, since there's no clustering key
defined. This is a pretty straightforward workload, I've done something
similar.
Are you overwriting the session on every request? Or just writing it once?
On Mon Dec 01 2014 at 6:45:14 AM Matt Brown wrote:
> This sounds l
>> Hash: SHA1
>>
>> The session will be written once at create time, and never modified
>> after that. Will that affect things?
>>
>> Thank you
>>
>> - -Phil
>>
>> On 01.12.2014 15:58, Jonathan Haddad wrote:
>> > I don
I recommend reading through
https://issues.apache.org/jira/browse/CASSANDRA-8150 to get an idea of how
the JVM GC works and what you can do to tune it. Also good is Blake
Eggleston's writeup which can be found here:
http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html
What's a ring cache?
FYI if you're using the DataStax CQL drivers they will automatically route
requests to the correct node.
On Sun Dec 07 2014 at 12:59:36 AM kong wrote:
> Hi,
>
> I'm doing stress test on Cassandra. And I learn that using ring cache can
> improve the performance because the c
rmance wise
If you've got a specific question I think someone can find a way to help,
but asking "what can 8gb of heap give me" is pretty abstract and
unanswerable.
Jon
On Sun Dec 07 2014 at 8:03:53 AM Philo Yang wrote:
> 2014-12-05 15:40 GMT+08:00 Jonathan Haddad :
>
I think he mentioned 100MB as the max size - planning for 1mb might make
your data model difficult to work with.
On Sun Dec 07 2014 at 12:07:47 PM Kai Wang wrote:
> Thanks for the help. I wasn't clear how clustering column works. Coming
> from Thrift experience, it took me a while to understand how c
rove
> performance.
> Thank you very much.
>
> 2014-12-08 1:28 GMT+08:00 Jonathan Haddad :
>
>> What's a ring cache?
>>
>> FYI if you're using the DataStax CQL drivers they will automatically
>> route requests to the correct node.
>>
>> On
Listen address needs the actual address, not the interface. This is best
accomplished by setting up proper hostnames for each machine (through DNS
or hosts file) and leaving listen_address blank, as it will pick the
external ip. Otherwise, you'll need to set the listen address to the IP of
the ma
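In cassandra.yaml terms, with a placeholder address:

```yaml
# Either leave listen_address blank and rely on the machine's hostname
# resolving to the right IP, or set the routable address explicitly:
listen_address: 10.0.0.12
```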
od
> performance as excepted using my client code. I know others have
> benchmarked Cassandra and got good results. But if I cannot reproduce the
> satisfactory results, I cannot use it in my case.
>
> I will create a repo and send a link later, hope to get your kind help.
>
>
You don't need a prime number of nodes in your ring, but it's not a bad
idea for it to be a multiple of your RF when your cluster is small.
On Tue Dec 09 2014 at 8:29:35 AM Nate Yoder wrote:
> Hi Ian,
>
> Thanks for the suggestion but I had actually already done that prior to
> the scenario I descr
pal Engineer & Data Scientist, Whistle
> 415-944-7344 // n...@whistle.com
>
> On Tue, Dec 9, 2014 at 8:31 AM, Jonathan Haddad wrote:
>
>> You don't need a prime number of nodes in your ring, but it's not a bad
>> idea to it be a multiple of your RF when y
Yes. It is, in general, a best practice to upgrade to the latest bug fix
release before doing an upgrade to the next point release.
On Tue Dec 09 2014 at 6:58:24 PM wyang wrote:
> I looked some upgrade documentations and am a little puzzled.
>
>
> According to
> https://github.com/apache/cassan
I did a presentation on diagnosing performance problems in production at
the US & Euro summits, in which I covered quite a few tools & preventative
measures you should know when running a production cluster. You may find
it useful:
http://rustyrazorblade.com/2014/09/cassandra-summit-recap-diagnosi
The really important thing to really take away from Ryan's original post is
that batches are not there for performance. The only case I consider
batches to be useful for is when you absolutely need to know that several
tables all get a mutation (via logged batches). The use case for this is
when
Hey Jens,
Unfortunately the output of the nodetool histograms changes between
versions. While I think your script is useful, it's likely to break
between versions. You might be interested to weigh in on the JIRA ticket
to make the nodetool output machine friendly:
https://issues.apache.org/jira/
r load.
>
> I would also note that the example in the spec has multiple inserts with
> different partition key values, which flies in the face of the admonition
> to to refrain from using server-side distribution of requests.
>
> At a minimum the CQL spec should make a m
benefit of batches but
without the coordinator overhead.
Can you post your benchmark code?
On Sat Dec 13 2014 at 6:10:36 AM Jonathan Haddad wrote:
> There are cases where it can. For instance, if you batch multiple
> mutations to the same partition (and talk to a replica for that parti
27;s runnable for you
> in a Scala REPL console without having to resolve our internal
> dependencies. This may not be today though.
>
> Also, @Ryan, I don't think that shuffling would make a difference for my
> above tests since as Jon observed, all my nodes were already repl
point, but in a healthy
> cluster, it's the same write volume, just a longer tenancy in eden. If
> reasonable sized batches are causing survivors, you're not far off from
> falling over anyway.
>
> On Sat, Dec 13, 2014 at 10:04 AM, Jonathan Haddad
> wrote:
>
>> One
Not a problem - it's good to hash this stuff out and understand the
technical reasons why something works or doesn't work.
On Sat Dec 13 2014 at 10:07:10 AM Jonathan Haddad wrote:
> On Sat Dec 13 2014 at 10:00:16 AM Eric Stevens wrote:
>
>> Isn't the net effe
uns of 113,825 records (3 protos, 5 agents, ~15 per bucket) in batches
> of 10
> Total Run Time
> traverse test2 ((aid, bckt), end) =
> 11,429,008,000
> traverse test1 ((aid, bckt), proto, end) reverse order=
> 12,593,034,000
> trave
I'd consider solving your root problem of "people are starting and stopping
servers in prod accidentally" instead of making Cassandra more difficult to
manage operationally.
On Thu Dec 18 2014 at 4:04:34 AM Ryan Svihla wrote:
> why auto_bootstrap=false? The documentation even suggests the opposi
This topic comes up quite a bit. Enough, in fact, that I've done a 1 hour
webinar on the topic. I cover how the JVM GC works and things you need to
consider when tuning it for Cassandra.
https://www.youtube.com/watch?v=7B_w6YDYSwA
With your specific problem - full GC not reducing the old gen -
It may be more valuable to set up your test cluster as the same version,
and make sure your tokens are the same. then copy over your sstables.
you'll have an exact replica of prod & you can test your upgrade process.
On Fri Dec 19 2014 at 11:04:58 AM Ryan Svihla wrote:
> In theory, you could a
Your gc grace should be longer than your repair schedule. You're likely
going to have deleted data resurface.
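A back-of-the-envelope check; the slack margin here is my assumption, not an official rule:

```python
DEFAULT_GC_GRACE = 864000  # 10 days, the Cassandra default

def min_gc_grace_seconds(repair_interval_days, slack_days=2):
    # gc_grace_seconds must exceed the time between successful repairs
    # of every replica; otherwise tombstones can be purged before they
    # propagate, and deleted data resurfaces.
    return (repair_interval_days + slack_days) * 86400

# Weekly repairs fit comfortably under the 10-day default:
print(min_gc_grace_seconds(7) <= DEFAULT_GC_GRACE)  # True
```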
On Fri Dec 19 2014 at 8:31:13 AM Alain RODRIGUEZ wrote:
> All that you said match the idea I had of how it works except this part:
>
> "The request blocks however until all CL is satis
Secondary indexes are there for convenience, not performance. If you're
looking for something performant, you'll need to maintain your own indexes.
On Mon Dec 29 2014 at 3:22:58 PM Sam Klock wrote:
> Hi folks,
>
> Perhaps this is a question better addressed to the Cassandra developers
> direct
This is most likely because your listen address is set to localhost. Try
changing it to listen on the external interface.
On Sat Jan 03 2015 at 10:03:57 AM Chamila Wijayarathna <
cdwijayarat...@gmail.com> wrote:
> Hello all,
>
> I have a cassandra node at a machine. When I access cqlsh from the