Re: New Servers - Cassandra 4

2021-08-02 Thread Max C.
Have you considered a blade chassis? Then you can get most of the redundancy of having lots of small nodes in few(er) rack units. SuperMicro has a chassis that can accommodate 14 servers in 4U: https://www.supermicro.com/en/products/superblade/enclosure#4U - Max > On Aug 2, 2021, at 12:05

On-prem backup options ... Medusa?

2021-06-11 Thread Max C.
Hi Everyone, What are you doing for backups for on-premises deployments (C* 3.11)? We don’t have the option of EBS snapshots, tablesnap to S3, etc. but we have lots of NFS-mounted file server space. We’re currently doing a snapshot and then copying the full snapshot to a file server

Re: multiple clients making schema changes at once

2021-06-03 Thread Max C.
ultiple > threads. The alter statement is run in a synchronized block (Java). Should > I put an artificial delay after the alter statement? > > -Joe > > On 6/1/2021 2:59 PM, Max C. wrote: >> We use ZooKeeper + kazoo’s lock implementation. Kazoo is a Python clie
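The locking approach described in the quoted reply can be sketched roughly as follows. This is an illustration only: `apply_schema_changes`, the `execute` callback and the sample statement are hypothetical names, and a `threading.Lock` stands in for the distributed lock so the sketch runs without ZooKeeper. With kazoo, the lock would instead come from `KazooClient.Lock(path, identifier)`, which can be used as a context manager in the same way.

```python
import threading


def apply_schema_changes(lock, statements, execute):
    """Run schema statements one at a time while holding `lock`.

    `lock` is any context manager that provides mutual exclusion.
    `execute` is a hypothetical callback that sends one statement to the cluster.
    """
    applied = []
    with lock:
        for stmt in statements:
            execute(stmt)
            applied.append(stmt)
    return applied


# Stand-in for a distributed lock (assumption: a real deployment would use
# something like kazoo's zk.Lock("/some/lock/path", "client-id") here).
schema_lock = threading.Lock()
executed = []
result = apply_schema_changes(
    schema_lock,
    ["ALTER TABLE ks.t ADD c int"],  # hypothetical statement
    executed.append,                 # records statements instead of running them
)
```

Holding the lock for the whole batch means only one client changes the schema at a time; it does not by itself wait for schema agreement across nodes.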

Re: multiple clients making schema changes at once

2021-06-01 Thread Max C.
ent apps would send create instruction to that service, that > would receive them and do the creates 1 by 1, and the client app would wait > the response from it before starting to insert. > > Best, > > Sébastien. > > Le mar. 1 juin 2021 à 05:21, Max C. <mailto:mc_cass

Re: multiple clients making schema changes at once

2021-05-31 Thread Max C.
In our case we have a shared dev cluster with (for example) a key space for each developer, a key space for each CI runner, etc. As part of initializing our test suite we set up the schema to match the code that is about to be tested. This can mean multiple CI runners each adding/dropping

Re: Should we use Materialised Views or ditch them ?

2020-02-28 Thread Max C.
The general view of the community is that you should *NOT* use them in production, due to multiple serious outstanding issues (see Jira). We used them quite a bit when they first came out and have since rolled back all uses except for the absolute most basic cases (ex: a table with 30K rows

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread Max C.
Let’s say you have a 6 node cluster, with RF=3, and no vnodes. In that case each piece of data is stored as follows:
N1: N2 N3
N2: N3 N4
N3: N4 N5
N4: N5 N6
N5: N6 N1
N6: N1 N2
With this setup, there are some circumstances where you could lose 2 nodes (ex: N1 & N4) and still be able to
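The placement above can be checked with a small sketch. The message is cut off, so the meaning of "survive" here is an assumption: a pair of node failures counts as survivable if every range still has a quorum of live replicas. The function names are hypothetical, and the ring model (each range stored on its primary node plus the next RF-1 nodes) simply mirrors the table in the message.

```python
from itertools import combinations


def replica_sets(num_nodes, rf):
    """Replica set for each range on a ring without vnodes.

    Range i lives on node i and the next rf-1 nodes (1-based numbering).
    """
    return [
        {(start + i) % num_nodes + 1 for i in range(rf)}
        for start in range(num_nodes)
    ]


def survivable_pairs(num_nodes, rf):
    """Pairs of nodes that can be down while every range keeps a quorum."""
    quorum = rf // 2 + 1
    sets = replica_sets(num_nodes, rf)
    ok = []
    for down in combinations(range(1, num_nodes + 1), 2):
        if all(len(s - set(down)) >= quorum for s in sets):
            ok.append(down)
    return ok


pairs = survivable_pairs(6, 3)
print(pairs)  # [(1, 4), (2, 5), (3, 6)]
```

Under this model only 3 of the 15 possible two-node failures leave every range with a quorum, namely the pairs that sit opposite each other on the ring.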

Re: Cassandra Repair question

2019-10-19 Thread Max C.
Yes - agree with Sergio. For the majority of use cases, the best practice for repair is to use Cassandra Reaper. > On Oct 19, 2019, at 12:06 am, Sergio wrote: > > Use Cassandra reaper > > On Fri, Oct 18, 2019, 10:12 PM Krish Donald > wrote: > Thanks Manish, > >

Re: Configurations for better performance

2019-10-17 Thread Max C.
I haven’t watched it yet, but Jon Haddad did a talk on performance optimization at the DataStax Accelerate conference (and another talk a year/two before): 10 Easy Ways to Tune Your Cassandra Cluster with Jon Haddad | DataStax Accelerate 2019 https://www.youtube.com/watch?v=swL7bCnolkU -

Re: nodetool status and node maintenance

2018-10-29 Thread Max C.
Agree - avoid parsing nodetool, if you can. I’d add that if anyone out there is interested in JMX but doesn’t want to deal with Java, you should install Jolokia so you can interact with Cassandra’s JMX data via a language independent REST-like interface. https://jolokia.org/

Re: Insert from Select - CQL

2018-10-27 Thread Max C.
I’ve never been a big fan of the “COPY” statement. My preference for stuff like this (though I am definitely in the minority I think!) — particularly for the amount of data you’re talking about — is to use the open source tool “cassandradump” — which is similar to mysqldump but for cassandra.

Upcoming Cassandra-related Conferences

2018-10-05 Thread Max C.
Some upcoming Cassandra-related conferences, if anyone is interested: Scylla Summit November 5-7, 2018 Pullman San Francisco Bay Hotel, Redwood City CA https://www.scylladb.com/scylla-summit-2018/ (This one seems to be almost entirely Scylla

Re: Connections info

2018-10-04 Thread Max C.
Looks like the number of connections is available in JMX as: org.apache.cassandra.metrics:type=Client,name=connectedNativeClients http://cassandra.apache.org/doc/4.0/operating/metrics.html "Number of clients connected to this nodes
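One possible way to read this metric without writing Java is through a Jolokia agent (Jolokia is mentioned elsewhere in this listing). The sketch below is hedged: it assumes a Jolokia agent is attached to the Cassandra JVM, that the gauge is exposed through the `Value` attribute, and that the agent URL passed in is correct for your setup. The function names are hypothetical.

```python
import json
import urllib.request

MBEAN = "org.apache.cassandra.metrics:type=Client,name=connectedNativeClients"


def build_read_request(mbean, attribute="Value"):
    """Build a Jolokia 'read' request body for one MBean attribute."""
    return {"type": "read", "mbean": mbean, "attribute": attribute}


def fetch_connected_clients(base_url):
    """POST the read request to a Jolokia agent and return the 'value' field.

    `base_url` is an assumption, e.g. the agent's /jolokia/ endpoint.
    """
    body = json.dumps(build_read_request(MBEAN)).encode("utf-8")
    req = urllib.request.Request(
        base_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["value"]


payload = build_read_request(MBEAN)
```

Only `build_read_request` runs here; `fetch_connected_clients` needs a reachable agent and is left uncalled.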

Re: Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
Yep, that’s the problem! Thanks Jeff (and Alex Petrov for fixing it). - Max > On Sep 13, 2018, at 1:24 pm, Jeff Jirsa wrote: > > CASSANDRA-13004 (fixed in recent 3.0 and 3.11 builds)

Re: Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
Correction — we’re running C* 3.0.8. DataStax Python driver 3.4.1. > On Sep 13, 2018, at 1:11 pm, Max C. wrote: > > I ran “alter table” today to add the “task_output_capture_state” column (see > below), and we found a few rows inserted around the time of the ALTER TABLE >

Corrupt insert during ALTER TABLE add

2018-09-13 Thread Max C.
I ran “alter table” today to add the “task_output_capture_state” column (see below), and we found a few rows inserted around the time of the ALTER TABLE did not contain the same values when selected as when they were inserted. When the row was selected, what we saw was: - test_id —> OK (same as

Re: Recommended num_tokens setting for small cluster

2018-08-31 Thread Max C.
Jeff/Kurt/Alex — thanks so much for your feedback on this issue, and thanks for all of the help you guys have lent to people on this list over the years. :-) - Max > On Aug 29, 2018, at 11:38 pm, Oleksandr Shulgin > wrote: > > On Thu, Aug 30, 2018 at

Recommended num_tokens setting for small cluster

2018-08-29 Thread Max C.
Hello Everyone, Datastax recommends num_tokens = 8 as a sensible default, rather than num_tokens = 256: https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html … but

Re: Snapshot SSTable modified??

2018-05-30 Thread Max C.
efore the > link count on the snapshot file is decremented. The file's contents haven't > changed so mtime is identical, but ctime does get updated. BSDtar doesn't > seem to interpret link count changes as a file change, so it's pretty > effective as a workaround. > >
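The ctime/mtime behaviour described in the quoted reply can be reproduced with a short sketch. It assumes a POSIX filesystem that supports hard links; the file names and the `link_then_unlink` helper are hypothetical and only imitate a snapshot (a hard link) whose original file is later removed.

```python
import os
import tempfile


def link_then_unlink(directory):
    """Create a file, hard-link it, remove the original, and stat the link."""
    data = os.path.join(directory, "live-Data.db")      # hypothetical live SSTable
    snap = os.path.join(directory, "snapshot-Data.db")  # hypothetical snapshot link
    with open(data, "wb") as f:
        f.write(b"sstable bytes")
    os.link(data, snap)       # snapshot: second name for the same inode
    before = os.stat(snap)
    os.unlink(data)           # removing the live file decrements the link count
    after = os.stat(snap)
    return before, after


with tempfile.TemporaryDirectory() as tmp:
    before, after = link_then_unlink(tmp)

print(before.st_nlink, after.st_nlink)           # link count drops from 2 to 1
print(before.st_mtime_ns == after.st_mtime_ns)   # contents, and so mtime, unchanged
```

The link-count change is an inode metadata change, so ctime may move forward even though the data and mtime stay the same, which is consistent with a tool reporting "file changed" on an unmodified snapshot file.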

Re: Snapshot SSTable modified??

2018-05-25 Thread Max C
ile to be created and > uploaded - you have to watch the log to make sure it's not being written. > > > On Wed, May 23, 2018 at 2:18 PM, Max C. <mc_cassan...@core43.com > <mailto:mc_cassan...@core43.com>> wrote: > Hi Everyone, > > We’ve noti

Snapshot SSTable modified??

2018-05-23 Thread Max C.
Hi Everyone, We’ve noticed a few times in the last few weeks that when we’re doing backups, tar has complained with messages like this: tar: /var/lib/cassandra/data/mars/test_instances_by_test_id-6a9440a04cc111e8878675f1041d7e1c/snapshots/backup_20180523_024502/mb-63-big-Data.db: file changed

Re: Effect of frequent mutations / memtable

2017-05-26 Thread Max C
In my case, we're using Cassandra to store QA test data — so the pattern is that we may do a bunch of updates within a few minutes / hours, and then the data will essentially be read-only for the rest of its lifetime (years). My question is the same — do we need to worry about the performance

Re: Is there a C* summit this year?

2017-05-20 Thread Max C
Hi Kant, I reached out to Datastax with this very question a few months ago, here's the response I received: "Great to hear that you enjoyed yourself last year, we surely enjoyed having you and the other community folks! In regards to the actual summit, we will not be holding the Cassandra

Re: quick questions

2016-12-17 Thread Max C
As Matija mentioned, quorum is RF / 2 + 1:
RF=1, Quorum = 1
RF=2, Quorum = 2
RF=3, Quorum = 2
RF=4, Quorum = 3
RF=5, Quorum = 3
RF=6, Quorum = 4
RF=7, Quorum = 4
So no, you don’t have to have an odd RF to achieve a quorum, as you see above. Most people use RF=3 with a minimum of 3 nodes,
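The table above follows from the formula if the division is read as integer division (rounding down). A minimal sketch, with `quorum` as a hypothetical helper name:

```python
def quorum(rf):
    """Quorum size for a replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1


for rf in range(1, 8):
    print(f"RF={rf}, Quorum = {quorum(rf)}")
```

Note that RF=3 and RF=4 both tolerate only one replica being down at QUORUM, since 3 - 2 = 1 and 4 - 3 = 1.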

Commercial Support Providers?

2016-11-03 Thread Max C
Hello - We’re rolling out a small cluster at my work (2 DCs of 3 nodes each — hosted on-premises), and my boss has asked us to look into commercial support offerings. The main thing we’re looking for is a company that we can call day or night if/when things go “kaboom” and I can’t figure out

NPE during schema upgrade from 2.2.6 -> 3.0.6

2016-05-27 Thread Max C
Hi Everyone, I’m getting a NullPointerException when I start up 3.0.6 for the first time with data from 2.2.6. Any ideas for how to fix this, or other troubleshooting strategies? This is just a single-node development box. Originally I tried upgrading from 2.1.13 to 3.0.6, but I ran into

Re: 1, 2, 3...

2016-04-09 Thread Max C
Looks like this guy (Brian Hess) wrote a script to split the token range and run count(*) on each subrange: https://github.com/brianmhess/cassandra-count - Max > On Apr 8, 2016, at 10:56 pm, Jeff Jirsa wrote: > >