Re: Use NetworkTopologyStrategy for single data center and add data centers later

2021-01-27 Thread Carl Mueller
Yes, perform that as soon as possible. When you add a new datacenter, keyspaces that are SimpleStrategy (don't forget about system_traces and system_distributed) won't work. On Sat, Dec 19, 2020 at 12:38 PM Aaron Ploetz wrote: > Yes, you absolutely can (and should) use NetworkTopologyStrategy
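
The switch described above comes down to one ALTER KEYSPACE statement per keyspace. A minimal sketch that only builds the CQL text, assuming a single datacenter named `dc1` with replication factor 3 and a placeholder application keyspace `my_app` (all three are illustrative, not from the thread):

```python
# Hypothetical helper: builds the CQL that moves a keyspace to NetworkTopologyStrategy.
def alter_to_nts(keyspace, dc_replication):
    """Return an ALTER KEYSPACE statement using NetworkTopologyStrategy."""
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(dc_replication.items())
    )
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# The two system keyspaces named in the message, plus a placeholder app keyspace.
KEYSPACES = ["system_traces", "system_distributed", "my_app"]

statements = [alter_to_nts(ks, {"dc1": 3}) for ks in KEYSPACES]
for stmt in statements:
    print(stmt)
```

The datacenter name has to match what the snitch reports (as shown by `nodetool status`), so `dc1` would need to be replaced with the real name before running the statements in cqlsh.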

Table metrics grid isn't showing in the apache cassandra documentation

2021-01-27 Thread Carl Mueller
https://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics I checked with Chrome and Firefox

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
actually inspect the system, run a program that holds > half the memory, to put yourself down at the same memory level as m5.2xl, > see if the lack of page cache causes disk IO to return. > > > > On Wed, Dec 2, 2020 at 2:56 PM Carl Mueller > wrote: > >> I agree in t

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
at 5:12 PM Erick Ramirez wrote: > From C* 2.2 onwards, SSTables get mapped to memory by mmap() so the hot > data will be accessed much faster on systems with more RAM. > > On Thu, 3 Dec 2020 at 09:57, Carl Mueller > wrote: > >> I agree in theory, I just want some way to co

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
2020 at 8:41 AM Carl Mueller > wrote: > >> Oh, this is cassandra 2.2.13 (multi tenant delays) and ubuntu 18.04. >> >> On Wed, Dec 2, 2020 at 10:35 AM Carl Mueller < >> carl.muel...@smartthings.com> wrote: >> >>> We have a cluster that is experienc

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
> worth doing that much digging. > > > On Wed, Dec 2, 2020 at 8:41 AM Carl Mueller > wrote: > >> Oh, this is cassandra 2.2.13 (multi tenant delays) and ubuntu 18.04. >> >> On Wed, Dec 2, 2020 at 10:35 AM Carl Mueller < >> carl.muel...@smartthings.com> wrote: >

Re: Digest mismatch

2020-12-02 Thread Carl Mueller
es per slice (last five minutes): NaN > Maximum tombstones per slice (last five minutes): 0 > Dropped Mutations: 8174438 > > Thoughts/ideas? Thank you! > > -Joe > On 12/2/2020 11:49 AM, Carl Mueller wrote: > > Why is one of your nodes only at 14.6% ownership? That's weird

Re: Digest mismatch

2020-12-02 Thread Carl Mueller
Why is one of your nodes only at 14.6% ownership? That's weird, unless you have a small rowcount. Are you frequently deleting rows? Are you frequently writing rows at ONE? What version of cassandra? On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger wrote: > Hi All - this is my first post here.

Re: Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
Oh, this is cassandra 2.2.13 (multi tenant delays) and ubuntu 18.04. On Wed, Dec 2, 2020 at 10:35 AM Carl Mueller wrote: > We have a cluster that is experiencing very high disk read I/O in the > 20-40 MB/sec range on m5.2x (gp2 drives). This is verified via VM metrics > as well

Vastly different disk I/O on different sized aws instances

2020-12-02 Thread Carl Mueller
We have a cluster that is experiencing very high disk read I/O in the 20-40 MB/sec range on m5.2x (gp2 drives). This is verified via VM metrics as well as iotop. When we switch to m5.4x it drops to 60 KB/sec. There is no difference in network send/recv, read/write request counts. The graph for

slurm for cluster job scheduling and coordination

2020-03-09 Thread Carl Mueller
Between repairs, rolling restarts, scheduled maintenance bounces, backups, upgrades, etc there are lots of cluster-wide tasks that would be nice to be scheduled and viewed. Slurm appears to have some features that support this but might be heavyweight considering its primary application is

Re: Cassandra OS Patching.

2020-02-04 Thread Carl Mueller
In AWS we are doing os upgrades by standing up an equivalent set of nodes, and then in a rolling fashion we move the EBS mounts, selectively sync necessary cassandra settings, and then start up the new node. The downtime is fairly minimal since the ebs detach/attach is pretty quick, you don't

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-04 Thread Carl Mueller
Your case seems to argue for completely eliminating vnodes. Which the Priam people have been preaching for a long time. There is not, certainly to a cassandra user-level person, good documentation on the pros and cons of vnodes vs single tokens, and as we see here the impacts of various vnode

Re: Dynamo autoscaling: does it beat cassandra?

2019-12-10 Thread Carl Mueller
gularly scaling, you're probably > going to pay more for it. > > > It'd be cool if someone focused on this - I think the faster streaming > goes a long way. The way vnodes work today make it difficult to add more > than one at a time without violating consistency, and thats unlikel

Re: TTL on UDT

2019-12-09 Thread Carl Mueller
ion, thus > applying ttl() on them makes sense. I'm not sure however if the CQL parser > allows this syntax > > On Mon, Dec 9, 2019 at 9:13 PM Carl Mueller > wrote: > >> I could be wrong, but UDTs I think are written (and overwritten) as one >> unit, so the notion of a TT

Dynamo autoscaling: does it beat cassandra?

2019-12-09 Thread Carl Mueller
Dynamo salespeople have been pushing autoscaling abilities that have been one of the key temptations to our management to switch off of cassandra. Has anyone done any numbers on how well dynamo will autoscale demand spikes, and how we could architect cassandra to compete with such abilities? We

Re: Seeing tons of DigestMismatchException exceptions after upgrading from 2.2.13 to 3.11.4

2019-12-09 Thread Carl Mueller
My speculation on rapidly churning/fast reads of recently written data: - data written at quorum (for RF3): write confirm is after two nodes reply - data read very soon after (possibly code antipattern), and let's assume the third node update hasn't completed yet (e.g. AWS network "variance").
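
The first two points of that speculation can be illustrated with a toy model. This is a sketch under stated assumptions, not Cassandra's read path: three hypothetical replicas `a`, `b`, `c` with RF 3, where `c` has not yet received the latest write, and a QUORUM read that compares the values held by whichever two replicas it happens to contact.

```python
from itertools import combinations


def quorum(rf):
    """Number of replicas that must respond at QUORUM for a given RF."""
    return rf // 2 + 1


# Toy state: the write was acknowledged by a and b; c still holds the old value.
replicas = {"a": "v2", "b": "v2", "c": "v1"}


def digest_mismatch(read_set):
    """True when the contacted replicas do not all hold the same value."""
    return len({replicas[r] for r in read_set}) > 1


read_sets = list(combinations(sorted(replicas), quorum(3)))
mismatching = [rs for rs in read_sets if digest_mismatch(rs)]

print(f"quorum for RF3: {quorum(3)}")
print(f"{len(mismatching)} of {len(read_sets)} possible read sets see a mismatch")
```

In this toy model, any read set that includes the lagging replica sees a mismatch, which is consistent with the idea that reading immediately after writing could raise the mismatch count.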

Re: TTL on UDT

2019-12-09 Thread Carl Mueller
I could be wrong, but UDTs I think are written (and overwritten) as one unit, so the notion of a TTL on a UDT field doesn't exist, the TTL is applied to the overall structure. Think of it like a serialized json object with multiple fields. To update a field they deserialize the json, then

Re: AWS ephemeral instances + backup

2019-12-09 Thread Carl Mueller
k at iowait during a snapshot and see if the results are > acceptable for a running node. Even if it is marginal, if you’re only > snapshotting one node at a time, then speculative retry would just skip > over the temporary slowpoke. > > > > *From: *Carl Mueller

AWS ephemeral instances + backup

2019-12-05 Thread Carl Mueller
Does anyone have experience with tooling written to support this strategy? Use case: run cassandra on i3 instances on ephemerals but synchronize the sstables and commitlog files to the cheapest EBS volume type (those have bad IOPS but decent enough throughput). On node replace, the startup script for

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-30 Thread Carl Mueller
i, Sep 27, 2019 at 7:39 PM Carl Mueller > wrote: > >> So IF that delegate class would work: >> >> 1) create jar with the delegate class >> 2) deploy jar along with upgrade on node >> 3) once all nodes are upgraded, issue ALTER to change to the >> org

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-27 Thread Carl Mueller
{ public TimeWindowCompactionStrategy(ColumnFamilyStore cfs, Map options) { super(cfs,options); } public String getName() { return getClass().getSimpleName(); } } On Fri, Sep 27, 2019 at 1:05 PM Carl Mueller wrote: > Or can we just do this safely in a side

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-27 Thread Carl Mueller
TimeWindowCompactionStrategy { public TimeWindowCompactionStrategy(ColumnFamilyStore cfs, Map options) { super(cfs,options); } } On Fri, Sep 27, 2019 at 12:29 PM Carl Mueller wrote: > So IF that delegate class would work: > > 1) create jar with the delegate class > 2) depl

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-27 Thread Carl Mueller
So IF that delegate class would work: 1) create jar with the delegate class 2) deploy jar along with upgrade on node 3) once all nodes are upgraded, issue ALTER to change to the org.apache.cassandra TWCS class. Will that trigger full recompaction? On Fri, Sep 27, 2019 at 12:25 PM Carl Mueller

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-27 Thread Carl Mueller
ptions) throws ConfigurationException { return delegate.validateOptions(options); } } On Fri, Sep 27, 2019 at 11:58 AM Carl Mueller wrote: > Is this still the official answer on TWCS 2.X --> 3.X upgrades? Pull the > code and recompile as a different package? > > Can I just declar

Re: upgrading from 2.x TWCS to 3.x TWCS

2019-09-27 Thread Carl Mueller
Is this still the official answer on TWCS 2.X --> 3.X upgrades? Pull the code and recompile as a different package? Can I just declare the necessary class and package namespace and delegate to the actual main-codebase class? On Mon, Nov 5, 2018 at 1:41 AM Oleksandr Shulgin <

Re: Bootstrap keeps failing

2019-09-25 Thread Carl Mueller
We are experiencing bootstrap problems in 2.2.x and 3.11.x when clusters hit 30 nodes, across multiple datacenters. We will try some of the stuff here and hopefully it helps us. On Tue, Mar 12, 2019 at 11:46 AM Léo FERLIN SUTTON wrote: > Hello ! > > Just wanted to let you

Impact of a large number of components in column key/cluster key

2019-08-06 Thread Carl Mueller
Say there are 1 vs three vs five vs 8 parts of a column key. Will range slicing slow down the more parts there are? Will compactions be impacted?

Re: TWCS generates large numbers of sstables on only some nodes

2019-07-16 Thread Carl Mueller
files from the same bucket together rather than constantly append to the same sstable But that assumption is based on a superficial examination of the compactor code. On Tue, Jul 16, 2019 at 12:47 AM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Mon, Jul 15, 2019 at 6:2

TWCS generates large numbers of sstables on only some nodes

2019-07-15 Thread Carl Mueller
Related to our overstreaming, we have a cluster of about 25 nodes, with most at about 1000 sstable files (Data + others). And about four that are at 20,000 - 30,000 sstable files (Data+Index+etc). We have vertically scaled the outlier machines and turned off compaction throttling thinking it was

Re: Breaking up major compacted Sstable with TWCS

2019-07-15 Thread Carl Mueller
Does sstablesplit properly restore the time-bucketing of the data? That appears to be size-based only. On Fri, Jul 12, 2019 at 5:55 AM Rhys Campbell wrote: > > https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableSplit.html > > Leon Zaruvinsky wrote on

Re: TWCS: 2.2 ring expand massive overstream 100000 sstables

2019-07-10 Thread Carl Mueller
n to normal in three or four more days when the fragments expire. On Tue, Jul 9, 2019 at 11:12 AM Carl Mueller wrote: > The existing 15 node cluster had about 450-500GB/node, most in one TWCS > table. Data is applied with a 7-day TTL. Our cluster couldn't be expanded > due to a b

TWCS: 2.2 ring expand massive overstream 100000 sstables

2019-07-09 Thread Carl Mueller
The existing 15 node cluster had about 450-500GB/node, most in one TWCS table. Data is applied with a 7-day TTL. Our cluster couldn't be expanded due to a bit of political foot dragging, and new load of about 2x-3x started up around the time we started expanding. About 500 sstables per node, with

Re: TWCS: what happens on node replacement/streaming

2019-07-06 Thread Carl Mueller
Thanks Jeff. On Sat, Jul 6, 2019 at 12:16 PM Jeff Jirsa wrote: > The max timestamp for each sstable is in the metadata on each sstable, so > on streaming of any kind (bootstrap, repair, etc) sstables are added to > their correct and expected windows. > > > > > On Jul 6,

TWCS: what happens on node replacement/streaming

2019-07-06 Thread Carl Mueller
TWCS distributes its data by time buckets/flushes. But on node add/streaming, it doesn't have the natural ordering provided by the timing of the incoming update streams. So does TWCS properly reconstruct buckets on streaming/replacement?
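
The reply in this thread notes that each sstable's metadata carries its max timestamp, so the window can be derived from the sstable itself rather than from arrival order. A simplified sketch of that idea, assuming windows are found by rounding the max timestamp down to a multiple of the window width; the sstable names and timestamps are made up, and this is not the actual TWCS code:

```python
from datetime import datetime, timezone

# Seconds per window unit; names mirror TWCS's compaction_window_unit option.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_lower_bound(max_timestamp_s, unit="DAYS", size=1):
    """Round an sstable's max timestamp (epoch seconds) down to its window start."""
    width = UNIT_SECONDS[unit] * size
    return max_timestamp_s - (max_timestamp_s % width)


def group_by_window(sstables, unit="DAYS", size=1):
    """Group (name, max_timestamp_s) pairs by window, ignoring arrival order."""
    windows = {}
    for name, max_ts in sstables:
        windows.setdefault(window_lower_bound(max_ts, unit, size), []).append(name)
    return windows


def epoch(*args):
    """Epoch seconds for a UTC date/time given as datetime() arguments."""
    return int(datetime(*args, tzinfo=timezone.utc).timestamp())


# Hypothetical sstables arriving out of time order, as they might during streaming.
streamed = [
    ("sstable-c", epoch(2019, 7, 6, 1, 0)),
    ("sstable-a", epoch(2019, 7, 5, 10, 0)),
    ("sstable-b", epoch(2019, 7, 5, 23, 0)),
]
windows = group_by_window(streamed)
for start in sorted(windows):
    print(datetime.fromtimestamp(start, tz=timezone.utc).date(), windows[start])
```

Because the grouping key depends only on each sstable's own timestamp, the order in which the files arrive does not change which window they land in.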

Re: upgrade pinning to v3 protocol: massive drop in writes

2019-06-25 Thread Carl Mueller
Never mind. Once we looked further out on the metrics, it appears there was some huge bump about a month ago from the levels we see now. On Tue, Jun 25, 2019 at 1:35 PM Carl Mueller wrote: > Oh we are 2.2.13 currently, seems to be 3.7.1 for the java-driver > > On Tue, Jun 25, 2019 a

Re: upgrade pinning to v3 protocol: massive drop in writes

2019-06-25 Thread Carl Mueller
Oh we are 2.2.13 currently, seems to be 3.7.1 for the java-driver On Tue, Jun 25, 2019 at 1:11 PM Carl Mueller wrote: > We have an app that needs to be pinned to v3 protocol for the upgrade to > 3.11.X > > ... we rolled out the v3 "pinning" and the amount of write counts

upgrade pinning to v3 protocol: massive drop in writes

2019-06-25 Thread Carl Mueller
We have an app that needs to be pinned to v3 protocol for the upgrade to 3.11.X ... we rolled out the v3 "pinning" and the write counts and network traffic plummeted by 60-90%. The app seems to be functioning properly. Has anyone seen anything like this? Could it be "Custom Payloads"

Re: postmortem on 2.2.13 scale out difficulties

2019-06-12 Thread Carl Mueller
I posted a bug, cassandra-15155 : https://issues.apache.org/jira/browse/CASSANDRA-15155?jql=project%20%3D%20CASSANDRA It seems VERY similar to https://issues.apache.org/jira/browse/CASSANDRA-6648 On Wed, Jun 12, 2019 at 12:14 PM Carl Mueller wrote: > And once the cluster token map format

Re: postmortem on 2.2.13 scale out difficulties

2019-06-12 Thread Carl Mueller
- [Stream #05af9ee0-8d26-11e9-85c1-bd5476090c54] Executing streaming plan for Bootstrap INFO [main] 2019-06-12 15:23:25,526 StorageService.java:1199 - Bootstrap completed! for the tokens [-7314981925085449175, ... bunch of tokens... 5499447097629838103] On Wed, Jun 12, 2019 at 12:07 PM Carl Mueller

Re: postmortem on 2.2.13 scale out difficulties

2019-06-12 Thread Carl Mueller
> Have you tried Nodetool bootstrap resume & jvm option i.e. > JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false" ? > > > > > > *From:* Carl Mueller [mailto:carl.muel...@smartthings.com.INVALID] > *Sent:* Wednesday, June 12, 2019 11:35 AM > *To

Re: postmortem on 2.2.13 scale out difficulties

2019-06-12 Thread Carl Mueller
We're getting DEBUG [GossipStage:1] 2019-06-12 15:20:07,797 MigrationManager.java:96 - Not pulling schema because versions match or shouldPullSchemaFrom returned false multiple times, as it contacts the nodes. On Wed, Jun 12, 2019 at 11:35 AM Carl Mueller wrote: > We only were able to sc

Re: postmortem on 2.2.13 scale out difficulties

2019-06-12 Thread Carl Mueller
We only were able to scale out four nodes and then failures started occurring, including multiple instances of nodes joining a cluster without streaming. Sigh. On Tue, Jun 11, 2019 at 3:11 PM Carl Mueller wrote: > We had a three-DC (asia-tokyo/europe/us) cassandra 2.2.13 cluster, AWS, >

postmortem on 2.2.13 scale out difficulties

2019-06-11 Thread Carl Mueller
We had a three-DC (asia-tokyo/europe/us) cassandra 2.2.13 cluster, AWS, IPV6. Needed to scale out the asia datacenter, which was 5 nodes; europe and us were 25 nodes. We were running into bootstrapping issues where the new node failed to bootstrap/stream, it failed with

Re: schema for testing that has a lot of edge cases

2019-06-07 Thread Carl Mueller
think both projects might be of interest to > reach your goal. > > C*heers, > --- > Alain Rodriguez - al...@thelastpickle.com > France / Spain > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > > On Thu, May 23, 2019 at 21:25,

schema for testing that has a lot of edge cases

2019-05-23 Thread Carl Mueller
Does anyone have any schema / schema generation that can be used for general testing that has lots of complicated aspects and data? For example, it has a bunch of different rk/ck variations, column data types, altered /added columns and data (which can impact sstables and compaction),

Re: What happens to empty partitions?

2019-05-17 Thread Carl Mueller
Eventually compaction will remove the row when the sstable is merged/rewritten. On Fri, May 17, 2019 at 8:06 AM Tom Vernon wrote: > Hi, I'm having trouble getting my head around what happens to a partition > that no longer contains any data. As TTL is applied at the column level > (but not on

2.1 cassandra 1 node down produces replica shortfall

2019-05-17 Thread Carl Mueller
Being one of our largest and unfortunately heaviest multi-tenant clusters, and our last 2.1 prod cluster, we are encountering not enough replica errors (need 2, only found 1) after only bringing down 1 node. 90 node cluster, 30/dc, dcs are in europe, asia, and us. AWS. Are there bugs for

Re: Cassandra taking very long to start and server under heavy load

2019-05-07 Thread Carl Mueller
You may have encountered the same behavior we have encountered going from 2.1 --> 2.2 a week or so ago. We also have multiple data dirs. Hm. In our case, we will purge the data of the big offending table. How big are your nodes? On Tue, May 7, 2019 at 1:40 AM Evgeny Inberg wrote: > Still

Re: 2019 manual deletion of sstables

2019-05-07 Thread Carl Mueller
(repair would be done after all the nodes with obviously deletable sstables were deleted) (we may then do a purge program anyway) (this would seem to get rid of 60-90% of the purgable data without incurring a big round of tombstones and compaction) On Tue, May 7, 2019 at 12:05 PM Carl Mueller

2019 manual deletion of sstables

2019-05-07 Thread Carl Mueller
The last time I googled this, some people were doing it back in the 2.0.x days, and said you could do it if you brought a node down, removed the desired sstable #'s artifacts (Data/Index/etc), and then started up. Probably also with a clearing of the saved caches. A decent-ish amount of data (256G) in a 2.1

re: Trouble restoring with sstableloader

2019-04-18 Thread Carl Mueller
This is a response to a message from 2017 that I found unanswered on the user list, we were getting the same error. Also in this stackoverflow https://stackoverflow.com/questions/53160611/frame-size-352518912-larger-than-max-length-15728640-exception-while-runnin/55751104#55751104 I have noted

Re: 2.1.9 --> 2.2.13 upgrade node startup after upgrade very slow

2019-04-17 Thread Carl Mueller
On Wed, Apr 17, 2019 at 11:36 AM Jon Haddad wrote: > > > > Run the async java profiler on the node to determine what it's doing: > > https://github.com/jvm-profiling-tools/async-profiler > > > > On Wed, Apr 17, 2019 at 11:31 AM Carl Mueller > > wrote: >

Re: 2.1.9 --> 2.2.13 upgrade node startup after upgrade very slow

2019-04-17 Thread Carl Mueller
> > > > On Wed, Apr 17, 2019 at 10:30 AM Carl Mueller > wrote: > >> Oh, the table in question is SizeTiered, had about 10 sstables total, it >> was JBOD across two data directories. >> >> On Wed, Apr 17, 2019 at 12:26 PM Carl Mueller < >> carl.

Re: 2.1.9 --> 2.2.13 upgrade node startup after upgrade very slow

2019-04-17 Thread Carl Mueller
Oh, the table in question is SizeTiered, had about 10 sstables total, it was JBOD across two data directories. On Wed, Apr 17, 2019 at 12:26 PM Carl Mueller wrote: > We are doing a ton of upgrades to get out of 2.1.x. We've done probably > 20-30 clusters so far and have not encountered an

2.1.9 --> 2.2.13 upgrade node startup after upgrade very slow

2019-04-17 Thread Carl Mueller
We are doing a ton of upgrades to get out of 2.1.x. We've done probably 20-30 clusters so far and have not encountered anything like this yet. After upgrade of a node, the restart takes a long time, like 10 minutes. Almost all of our other nodes took less than 2 minutes to upgrade (aside

Re: cass-2.2 trigger - how to get clustering columns and value?

2019-04-11 Thread Carl Mueller
later if this method is of interest. > > Thanks > > Paul Chandler > > > On 10 Apr 2019, at 22:52, Carl Mueller > > > wrote: > > > > We have a multitenant cluster that we can't upgrade to 3.x easily, and > we'd like to migrate some apps off of the shared cl

Re: How to install an older minor release?

2019-04-10 Thread Carl Mueller
You'll have to set up a local repo like artifactory. On Wed, Apr 3, 2019 at 4:33 AM Kyrylo Lebediev wrote: > Hi Oleksandr, > > Yes, that was always the case. All older versions are removed from Debian > repo index :( > > > > *From: *Oleksandr Shulgin > *Reply-To: *"user@cassandra.apache.org" >

cass-2.2 trigger - how to get clustering columns and value?

2019-04-10 Thread Carl Mueller
We have a multitenant cluster that we can't upgrade to 3.x easily, and we'd like to migrate some apps off of the shared cluster to dedicated clusters. This is a 2.2 cluster. So I'm trying a trigger to track updates while we transition and will send via kafka. Right now I'm just trying to extract

Re: Best practices while designing backup storage system for big Cassandra cluster

2019-04-02 Thread Carl Mueller
1, 2019 at 1:30 PM Carl Mueller wrote: > At my current job I had to roll my own backup system. Hopefully I can get > it OSS'd at some point. Here is a (now slightly outdated) presentation: > > > https://docs.google.com/presentation/d/13Aps-IlQPYAa_V34ocR0E8Q4C8W2YZ6Jn5_BYGrjqFk/

Re: Best practices while designing backup storage system for big Cassandra cluster

2019-04-01 Thread Carl Mueller
At my current job I had to roll my own backup system. Hopefully I can get it OSS'd at some point. Here is a (now slightly outdated) presentation: https://docs.google.com/presentation/d/13Aps-IlQPYAa_V34ocR0E8Q4C8W2YZ6Jn5_BYGrjqFk/edit#slide=id.p If you are struggling with the disk I/O cost of

Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-27 Thread Carl Mueller
We are probably going to just have a VM startup script for now that automatically updates the yaml on instance restart. It seems to be the least-sucky approach at this point. On Wed, Mar 27, 2019 at 12:36 PM Carl Mueller wrote: > I filed https://issues.apache.org/jira/browse/CASSANDRA-15

Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-27 Thread Carl Mueller
I filed https://issues.apache.org/jira/browse/CASSANDRA-15068 EIPs per the aws experts cost money, are limited in resources (we have a lot of VMs) and cause a lot of headaches in our autoscaling / infrastructure as code systems. On Wed, Mar 27, 2019 at 12:35 PM Carl Mueller wrote: > I'll

Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-27 Thread Carl Mueller
< oleksandr.shul...@zalando.de> wrote: > On Tue, Mar 26, 2019 at 10:28 PM Carl Mueller > wrote: > >> - the AWS people say EIPs are a PITA. >> > > Why? > > >> - if we hardcode the global IPs in the yaml, then yaml editing is >> required for

Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Carl Mueller
address: 0.0.0.0 and broadcast_rpc_address being blank/null/commented out. That section of code may need an exception for EC2MRS. On Tue, Mar 26, 2019 at 12:01 PM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Tue, Mar 26, 2019 at 5:49 PM Carl Mueller > wrote: > >> Lookin

Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Carl Mueller
system.peers entries, which I expected to happen. On Tue, Mar 26, 2019 at 3:33 AM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Mon, Mar 25, 2019 at 11:13 PM Carl Mueller > wrote: > >> >> Since the internal IPs are given when the client app connects to the >>

Re: Merging two cluster's in to one without any downtime

2019-03-25 Thread Carl Mueller
Either: double-write at the driver level from one of the apps and perform an initial and a subsequent sstable load (or whatever ETL method you want to use) to merge the data with good assurances. Or use a trigger to replicate the writes, with some sstable loads / ETL. Or use change data capture with

upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-25 Thread Carl Mueller
This is a multi-dc cluster with public IPs for the nodes and also addressed with private IPs as well in AWS. The apps connect via java-driver to a public IP. When we built the 2.1.X cluster with ec2multiregionsnitch, the system.peers table had public ips for the nodes in the rpc_address column.

cassandra upgrades multi-DC in parallel

2019-03-12 Thread Carl Mueller
If there are multiple DCs in a cluster, is it safe to upgrade them in parallel, with each DC doing a node-at-a-time?

Re: Released an ACID-compliant transaction library on top of Cassandra

2019-01-16 Thread Carl Mueller
"2) Overview: In essence, the protocol calls for each data item to maintain the last committed and perhaps also the currently active version, for the data and relevant metadata. Each version is tagged with meta-data pertaining to the transaction that created it. This includes the transaction

Re: rolling version upgrade, upgradesstables, and vulnerability window

2018-10-30 Thread Carl Mueller
:39 AM Alexander Dejanovski < a...@thelastpickle.com> wrote: > Yes, as the new version can read both the old and the new sstables format. > > Restrictions only apply when the cluster is in mixed versions. > > On Tue, Oct 30, 2018 at 4:37 PM Carl Mueller > wrote: >

Re: rolling version upgrade, upgradesstables, and vulnerability window

2018-10-30 Thread Carl Mueller
. schema changes won’t cross version) that can be far more > impactful. > > > > -- > Jeff Jirsa > > > > On Oct 30, 2018, at 8:21 AM, Carl Mueller > > > wrote: > > > > We are about to finally embark on some version upgrades for lots of > clusters

comprehensive list of checks before rolling version upgrades

2018-10-30 Thread Carl Mueller
Does anyone have a pretty comprehensive list of these? Many that I don't currently know how to check but I'm researching... I've seen: - verify disk space available for snapshot + sstablerewrite - gossip state agreement, all nodes are healthy - schema state agreement - ability to access all the

rolling version upgrade, upgradesstables, and vulnerability window

2018-10-30 Thread Carl Mueller
We are about to finally embark on some version upgrades for lots of clusters, 2.1.x and 2.2.x targeting eventually 3.11.x. I have seen recipes that do the full binary upgrade + upgrade sstables for 1 node before moving forward, while I've seen a 2016 vote by Jon Haddad (a TLP guy) that backs

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-16 Thread Carl Mueller
Your dashboards are great. The only challenge is getting all the data to feed them. On Tue, Oct 16, 2018 at 1:45 PM Carl Mueller wrote: > metadata.csv: that helps a lot, thank you! > > On Fri, Oct 5, 2018 at 5:42 AM Alain RODRIGUEZ wrote: > >> I feel you for most of the

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-16 Thread Carl Mueller
k > at some point in the future. > > Good luck, > C*heers, > --- > Alain Rodriguez - @arodream - al...@thelastpickle.com > France / Spain > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > On Thu, Oct 4

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-04 Thread Carl Mueller
hey got the same name/label for C*2.1, 2.2 and 3+ on Datadog. There is an > abstraction layer that removes this complexity (if I remember well, we > built those dashboards a while ago). > > C*heers > --- > Alain Rodriguez - @arodream - al...@thelastpickle.com >

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-01 Thread Carl Mueller
C*heers, > --- > Alain Rodriguez - @arodream - al...@thelastpickle.com > France / Spain > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > On Fri, Sep 28, 2018 at 20:38, Carl Mueller > wrote: > >> VERY NI

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-09-28 Thread Carl Mueller
> On Fri, 28 Sep 2018 at 19:04, Carl Mueller > wrote: > >> It's my understanding that metrics got heavily re-namespaced in JMX for >> 2.2 from 2.1 >> >> Did anyone ever make a migration matrix/guide for conversion of old >> metrics to new metrics? >> >> >>

Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-09-28 Thread Carl Mueller
It's my understanding that metrics got heavily re-namespaced in JMX for 2.2 from 2.1. Did anyone ever make a migration matrix/guide for conversion of old metrics to new metrics?

Re: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-30 Thread Carl Mueller
- Range aware compaction strategy that subdivides data by the token range could help for this: you only back up data for the primary node and not the replica data - yes, if you want to use nodetool refresh as some sort of recovery solution, MAKE SURE YOU STORE THE TOKEN LIST with the

"minimum backup" in vnodes

2018-08-15 Thread Carl Mueller
Goal: back up a cluster with the minimum amount of data. Restore to be done with sstableloader. Let's start with a basic case: - six node cluster - one datacenter - RF3 - data is perfectly replicated/repaired - Manual tokens (no vnodes) - simplest strategy In this case, it is (theoretically)

Determining active sstables and table- dir

2018-04-27 Thread Carl Mueller
In cases where a table was dropped and re-added, there are now two table directories with different uuids with sstables. If you don't have knowledge of which one is active, how do you determine which is the active table directory? I have tried cf_id from system.schema_columnfamilies and that can

Re: cassl 2.1.x seed node update via JMX

2018-03-22 Thread Carl Mueller
> Previously (as described in the ticket above), the seed node list is only > updated when doing a shadow round, removing an endpoint or restarting (look > for callers of o.a.c.gms.Gossiper#buildSeedsList() if you're curious). > > A rolling restart is the usual SOP for that. > > On F

cassl 2.1.x seed node update via JMX

2018-03-22 Thread Carl Mueller
We have a cluster that is subject to the one-year gossip bug. We'd like to update the seed node list via JMX without restart, since our foolishly single-seed-node in this forsaken cluster is being autoculled in AWS. Is this possible? It is not marked volatile in the Config of the source code, so

Re: [EXTERNAL] Cassandra vs MySQL

2018-03-20 Thread Carl Mueller
Yes, cassandra's big win is that once you get your data and applications adapted to the platform, you have a clear path to very very large scale and resiliency. Um, assuming you have the dollars. It scales out on commodity hardware, but isn't exactly efficient in the use of that hardware. I like

Re: One time major deletion/purge vs periodic deletion

2018-03-20 Thread Carl Mueller
It's possible you'll run into compaction headaches. Likely actually. If you have time-bucketed purge/archives, I'd implement a time bucketing strategy using rotating tables dedicated to a time period so that when an entire table is ready for archiving you just snapshot its sstables and then
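
The rotating-table idea above needs some rule that maps a write's timestamp to the table for its time period. A minimal sketch of such a rule, assuming one table per calendar month and a hypothetical base name `events`; the message does not specify a period or naming scheme, and what happens to a table after it is snapshotted is not covered here:

```python
from datetime import datetime, timezone


def bucket_table(base_name, when):
    """Hypothetical naming scheme: one table per calendar month."""
    return f"{base_name}_{when:%Y_%m}"


def tables_for_range(base_name, start, end):
    """List every monthly table a query spanning [start, end] would need to read."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{base_name}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


write_time = datetime(2018, 3, 20, tzinfo=timezone.utc)
print(bucket_table("events", write_time))
print(tables_for_range("events", datetime(2017, 12, 15), datetime(2018, 2, 1)))
```

The trade-off in this sketch is that reads spanning several periods have to fan out across tables, while archiving a whole period only touches one table.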

Re: Cassandra vs MySQL

2018-03-14 Thread Carl Mueller
THERE ARE NO JOINS WITH CASSANDRA. CQL != SQL. Same for aggregation, subqueries, etc. And effectively multitable transactions are out. If you have simple single-table queries and updates, or can convert the app to do so, then you're in business. On Tue, Mar 13, 2018 at 5:02 AM, Rahul Singh

Re: Cassandra at Instagram with Dikang Gu interview by Jeff Carpenter

2018-03-12 Thread Carl Mueller
Again, I'd really like to get a feel for scylla vs rocksandra vs cassandra. Isn't the driver binary protocol the easiest / least redesign level of storage engine swapping? Scylla and Cassandra and Rocksandra are currently three options. Rocksandra can expand out its non-java footprint without

Re: data types storage saving

2018-03-06 Thread Carl Mueller
If you're willing to do the data type conversion in insert and retrieval, then you could use blobs as a sort of "adaptive length int" AFAIK On Tue, Mar 6, 2018 at 6:02 AM, onmstester onmstester wrote: > I'm using int data type for one of my columns but for 99.99...% its data
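
A rough sketch of the client-side conversion the "adaptive length int" suggestion implies, assuming the column is declared as a blob and the application encodes on insert and decodes on read. The function names `encode_int` and `decode_int` are hypothetical; this only shows the byte packing, not any driver call.

```python
def encode_int(value: int) -> bytes:
    """Pack a signed int into the fewest big-endian two's-complement bytes."""
    # Bits needed for the magnitude, plus one sign bit, rounded up to whole bytes.
    magnitude_bits = (value if value >= 0 else ~value).bit_length()
    length = (magnitude_bits + 8) // 8
    return value.to_bytes(length, "big", signed=True)


def decode_int(blob: bytes) -> int:
    """Inverse of encode_int."""
    return int.from_bytes(blob, "big", signed=True)


if __name__ == "__main__":
    for v in (0, 5, 127, 128, -128, -129, 2**31 - 1):
        b = encode_int(v)
        print(v, len(b), decode_int(b))
```

Small values take one byte instead of the four a fixed int uses; the trade-off is that the column is opaque to CQL, so range comparisons and ordering on it no longer behave numerically.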

Re: Rocksandra blog post

2018-03-06 Thread Carl Mueller
Basically they are avoiding gc, right? Not necessarily improving on the theoreticals of sstables and LSM trees. Why didn't they use/try scylla? I'd be interested to see that benchmark. On Tue, Mar 6, 2018 at 3:48 AM, Romain Hardouin wrote: > Rocksandra is very

Re: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Carl Mueller
a docker image to build them so you don’t need to mess with > sphinx. Check the README for instructions. > > Jon > > > On Feb 27, 2018, at 9:49 AM, Carl Mueller <carl.muel...@smartthings.com> > wrote: > > > If there was a github for the docs, we could start

Re: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Carl Mueller
hings simple. Did I miss > something? What does it matter right now? > > > > Thanks Carl, > > > > Kenneth Brotman > > > > *From:* Carl Mueller [mailto:carl.muel...@smartthings.com] > *Sent:* Tuesday, February 27, 2018 8:50 AM > *To:* user@cassandra.apache

Re: Data Deleted After a few days of being off

2018-02-27 Thread Carl Mueller
Does cassandra still function if the commitlog dir has no writes? Will the data still go into the memtable and serve queries? On Tue, Feb 27, 2018 at 1:37 AM, Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Tue, Feb 27, 2018 at 7:37 AM, A wrote: > >> >> I

Re: Version Rollback

2018-02-27 Thread Carl Mueller
My speculation is that IF (bigif) the sstable formats are compatible between the versions, which probably isn't the case for major versions, then you could drop back. If the sstables changed format, then you'll probably need to figure out how to rewrite the sstables in the older format and then

Re: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Carl Mueller
so... are those pages in the code tree of github? I don't see them or a directory structure under /doc. Is mirroring the documentation between the apache site and a github source a big issue? On Tue, Feb 27, 2018 at 7:50 AM, Kenneth Brotman < kenbrot...@yahoo.com.invalid> wrote: > I was debating

Re: Cluster Repairs 'nodetool repair -pr' Cause Severe Increase in Read Latency After Shrinking Cluster

2018-02-22 Thread Carl Mueller
1386179.89 > 14530764 4 > > > > key_space_01/cf_01 histograms > > Percentile | SSTables | Write Latency (micros) | Read Latency (micros) | Partition Size (bytes) | Cell Count > > 50%

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
Also, I was wondering if the key cache maintains a count of how many local accesses a key undergoes. Such information might be very useful for compactions of sstables by splitting data by frequency of use so that those can be preferentially compacted. On Wed, Feb 21, 2018 at 5:08 PM, Carl Mueller

Re: Memtable flush -> SSTable: customizable or same for all compaction strategies?

2018-02-21 Thread Carl Mueller
to explicitly exclude the loadup of any files/sstable components that are CUSTOM in SStable.java On Wed, Feb 21, 2018 at 10:05 AM, Carl Mueller <carl.muel...@smartthings.com > wrote: > jon: I am planning on writing a custom compaction strategy. That's why the > question is here, I figured t

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Carl Mueller
Cass 2.1.14 is missing some wide row optimizations done in later cass releases IIRC. Speculation: IN won't matter, it will load the entire wide row into memory regardless, which might spike your GC/heap and overflow the rowcache On Wed, Feb 21, 2018 at 2:16 PM, Gareth Collins
