Re: merge two cluster
Thank you all. You saved me time and resources.

Regards,
Osman

From: Jon Haddad
Sent: Thursday, October 24, 2019 12:13:45 AM
To: user@cassandra.apache.org
Subject: Re: merge two cluster

Probably not beneficial; I wouldn't do it. I'm not a fan of multi-tenancy with Cassandra unless the use cases are so small that your noisy neighbor problem is not very noisy at all. For those cases I don't know what you get from Cassandra other than a cool resume.

On Wed, Oct 23, 2019 at 12:41 PM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:

I haven’t seen much evidence that larger cluster = more performance, plus or minus the statistics of speculative retry. It horizontally scales for storage definitely, and somewhat for connection volume. If anything, per Sean’s observation, you have less ability to have a stable tuning for a particular usage pattern.

Try to have a mental picture of what you think is happening in the JVM while Cassandra is running. There are short-lived objects, medium-lived objects, long/static-lived objects, and behind the scenes some degree of read I/O and write I/O against disk. Garbage collectors struggle badly with medium-lived objects, but Cassandra really depends a great deal on those.

If you merge two clusters together, within any one node you still have the JVM size and disk architecture you had before, but you are adding competition on fixed resources, and potentially in the very way they find most difficult to handle. If those resources were heavily underutilized, like Sean’s point about merging small apps together, then sure. But if those two clusters of yours are already showing that they experience significant load, then you are unlikely to improve anything and far more likely to end up worse off. GC overhead and compaction flushes to disk are your challenges; merging two clusters doesn’t change the physics of those two areas, but could increase the demand on them.
The only caveat to all of the above I can think of is if there was a fault-tolerance story motivating the merging. Like “management wants us in two AZs in AWS, but lacks the budget for more instances, and each pool by itself is too small for us to come up with a 2 rack organization that makes sense”.

R

From: Osman YOZGATLIOĞLU <osman.yozgatlio...@kron.com.tr>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, October 23, 2019 at 10:40 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: merge two cluster

Sorry, I left out the question. I'm actually asking from a performance perspective. At the application level, both clusters are used at the same time and at approximately the same level. Data is inserted into both clusters, different parts of course. If I merge the two clusters, can I gain some performance improvement? Like RAID stripes: more disks, more stripes, more speed.

Regards

On 23.10.2019 17:30, Durity, Sean R wrote:

Beneficial to whom? The apps, the admins, the developers?

I suggest that app teams have separate clusters per application. This prevents the noisy neighbor problem, isolates any security issues, and helps when it is time for maintenance, upgrade, performance testing, etc. to not have to coordinate multiple app teams at the same time. Also, an individual cluster can be tuned for its specific workload.

Sometimes, though, costs and data size push us towards combining smaller apps owned by the same team onto a single cluster. Those are the exceptions.

As a Cassandra admin, I am always trying to scale the ability to admin multiple clusters without just adding new admins. That is an on-going task, dependent on your operating environment.

Also, because every table has a portion of memory (memtable), there is a practical limit to the number of tables that any one cluster should have.
I have heard it is in the low hundreds of tables. This puts a limit on the number of applications that a cluster can safely support.

Sean Durity – Staff Systems Engineer, Cassandra

From: Osman YOZGATLIOĞLU <osman.yozgatlio...@kron.com.tr>
Sent: Wednesday, October 23, 2019 6:23 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] merge two cluster

Hello,

I have two clusters, each containing a different data set and a different node count. Would it be beneficial to merge the two clusters?

Regards,
Osman
Re: merge two cluster
Sorry, I left out the question. I'm actually asking from a performance perspective. At the application level, both clusters are used at the same time and at approximately the same level. Data is inserted into both clusters, different parts of course. If I merge the two clusters, can I gain some performance improvement? Like RAID stripes: more disks, more stripes, more speed.

Regards

On 23.10.2019 17:30, Durity, Sean R wrote:

Beneficial to whom? The apps, the admins, the developers?

I suggest that app teams have separate clusters per application. This prevents the noisy neighbor problem, isolates any security issues, and helps when it is time for maintenance, upgrade, performance testing, etc. to not have to coordinate multiple app teams at the same time. Also, an individual cluster can be tuned for its specific workload.

Sometimes, though, costs and data size push us towards combining smaller apps owned by the same team onto a single cluster. Those are the exceptions.

As a Cassandra admin, I am always trying to scale the ability to admin multiple clusters without just adding new admins. That is an on-going task, dependent on your operating environment.

Also, because every table has a portion of memory (memtable), there is a practical limit to the number of tables that any one cluster should have. I have heard it is in the low hundreds of tables. This puts a limit on the number of applications that a cluster can safely support.

Sean Durity – Staff Systems Engineer, Cassandra

From: Osman YOZGATLIOĞLU <osman.yozgatlio...@kron.com.tr>
Sent: Wednesday, October 23, 2019 6:23 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] merge two cluster

Hello,

I have two clusters, each containing a different data set and a different node count. Would it be beneficial to merge the two clusters?

Regards,
Osman

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee.
Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.
merge two cluster
Hello,

I have two clusters, each containing a different data set and a different node count. Would it be beneficial to merge the two clusters?

Regards,
Osman
Re: Max number of windows when using TWCS
Hello,

By the way, about https://issues.apache.org/jira/browse/CASSANDRA-13418, I'm not sure how to apply this solution. Do you have a guide for it?

Regards,
Osman

On 12.02.2019 01:42, Nitan Kainth wrote:

That’s right, Jeff. That’s why I am wondering: why doesn't compaction get rid of old expired sstables?

Regards,
Nitan
Cell: 510 449 9629

On Feb 11, 2019, at 3:53 PM, Jeff Jirsa <jji...@gmail.com> wrote:

It's probably not safe. You shouldn't touch the underlying sstables unless you're very sure you know what you're doing.

On Mon, Feb 11, 2019 at 1:05 PM Akash Gangil <akashg1...@gmail.com> wrote:

I have in the past tried to delete SSTables manually, but have noticed that bits and pieces of that data still remain, even though the sstables of that window are deleted. So I have always wondered whether playing directly with the underlying filesystem is a safe bet.

On Mon, Feb 11, 2019 at 1:01 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

Deleting SSTables manually can be useful if you don't know your TTL up front. For example, you have an ETL process that moves your raw Cassandra data into S3 as parquet files, and you want to be sure that process is completed before you delete the data. You could also start out without setting a TTL and later realize you need one. This is a remarkably common problem.

On Mon, Feb 11, 2019 at 12:51 PM Nitan Kainth <nitankai...@gmail.com> wrote:

Jeff,

Does it mean we have to delete sstables manually?

Regards,
Nitan
Cell: 510 449 9629

On Feb 11, 2019, at 2:40 PM, Jeff Jirsa <jji...@gmail.com> wrote:

There's a bit of headache around overlapping sstables being strictly safe to delete. https://issues.apache.org/jira/browse/CASSANDRA-13418 was added to allow the "I know it's not technically safe, but just delete it anyway" use case. For a lot of people who started using TWCS before 13418, "stop cassandra, remove stuff we know is expired, start cassandra" is a not-uncommon pattern in very high-write, high-disk-space use cases.
On Mon, Feb 11, 2019 at 12:34 PM Nitan Kainth <nitankai...@gmail.com> wrote:

Hi,

Regarding the comment “Purging data is also straightforward, just dropping SSTables (by a script) where create date is older than a threshold, we don't even need to rely on TTL”:

Don't the old sstables drop by themselves? Once the TTL and gc_grace_seconds have passed, the whole sstable will contain only tombstones.

Regards,
Nitan
Cell: 510 449 9629

On Feb 11, 2019, at 2:23 PM, DuyHai Doan <doanduy...@gmail.com> wrote:

Purging data is also straightforward, just dropping SSTables (by a script) where create date is older than a threshold, we don't even need to rely on TTL

--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade

--
Akash
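
On the "how to apply CASSANDRA-13418" question at the top of this message: as far as I understand it, the ticket adds a TWCS compaction sub-option, unsafe_aggressive_sstable_expiration, which is only honored when the node is also started with the JVM system property -Dcassandra.allow_unsafe_aggressive_sstable_expiration=true. The sketch below just renders those two settings as text. The keyspace and table names and the window settings are hypothetical placeholders, and you should confirm against the release notes that your Cassandra version actually contains the change (I don't believe the 3.0 line does) before relying on it.

```python
# Sketch only: renders the settings discussed in CASSANDRA-13418 as text.
# Keyspace/table names and the window settings are hypothetical placeholders.

# JVM system property that, to my understanding, must be set on every node
# before the table-level option below has any effect.
JVM_FLAG = "-Dcassandra.allow_unsafe_aggressive_sstable_expiration=true"


def twcs_alter_statement(keyspace, table, window_unit="DAYS", window_size=1):
    """Build an ALTER TABLE statement that enables aggressive expiration on TWCS."""
    options = {
        "class": "TimeWindowCompactionStrategy",
        "compaction_window_unit": window_unit,
        "compaction_window_size": str(window_size),
        "unsafe_aggressive_sstable_expiration": "true",
    }
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


if __name__ == "__main__":
    print("Add to the JVM options on every node, then restart:")
    print("  " + JVM_FLAG)
    print("Then run in cqlsh:")
    print("  " + twcs_alter_statement("metrics", "samples"))
```

As the option name says, this is the "not technically safe" path Jeff describes: fully expired sstables are dropped without checking whether they overlap newer data, so it only makes sense when every write to the table carries a TTL.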
Re: removing already joining node
Thank you for the clarification.

Regards,
Osman

On 13.01.2019 11:24, Jürgen Albersdorfer wrote:

Just turn it off. There is no persistent change to the cluster until the node has finished bootstrapping and is in status UN.

On 12.01.2019 at 22:36, Osman YOZGATLIOĞLU <osman.yozgatlio...@krontech.com> wrote:

Hello,

I have one joining node. I have decided to change the cluster topology and I need to move this node to another cluster. How can I decommission a joining node? I can't find this exact case on Google.

Regards,
Osman
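
To make the "status UN" remark concrete: nodetool status prints a two-letter code per node, where the first letter is Up/Down and the second is Normal/Leaving/Joining/Moving, so a bootstrapping node shows as UJ until it finishes and becomes UN. Here is a minimal sketch that picks the still-joining nodes out of that output. The sample text and addresses are made up for illustration and are not output from any cluster in this thread.

```python
# Second letter of the two-letter code printed by `nodetool status`.
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def joining_nodes(status_output):
    """Return addresses of nodes whose state letter is J (still bootstrapping)."""
    found = []
    for line in status_output.splitlines():
        parts = line.split()
        if len(parts) < 2 or len(parts[0]) != 2:
            continue  # blank lines, datacenter banners, legend text
        status, state = parts[0][0], parts[0][1]
        if status in "UD" and state in STATE and state == "J":
            found.append(parts[1])
    return found


# Hypothetical sample, shaped like `nodetool status` output.
SAMPLE = """\
--  Address    Load       Tokens  Owns  Host ID  Rack
UN  10.0.0.1   1.2 TiB    256     ?     aaaa     rack1
UJ  10.0.0.9   300 GiB    256     ?     bbbb     rack1
"""

if __name__ == "__main__":
    print(joining_nodes(SAMPLE))
```

A node that this reports as joining is the case Jürgen describes: it can simply be stopped.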
removing already joining node
Hello,

I have one joining node. I have decided to change the cluster topology and I need to move this node to another cluster. How can I decommission a joining node? I can't find this exact case on Google.

Regards,
Osman
multiple node bootstrapping
Hello,

I have a 2-DC Cassandra 3.0.14 setup. I need to add 2 new nodes to each DC. I started one node in dc1 and it's already joining. 3 TB of 50 TB finished in 2 weeks. It is time series data with a one-year TTL, using TWCS. I know, it's not best practice.

I want to start one node in dc2, but Cassandra refused to start, mentioning that one node is already in the joining state. I found a workaround with JMX directives, but I'm not sure if I broke something along the way.

Is it wise to bootstrap in both DCs at the same time?

Regards,
Osman
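
For a sense of scale, the figures above can be extrapolated linearly. This is a rough sketch only: it assumes the 50 TB is the total the joining node has to stream and that the rate seen in the first two weeks holds, neither of which the message confirms.

```python
def bootstrap_eta_weeks(done_tb, total_tb, elapsed_weeks):
    """Linear extrapolation of the remaining streaming time, in weeks."""
    rate = done_tb / elapsed_weeks  # TB streamed per week so far
    return (total_tb - done_tb) / rate


if __name__ == "__main__":
    # Figures from the message: 3 TB of 50 TB done after 2 weeks.
    remaining = bootstrap_eta_weeks(done_tb=3, total_tb=50, elapsed_weeks=2)
    print(f"about {remaining:.1f} weeks remaining")
```

At 1.5 TB per week, the remaining 47 TB would take roughly 31 more weeks, which is a large fraction of the one-year TTL on the data being streamed.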