cassandra-external-file-seed-provider: manage your list of seeds via an external file
Hello, I wanted to announce a small project I worked on a while ago that may be useful to other people: https://github.com/multani/cassandra-external-file-seed-provider This is a simple seed provider that fetches the list of seeds from an externally managed file. The original goal was to fetch the list of seeds from a specific key in our Consul cluster and save it in a dedicated file using Consul Template, without having to update the whole cassandra.yaml file. I believe this is general enough to help other people as well. Best, Jonathan
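For illustration, wiring such a provider into cassandra.yaml might look like the following sketch (the class name and the `seeds_file` parameter here are hypothetical; check the project's README for the actual values):

```
# cassandra.yaml -- sketch only; class name and parameter name are assumptions
seed_provider:
    - class_name: com.github.multani.cassandra.ExternalFileSeedProvider
      parameters:
          - seeds_file: /etc/cassandra/seeds.txt
```

where /etc/cassandra/seeds.txt would contain a plain list of seed IP addresses, kept up to date by an external tool such as Consul Template.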
Re: How seed nodes are working and how to upgrade/replace them?
On Tue, 8 Jan 2019 at 18:29, Jeff Jirsa wrote: > Given Consul's popularity, seems like someone could make an argument that > we should be shipping a consul-aware seed provider. > Elasticsearch has a very handy dedicated file-based discovery system: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#file-based-hosts-provider It's similar to what Cassandra's built-in SimpleSeedProvider does, but it doesn't require keeping the *whole* cassandra.yaml file up to date, and a dedicated file is probably simpler to watch dynamically for changes. Ultimately, there are plenty of external applications that could be used to pull in information from your favorite service discovery tool (etcd, Consul, etc.) or configuration management system and keep this file up to date, without needing a plugin for every system out there.
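As a sketch of that approach, a Consul Template snippet along these lines could render the seeds file from a registered service (the service name "cassandra" and the output format are assumptions; adapt to your setup):

```
{{- range $i, $node := service "cassandra" -}}
{{- if $i }},{{ end }}{{ $node.Address }}
{{- end }}
```

Consul Template would re-render the file whenever the service's membership changes; the seed provider then only has to re-read it.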
Re: How seed nodes are working and how to upgrade/replace them?
On Tue, 8 Jan 2019 at 18:39, Jeff Jirsa wrote:

> On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet wrote:
>
>> Hi Jeff,
>>
>> thanks for answering most of my points! From the reloadseeds' ticket, I followed to https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very instructive, although a bit old.
>>
>> On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote:
>>
>>> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote:
>>> >
>>> [...]
>>>
>>> > In essence, in my example that would be:
>>> >
>>> > - decide that #2 and #3 will be the new seed nodes
>>> > - update all the configuration files of all the nodes to write the IP addresses of #2 and #3
>>> > - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts
>>> >
>>> > * If I can manage to sort my Cassandra nodes by their age, could it be a strategy to have the seeds set to the 2 oldest nodes in the cluster? (This implies these nodes would change as the cluster's nodes get upgraded/replaced).
>>>
>>> You could do this, seems like a lot of headache for little benefit. Could be done with simple seed provider and config management (puppet/chef/ansible) laying down new yaml or with your own seed provider
>>
>> So, just to make it clear: sorting by age isn't a goal in itself, it was just an example of how I could get a stable list.
>>
>> Right now, we have a dedicated group of seed nodes + a dedicated group of non-seeds: doing a rolling upgrade of the nodes from the second list is relatively painless (although slow), whereas for the first group we are facing the issues discussed in CASSANDRA-3829: seed nodes are not bootstrapping automatically and we need to operate them in a more careful way.
>
> Rolling upgrade shouldn't need to re-bootstrap. Only replacing a host should need a new bootstrap. That should be a new host in your list, so it seems like this should be fairly rare?

Sorry, that's internal pidgin: by "rolling upgrade" I meant replacing all the nodes in a rolling fashion.

>> What I'm really looking for is a way to simplify adding and removing nodes into our (small) cluster: I can easily provide a small list of nodes from our cluster with our config management tool so that new nodes are discovering the rest of the cluster, but the documentation seems to imply that seed nodes also have other functions and I'm not sure what problems we could face trying to simplify this approach.
>>
>> Ideally, what I would like to have would be:
>>
>> * Considering a stable cluster (no new nodes, no nodes leaving), the N seeds should always be the same N nodes
>> * Adding new nodes should not change that list
>> * Stopping/removing one of these N nodes should "promote" another (non-seed) node as a seed
>>   - that would not restart the already running Cassandra nodes but would update their configuration files.
>>   - if a node restarts for whatever reason it would pick up this new configuration
>>
>> So: no node would start its life as a seed, only a few already existing nodes would have this status. We would not have to deal with the "a seed node doesn't bootstrap" problem and it would make our operation process simpler.
>>
>>> > I also have some more general questions about seed nodes and how they work:
>>> >
>>> > * I understand that seed nodes are used when a node starts and needs to discover the rest of the cluster's nodes. Once the node has joined and the cluster is stable, are seed nodes still playing a role in day to day operations?
>>>
>>> They're used probabilistically in gossip to encourage convergence. Mostly useful in large clusters.
>>
>> How "large" are we speaking here? How many nodes would it start to be considered "large"?
>
> ~800-1000

Alllrriigght, we still have a long way to go :)

Jonathan
Re: How seed nodes are working and how to upgrade/replace them?
Hi Jeff,

thanks for answering most of my points!

From the reloadseeds' ticket, I followed to https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very instructive, although a bit old.

On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote:

> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote:
> >
> [...]
>
> > In essence, in my example that would be:
> >
> > - decide that #2 and #3 will be the new seed nodes
> > - update all the configuration files of all the nodes to write the IP addresses of #2 and #3
> > - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts
> >
> > * If I can manage to sort my Cassandra nodes by their age, could it be a strategy to have the seeds set to the 2 oldest nodes in the cluster? (This implies these nodes would change as the cluster's nodes get upgraded/replaced).
>
> You could do this, seems like a lot of headache for little benefit. Could be done with simple seed provider and config management (puppet/chef/ansible) laying down new yaml or with your own seed provider

So, just to make it clear: sorting by age isn't a goal in itself, it was just an example of how I could get a stable list.

Right now, we have a dedicated group of seed nodes + a dedicated group of non-seeds: doing a rolling upgrade of the nodes from the second list is relatively painless (although slow), whereas for the first group we are facing the issues discussed in CASSANDRA-3829: seed nodes are not bootstrapping automatically and we need to operate them in a more careful way.

What I'm really looking for is a way to simplify adding and removing nodes in our (small) cluster: I can easily provide a small list of nodes from our cluster with our config management tool so that new nodes discover the rest of the cluster, but the documentation seems to imply that seed nodes also have other functions and I'm not sure what problems we could face trying to simplify this approach.
Ideally, what I would like to have would be:

* Considering a stable cluster (no new nodes, no nodes leaving), the N seeds should always be the same N nodes
* Adding new nodes should not change that list
* Stopping/removing one of these N nodes should "promote" another (non-seed) node as a seed
  - that would not restart the already running Cassandra nodes but would update their configuration files.
  - if a node restarts for whatever reason it would pick up this new configuration

So: no node would start its life as a seed, only a few already existing nodes would have this status. We would not have to deal with the "a seed node doesn't bootstrap" problem and it would make our operation process simpler.

> > I also have some more general questions about seed nodes and how they work:
> >
> > * I understand that seed nodes are used when a node starts and needs to discover the rest of the cluster's nodes. Once the node has joined and the cluster is stable, are seed nodes still playing a role in day to day operations?
>
> They're used probabilistically in gossip to encourage convergence. Mostly useful in large clusters.

How "large" are we speaking here? How many nodes would it start to be considered "large"?

Also, about the convergence: is this related to how fast/often the cluster topology is changing? (new nodes, leaving nodes, underlying IP addresses changing, etc.)

Thanks for your answers!

Jonathan
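The "promotion" behaviour described above can be sketched in a few lines of Python (illustration only: node liveness and start times would come from your service discovery tool, and the field names here are made up):

```python
def reconcile_seeds(nodes, current_seeds, n=2):
    """Return a stable list of n seed IPs: keep the current seeds that
    are still alive, then promote the oldest live non-seed nodes to
    fill the gaps."""
    live = {node["ip"] for node in nodes}
    # Keep existing seeds that are still alive, preserving their order,
    # so the seed list stays stable while the cluster is stable.
    seeds = [ip for ip in current_seeds if ip in live]
    # Promote the oldest non-seed nodes until we have n seeds again.
    candidates = sorted(
        (node for node in nodes if node["ip"] not in seeds),
        key=lambda node: node["started_at"],
    )
    for node in candidates:
        if len(seeds) >= n:
            break
        seeds.append(node["ip"])
    return seeds
```

A config management run (or a Consul Template hook) could call something like this and rewrite the seeds list on every node, without restarting anything; restarting nodes would then simply pick up the new list.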
Re: How seed nodes are working and how to upgrade/replace them?
On Mon, 7 Jan 2019 at 16:51, Oleksandr Shulgin wrote:

> On Mon, Jan 7, 2019 at 3:37 PM Jonathan Ballet wrote:
>
>> I'm working on how we could improve the upgrades of our servers and how to replace them completely (new instance with a new IP address). What I would like to do is to replace the machines holding our current seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular basis:
>>
>> * Is it possible to "promote" any non-seed node as a seed node?
>>
>> * Is it possible to "promote" a new seed node without having to restart all the nodes? In essence, in my example that would be:
>>
>> - decide that #2 and #3 will be the new seed nodes
>> - update all the configuration files of all the nodes to write the IP addresses of #2 and #3
>> - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts
>
> You can provide a custom implementation of the seed provider protocol:
> org.apache.cassandra.locator.SeedProvider
>
> We were exploring that approach a few years ago with etcd, which I think provides capabilities similar to that of Consul:
> https://github.com/a1exsh/cassandra-etcd-seed-provider/blob/master/src/main/java/org/zalando/cassandra/locator/EtcdSeedProvider.java

Hi Alex,

we were also using a dedicated Consul seed provider, but we weren't confident enough about maintaining our version, so we removed it in favor of something simpler. Ultimately, we hope(d) that delegating the maintenance of that list to an external process (like Consul Template), directly updating the configuration file, is (should be?) mostly similar, without having to maintain our own copy built with the right version of Cassandra, etc.

Thanks for the info though!

Jonathan
How seed nodes are working and how to upgrade/replace them?
Hi,

I'm trying to understand how seed nodes work, when and how they play a part in a Cassandra cluster, and how they should be managed and propagated to other nodes.

I have a cluster of 6 Cassandra nodes (let's call them #1 to #6), on which nodes #1 and #2 are seeds. All the configuration files of all the Cassandra nodes are currently configured with:

```
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: 'IP #1,IP #2'
```

We are using a service discovery tool (Consul) which automatically registers new Cassandra nodes with its dedicated health-check, and we are able to generate new configuration based on the content of the service discovery status (with Consul Template).

I'm working on how we could improve the upgrades of our servers and how to replace them completely (new instance with a new IP address). What I would like to do is to replace the machines holding our current seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular basis:

* Is it possible to "promote" any non-seed node as a seed node?

* Is it possible to "promote" a new seed node without having to restart all the nodes? In essence, in my example that would be:

  - decide that #2 and #3 will be the new seed nodes
  - update all the configuration files of all the nodes to write the IP addresses of #2 and #3
  - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts

* If I can manage to sort my Cassandra nodes by their age, could it be a strategy to have the seeds set to the 2 oldest nodes in the cluster? (This implies these nodes would change as the cluster's nodes get upgraded/replaced.)

I also have some more general questions about seed nodes and how they work:

* I understand that seed nodes are used when a node starts and needs to discover the rest of the cluster's nodes.
Once the node has joined and the cluster is stable, are seed nodes still playing a role in day-to-day operations?

* The documentation says multiple times that not all nodes should be seed nodes, but I didn't really find anything about the consequences of having "too many" seed nodes.

Also, relative to the questions I asked above, are there any downsides to having changing seed nodes in a cluster? (with the exact same cluster: at some point I define #1 and #2 to be seeds, then later #4 and #5, etc.)

Thanks for helping me understand better how seeds work!

Jonathan
Separated commit log directory configuration
Hi,

Cassandra's documentation has several recommendations for moving the commit log directory to a dedicated disk, separated from the sstables disk(s). However, I couldn't find much information on good practices regarding this dedicated commit log disk and I'm wondering how best to configure it.

We have a Cassandra cluster running 3.11.1 with separate disks for sstables/commit logs, which worked great until a few weeks ago; since then, some nodes sometimes stop working due to a disk full error:

* The error is: "org.apache.cassandra.io.FSWriteError: java.nio.file.FileSystemException: /var/lib/cassandra/commitlog/CommitLog-6-1532888559853.log: No space left on device"
* we have a dedicated disk of 20 GB
* our commit-related settings are as follows (mostly default):

```
commit_failure_policy: stop
commitlog_directory: /var/lib/cassandra/commitlog
commitlog_segment_size_in_mb: 32
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1
commitlog_total_space_in_mb: 19000
```

We only increased commitlog_total_space_in_mb so that Cassandra fully uses the dedicated disk, but that may be an error? The default value for this setting is (per the documentation):

    The default value is the smaller of 8192, and 1/4 of the total space of the commitlog volume.

But that doesn't say much (or should it really be 25% of the disk space?)

So, my questions would be:

* What size should I dedicate to this commit log disk? What are the rules of thumb to discover the "best" size?
* How should I configure the "commitlog_total_space_in_mb" setting relative to the size of the disk?

Thanks!

Jonathan
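To put numbers on the figures above (a back-of-the-envelope sketch, not an authoritative sizing rule): with 32 MB segments, a 19000 MB cap allows roughly 593 full segments, and on a 20 GB (20480 MB) disk that cap leaves well under 10% headroom, which in-flight segments or any other file on the volume can eat up:

```python
segment_size_mb = 32
total_space_mb = 19000     # commitlog_total_space_in_mb
disk_mb = 20 * 1024        # the dedicated 20 GB disk

# How many full segments fit under the configured cap.
max_segments = total_space_mb // segment_size_mb
print(max_segments)        # 593

# Headroom left on the disk once the cap is reached.
headroom_mb = disk_mb - total_space_mb
print(headroom_mb)         # 1480
```

This is presumably why the default is the smaller of 8192 MB and 1/4 of the volume: the cap is a soft target the commit log can briefly overshoot while segments are flushed and recycled, so pushing it to ~95% of the disk leaves very little slack.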
Re: Rebuilding a new Cassandra node at 100Mb/s
Hey Bryan,

I haven't changed this setting, but it looks like this is the same setting that can be changed with "nodetool setstreamthroughput"? It sounds pretty interesting at first glance, but FWIW, the limit was 12.6 MB/s, not 25 MB/s (so effectively 100 Mb/s).

On 12/03/2015 11:40 PM, Bryan Cheng wrote: Jonathan: Have you changed stream_throughput_outbound_megabits_per_sec in cassandra.yaml?

```
# Throttles all outbound streaming file transfers on this node to the
# given total throughput in Mbps. This is necessary because Cassandra does
# mostly sequential IO when streaming data during bootstrap or repair, which
# can lead to saturating the network connection and degrading rpc performance.
# When unset, the default is 200 Mbps or 25 MB/s.
# stream_throughput_outbound_megabits_per_sec: 200
```

On Thu, Dec 3, 2015 at 11:32 AM, Robert Coli <rc...@eventbrite.com> wrote: On Thu, Dec 3, 2015 at 7:51 AM, Jonathan Ballet <jbal...@edgelab.ch> wrote: I noticed it's not really fast and my monitoring system shows that the traffic incoming on this node is exactly at 100Mb/s (12.6MB/s). I know it can be much more than that (I just tested sending a file through SSH between the two machines and it goes up to 1Gb/s), is there a limitation of some sort on Cassandra which limits the transfer rate to 100Mb/s?

Probably limited by number of simultaneous parallel streams. Many people do not want streams to go "as fast as possible" because their priority is maintaining baseline service times while rebuilding/bootstrapping. Not sure there's a way to tune it, but this is definitely on the "large node" radar..

=Rob
Re: Rebuilding a new Cassandra node at 100Mb/s
Thanks for your answer Rob,

On 12/03/2015 08:32 PM, Robert Coli wrote: On Thu, Dec 3, 2015 at 7:51 AM, Jonathan Ballet <jbal...@edgelab.ch> wrote: I noticed it's not really fast and my monitoring system shows that the traffic incoming on this node is exactly at 100Mb/s (12.6MB/s). I know it can be much more than that (I just tested sending a file through SSH between the two machines and it goes up to 1Gb/s), is there a limitation of some sort on Cassandra which limits the transfer rate to 100Mb/s?

Probably limited by number of simultaneous parallel streams. Many people do not want streams to go "as fast as possible" because their priority is maintaining baseline service times while rebuilding/bootstrapping. Not sure there's a way to tune it, but this is definitely on the "large node" radar..

I was actually a bit surprised that the limit seems to be capped at exactly 100 Mb/s, no more, no less. So I was thinking there was something else at play here...

Jonathan
Rebuilding a new Cassandra node at 100Mb/s
Hi,

I added a new node to my cluster, but in a new datacenter. After updating the keyspace replication factor values (using the NetworkTopologyStrategy strategy), I'm now running a "nodetool rebuild" on the new node.

I noticed it's not really fast and my monitoring system shows that the traffic incoming on this node is exactly at 100Mb/s (12.6MB/s). I know it can be much more than that (I just tested sending a file through SSH between the two machines and it goes up to 1Gb/s), is there a limitation of some sort on Cassandra which limits the transfer rate to 100Mb/s?

I was thinking that "nodetool setstreamthroughput 0" (or even 999) would help, but it doesn't seem to change anything on the current rebuilding process.

Any idea? Thanks!

Jonathan
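For reference, the observed number is exactly a link-speed conversion (a quick sketch):

```python
# 100 megabits per second expressed in megabytes per second.
mbps = 100
mb_per_s = mbps / 8
print(mb_per_s)  # 12.5 -- matching the ~12.6 MB/s observed above
```

A cap of exactly 12.5 MB/s could point at something on the path negotiating at 100 Mb/s (a NIC, a switch port, a VM bandwidth cap) rather than at Cassandra's stream throttle, whose default is 200 Mb/s.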
Many keyspaces pattern
Hi,

we are running an application which produces every night a batch with several hundred gigabytes of data. Once a batch has been computed, it is never modified (no updates nor deletes), we just keep producing new batches every day. Now, we are *sometimes* interested in removing a complete specific batch altogether.

At the moment, we are accumulating all these data into only one keyspace, which has a batch ID column in all our tables which is also part of the primary key. A sample table looks similar to this:

```
CREATE TABLE computation_results (
    batch_id int,
    id1 int,
    id2 int,
    value double,
    PRIMARY KEY ((batch_id, id1), id2)
) WITH CLUSTERING ORDER BY (id2 ASC);
```

But we found out it is very difficult to remove a specific batch, as we need to know all the IDs to delete the entries, and it's both time and resource consuming (i.e. it takes a long time and I'm not sure it's going to scale at all.)

So, we are currently looking into having each of our batches in a keyspace of their own, so removing a batch is merely equivalent to deleting a keyspace. Potentially, it means we will end up having several hundreds of keyspaces in one cluster, although most of the time only the very last one will be used (we might still want to access the older ones, but that would be a very seldom use-case.) At the moment, the keyspace has about 14 tables and is probably not going to evolve much.

Are there any counter-indications to using lots of keyspaces (300+) in one Cassandra cluster? Are there any good practices that we should follow? After reading the "Anti-patterns in Cassandra > Too many keyspaces or tables", does it mean we should plan ahead to already split our keyspace among several clusters?

Finally, would there be any other way to achieve what we want to do?

Thanks for your help!

Jonathan
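A minimal sketch of the per-batch-keyspace idea (the naming scheme, datacenter name, and replication settings below are made up for illustration):

```python
def batch_keyspace(batch_id):
    """Derive a keyspace name from a batch ID; Cassandra identifiers must
    be alphanumeric/underscore, so embed the ID as a zero-padded suffix."""
    return f"batch_{batch_id:08d}"

def create_batch_ddl(batch_id, replication_factor=3):
    # Hypothetical replication settings; 'dc1' is a placeholder DC name.
    ks = batch_keyspace(batch_id)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {ks} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', 'dc1': {replication_factor}}};"
    )

def drop_batch_ddl(batch_id):
    # Dropping the keyspace removes all of the batch's tables at once.
    return f"DROP KEYSPACE IF EXISTS {batch_keyspace(batch_id)};"

print(drop_batch_ddl(42))  # DROP KEYSPACE IF EXISTS batch_00000042;
```

One operational note: if auto_snapshot is enabled (the default), dropping a keyspace leaves snapshots of its tables behind, so disk space is only reclaimed after clearing them (e.g. with nodetool clearsnapshot).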
Re: Deploying OpsCenter behind a HTTP(S) proxy
Hi Ben,

thanks for your reply. That's what I was afraid of, actually, as it means there's no easy solution to implement, I guess. I think some guys in my team are in contact with DS people, so I may have a look there.

Jonathan

On 06/18/2015 07:26 PM, Ben Bromhead wrote: OpsCenter is a little bit tricky to simply just rewrite URLs, the XHR requests and REST endpoints it hits are all specified a little differently in the javascript app it loads. We ended up monkey patching a buttload of the js files to get all the requests working properly with our proxy. Every time a new release of OpsCenter comes out we have to rework it. If you are a DSE customer I would raise it as a support issue :)

On 18 June 2015 at 02:29, Spencer Brown <lilspe...@gmail.com> wrote: First, your firewall should really be your frontend. Their operational frontend is Apache, which is common. You want every URL with opscenter in it handled elsewhere. You could also set up proxies for /, cluster-configs, etc... Then there is mod_rewrite, which provides a lot more granularity about when you want what gets handled where. I set up the architectural infrastructure for Orbitz and some major banks, and I'd be happy to help you out on this. I charge $30/hr., but what you need isn't very complex so we're really just talking $100.

On Thu, Jun 18, 2015 at 5:13 AM, Jonathan Ballet <jbal...@gfproducts.ch> wrote: Hi, I'm looking for information on how to correctly deploy an OpsCenter instance behind an HTTP(S) proxy. I have a running instance of OpsCenter 5.1 reachable at http://opscenter:/opscenter/ but I would like to be able to serve this kind of tool under a single hostname on HTTPS, along with other tools of this kind, for easier convenience. I'm currently using Apache as my HTTP front-end and I tried this naive configuration:

```
<VirtualHost *:80>
    ServerName tools
    ...
    ProxyPreserveHost On

    # Proxy to OpsCenter #
    ProxyPass        /opscenter/ http://opscenter:/opscenter/
    ProxyPassReverse /opscenter/ http://opscenter:/opscenter/
</VirtualHost>
```

This doesn't quite work, as OpsCenter seems to also serve specific endpoints from / directly, such as:

    /cluster-configs
    /TestCluster
    /meta
    /rc
    /tcp

Is there something I can configure in OpsCenter so that it serves these URLs from somewhere else, or a list of known URLs that I can remap on the proxy, or better yet, a known proxy configuration to put in front of OpsCenter?

Regards,

Jonathan

--
Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | (650) 284 9692
Re: Deploying OpsCenter behind a HTTP(S) proxy
Hi Spencer,

I certainly know how to configure a proxy or how to rewrite URLs if I need to, and we are currently not looking for a contractor, but thanks for your message! :)

Jonathan

On 06/18/2015 11:29 AM, Spencer Brown wrote: First, your firewall should really be your frontend. Their operational frontend is Apache, which is common. You want every URL with opscenter in it handled elsewhere. You could also set up proxies for /, cluster-configs, etc... Then there is mod_rewrite, which provides a lot more granularity about when you want what gets handled where. I set up the architectural infrastructure for Orbitz and some major banks, and I'd be happy to help you out on this. I charge $30/hr., but what you need isn't very complex so we're really just talking $100.

On Thu, Jun 18, 2015 at 5:13 AM, Jonathan Ballet <jbal...@gfproducts.ch> wrote: Hi, I'm looking for information on how to correctly deploy an OpsCenter instance behind an HTTP(S) proxy. I have a running instance of OpsCenter 5.1 reachable at http://opscenter:/opscenter/ but I would like to be able to serve this kind of tool under a single hostname on HTTPS, along with other tools of this kind, for easier convenience. I'm currently using Apache as my HTTP front-end and I tried this naive configuration:

```
<VirtualHost *:80>
    ServerName tools
    ...
    ProxyPreserveHost On

    # Proxy to OpsCenter #
    ProxyPass        /opscenter/ http://opscenter:/opscenter/
    ProxyPassReverse /opscenter/ http://opscenter:/opscenter/
</VirtualHost>
```

This doesn't quite work, as OpsCenter seems to also serve specific endpoints from / directly, such as:

    /cluster-configs
    /TestCluster
    /meta
    /rc
    /tcp

Is there something I can configure in OpsCenter so that it serves these URLs from somewhere else, or a list of known URLs that I can remap on the proxy, or better yet, a known proxy configuration to put in front of OpsCenter?

Regards,

Jonathan
Deploying OpsCenter behind a HTTP(S) proxy
Hi,

I'm looking for information on how to correctly deploy an OpsCenter instance behind an HTTP(S) proxy.

I have a running instance of OpsCenter 5.1 reachable at http://opscenter:/opscenter/ but I would like to be able to serve this kind of tool under a single hostname on HTTPS, along with other tools of this kind, for easier convenience. I'm currently using Apache as my HTTP front-end and I tried this naive configuration:

```
<VirtualHost *:80>
    ServerName tools
    ...
    ProxyPreserveHost On

    # Proxy to OpsCenter #
    ProxyPass        /opscenter/ http://opscenter:/opscenter/
    ProxyPassReverse /opscenter/ http://opscenter:/opscenter/
</VirtualHost>
```

This doesn't quite work, as OpsCenter seems to also serve specific endpoints from / directly, such as:

    /cluster-configs
    /TestCluster
    /meta
    /rc
    /tcp

Is there something I can configure in OpsCenter so that it serves these URLs from somewhere else, or a list of known URLs that I can remap on the proxy, or better yet, a known proxy configuration to put in front of OpsCenter?

Regards,

Jonathan
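For what it's worth, one proxy-side workaround would be to remap each of those root-level endpoints explicitly (a sketch only: the endpoint list is the one observed above, the port 8888 is an assumption based on OpsCenter's usual default, and other OpsCenter versions may hit different paths):

```
# Hypothetical additional mappings for OpsCenter's root-level endpoints
ProxyPass        /cluster-configs http://opscenter:8888/cluster-configs
ProxyPassReverse /cluster-configs http://opscenter:8888/cluster-configs
ProxyPass        /meta            http://opscenter:8888/meta
ProxyPassReverse /meta            http://opscenter:8888/meta
ProxyPass        /rc              http://opscenter:8888/rc
ProxyPassReverse /rc              http://opscenter:8888/rc
ProxyPass        /tcp             http://opscenter:8888/tcp
ProxyPassReverse /tcp             http://opscenter:8888/tcp
```

This only covers the endpoints observed so far; as noted elsewhere in the thread, the JavaScript app may build other URLs that such static mappings won't catch.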
Re: Using Cassandra and Twisted (Python)
Hello Alex, thanks for your answer! I'll try posting there as well then! Best, Jonathan On 06/16/2015 07:05 PM, Alex Popescu wrote: Jonathan, I'm pretty sure you'll have better chances to get this answered on the Python driver mailing list https://groups.google.com/a/lists.datastax.com/forum/#!forum/python-driver-user On Tue, Jun 16, 2015 at 1:01 AM, Jonathan Ballet jbal...@gfproducts.ch mailto:jbal...@gfproducts.ch wrote: Hi, I'd like to write some Python applications using Twisted to talk to a Cassandra cluster. It seems like the Datastax Python library from https://github.com/datastax/python-driver does support Twisted, but it's not exactly clear how I would use this library along with Twisted. The documentation for the async API is very sparse and there's no mention on how to plug this into Twisted event-loop. Does anyone have a small working example on how to use both of these? Thanks! Jonathan -- Bests, Alex Popescu | @al3xandru Sen. Product Manager @ DataStax
Using Cassandra and Twisted (Python)
Hi,

I'd like to write some Python applications using Twisted to talk to a Cassandra cluster. It seems like the DataStax Python library from https://github.com/datastax/python-driver does support Twisted, but it's not exactly clear how I would use this library along with Twisted: the documentation for the async API is very sparse and there's no mention of how to plug this into the Twisted event loop.

Does anyone have a small working example on how to use both of these?

Thanks!

Jonathan