RE: Seed List
Thanks all, I stuck with IP addresses. It was just a question to save me having to mess about doing lookups, and given that we only have 6 seeds (3 per DC) it shouldn't be too much of a headache.

However, I had occasion to start from scratch and noticed something. With the first cluster attempt, I used Ansible to add 'new' nodes to the ring: DC1 nodes 1-12, DC2 nodes 1-10. My first seed node was DC1 node 3, but Ansible operated in numerical order. Thus, my cluster started with two nodes (DC1 nodes 1 and 2) which had no idea of one another until DC1 node 3 appeared. When I checked with nodetool status (after ALL nodes were added), each node in each DC showed OWNS of 15-21%.

The second time around, I manually added each node, alternating between the DCs, starting with the seeds, then each regular node (once again alternating). Now the OWNS is 8-10.5%, which seems a lot more realistic.

My theory is that, on the first attempt, the two regular nodes which started first each owned 100% (200% total), and when all the other nodes joined, this 200% was divided among them. At the time I assumed this was a normal report, but now 100%(ish) seems more correct. The 'scientific' way to test would be to start a third cluster, join 3 nodes before joining the first seed, and see if the OWNS is 300% divided, but that would be a chore. Anyone have any knowledge of whether this might have been the case?

Marc
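Marc's second set of numbers can be sanity-checked with back-of-envelope arithmetic (illustrative only, not from the thread; it assumes vnodes spread token ranges roughly evenly within each DC, so each node owns about 100% / nodes-in-DC of its DC's ranges):

```python
# Expected even per-node ownership within each DC, assuming vnodes
# distribute token ranges roughly uniformly (an assumption, not a
# guarantee -- random token allocation can skew this).
dc_sizes = {"DC1": 12, "DC2": 10}

expected_owns = {dc: 100.0 / n for dc, n in dc_sizes.items()}

for dc, pct in sorted(expected_owns.items()):
    print(f"{dc}: ~{pct:.1f}% per node")
# DC1 comes out near 8.3% and DC2 at 10.0%, consistent with the
# 8-10.5% range observed on the second (alternating) bootstrap.
```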
Re: Seed List
From my understanding, the main thing to be aware of is that Cassandra's default SeedProvider doesn't resolve multiple A records, so you're actually limited in terms of the load balancing/DNS configurations you can use. You can, however, write alternative seed providers which have different (perhaps more appropriate) behaviour; see here <https://github.com/k8ssandra/management-api-for-apache-cassandra/blob/a317ba40fae8e2b11eecde01ad5e6ee15aebdb0f/management-api-agent-4.x/src/main/java/org/apache/cassandra/locator/K8SeedProvider4x.java> for an example.
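To illustrate the A-record point (a Python sketch rather than the Java SeedProvider API; `resolve_all` is a made-up helper name): one DNS name can map to several addresses, and a custom seed provider can expand the name into all of them, which the default SimpleSeedProvider does not do.

```python
import socket

# Resolve every distinct address behind a single DNS name (multiple
# A/AAAA records). A custom seed provider can feed all of these to
# Cassandra as seeds; the default SimpleSeedProvider effectively
# does not enumerate them this way.
def resolve_all(hostname: str) -> list[str]:
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

print(resolve_all("localhost"))
```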
RE: Seed List
It can work to use host names. We have done it for temporary clusters where there is at least a theoretical possibility of an IP address change. I don't know all the trade-offs of using host names, since we don't do that for production.

Sean R. Durity
Seed List
Hi guys, Documentation (for almost everything) uses IP addresses for seeds. Is it possible to use the FQDN instead for the seeds (cass.yaml)? It is far easier to read/use names. Thanks M
Re: cassandra-external-file-seed-provider: manage your list of seeds via an external file
Amazing, thank you so much.
cassandra-external-file-seed-provider: manage your list of seeds via an external file
Hello,

I wanted to announce a small project that I worked on a while ago, which may be useful to other people: https://github.com/multani/cassandra-external-file-seed-provider

This is a simple seed provider that fetches the list of seeds from an externally managed file. The original goal was to fetch the list of seeds from a specific key in our Consul cluster and save it in a dedicated file using Consul Template, without having to update the whole cassandra.yaml file. I believe this is general enough to help other people as well.

Best,
Jonathan
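The linked project is a Java SeedProvider, but the core idea can be sketched in a few lines (this is an illustration with an assumed one-seed-per-line file format, not the project's actual parser):

```python
from pathlib import Path

def read_seeds(path: Path) -> list[str]:
    """Read seed addresses from a small, separately managed file
    (e.g. kept up to date by Consul Template), one address per line,
    ignoring blank lines and '#' comments."""
    seeds = []
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            seeds.append(line)
    return seeds
```

Only the small seeds file ever needs rewriting by the external tool, not the whole cassandra.yaml.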
Re: Can you change seed nodes without doing a restart?
You just need to remove the node from its own seeds list so it can bootstrap itself back into the cluster. Otherwise, it will immediately join the cluster without streaming data from other replicas. If you intend to promote it back to a seed node, you don't need to remove it from the seeds lists of the other nodes in the cluster. Cheers!
Can you change seed nodes without doing a restart?
Hi - when replacing a dead seed node, we need to make it not a seed node before replacing it. To do that, you need to change the cassandra.yaml values and (I believe) perform a rolling restart. Is the restart necessary? Thanks.
Re: New seed node in the cluster immediately UN without passing for UJ state
> Limiting the seeds to 2 per DC means:
> A) each node in a DC has at least 2 seeds and those seeds belong to the same DC, or
> B) each node in a DC has at least 2 seeds even across different DCs

I apologise for the ambiguity of my previous response; I see it now. :) The recommendation is to pick 2 nodes in each DC to designate as seeds. For example, if you had 3 DCs in your cluster, your seeds list might look like:

- seeds: "DC1_IP1, DC1_IP2, DC2_IP1, DC2_IP2, DC3_IP1, DC3_IP2"

I hope that makes sense. Cheers!
RE: New seed node in the cluster immediately UN without passing for UJ state
A seed node doesn't bootstrap, so if a new node is to act as a seed node, the official recommendation is to bootstrap the 'new' node first and only after that list it as a seed. Seed nodes are usually the same across all nodes in the cluster. You can designate 2 nodes per DC as seeds in order to mitigate network latency for a bootstrapping node, i.e. if you have 2 DCs, you designate 4 nodes (in total) as seeds - 2 from each DC.

~Asad
Re: New seed node in the cluster immediately UN without passing for UJ state
Hi Erick! Just to follow up on your statement: limiting the seeds to 2 per DC means:

A) each node in a DC has at least 2 seeds and those seeds belong to the same DC, or
B) each node in a DC has at least 2 seeds even across different DCs?

Thanks,
Sergio
Re: New seed node in the cluster immediately UN without passing for UJ state
Not a problem. And I've just responded on the new thread. Cheers!
Re: New seed node in the cluster immediately UN without passing for UJ state
Thank you very much for this helpful information! I opened a new thread for the other question :)

Sergio
Re: New seed node in the cluster immediately UN without passing for UJ state
> I want to have more than one seed node in each DC, so unless I restart the node after changing the seed_list on that node, it will not become a seed.

That's not really going to hurt you if you have other seeds in other DCs. But if you're willing to take the hit from the restart then feel free to do so. Just saying that it's not necessary to do it immediately, so the option is there for you. :)

> Do I need to update the seed_list across all the nodes even in separate DCs and perform a rolling restart even across DCs, or should the restart happen only on the new node that I want as a seed?

You generally want to make the seeds list the same across all nodes in the cluster. You want to avoid the situation where lots of nodes are used as seeds by various nodes. Limiting the seeds to 2 per DC means that gossip convergence will happen much faster. Cheers!
Re: New seed node in the cluster immediately UN without passing for UJ state
Right now, yes, I have one seed per DC. I want to have more than one seed node in each DC, so unless I restart the node after changing the seed_list on that node, it will not become a seed. Do I need to update the seed_list across all the nodes even in separate DCs and perform a rolling restart even across DCs, or should the restart happen only on the new node that I want as a seed? The reason: each datacenter has a seed from the DC it belongs to and a seed from the other DC.

Thanks,
Sergio
Re: New seed node in the cluster immediately UN without passing for UJ state
> 1) If I don't restart the node after changing the seed list, it will never become a seed, and I would like to be sure that I don't find myself in a spot where I don't have seed nodes, which means that I cannot add a node to the cluster.

Are you saying you only have 1 seed node in the seeds list of each node? We recommend 2 nodes per DC as seeds -- if one node is down, there's still another node in the local DC to contact. In the worst-case scenario where 2 nodes in the local DC are down, nodes can contact seeds in other DCs.

For the second item, could I make a small request? Since it's unrelated to this thread, would you mind starting up a new email thread? It just makes it easier for other users to follow the threads in the future if they're searching for answers to similar questions. Cheers!
Re: New seed node in the cluster immediately UN without passing for UJ state
Thank you very much for your response! 2 things:

1) If I don't restart the node after changing the seed list, it will never become a seed, and I would like to be sure that I don't find myself in a spot where I don't have seed nodes, which would mean that I cannot add a node to the cluster.

2) We have i3.xlarge instances with the data directory on the XFS filesystem, which is ephemeral, and hints, commit log and saved_caches on an EBS volume. Whenever AWS is going to retire the instance due to degraded hardware performance, is it better to:

Option 1)
- nodetool drain
- stop Cassandra
- restart the machine from AWS to be restored on a different VM from the hypervisor
- start Cassandra with -Dcassandra.replace_address

OR

Option 2)
- add a new node and wait for the NORMAL status
- decommission the one that is going to be retired
- run cleanup with cstar across the datacenters

?

Thanks,
Sergio
Re: New seed node in the cluster immediately UN without passing for UJ state
> I did the decommission of this node and I did all the steps mentioned except the -Dcassandra.replace_address, and now it is streaming correctly!

That works too, but I was trying to avoid the rebalance operations (like streaming to restore replica counts) since they can be expensive.

> So basically, if I want this new node as a seed, should I add its IP address after it has joined the cluster, and after:
> - nodetool drain
> - restart cassandra?

There's no need to restart C* after updating the seeds list. It will just take effect the next time you restart.

> I deactivated the future repairs happening in the cluster while this node is joining. When you add a node, is it better to stop the repair process?

It's not necessary to do so if you have sufficient capacity in your cluster. Topology changes are just a normal part of a C* cluster's operation, just like repairs. But when you temporarily disable repairs, existing nodes have more capacity to bootstrap a new node, so there is a benefit there. Cheers!
Re: New seed node in the cluster immediately UN without passing for UJ state
I did the decommission of this node and I did all the steps mentioned except the -Dcassandra.replace_address, and now it is streaming correctly!

So basically, if I want this new node as a seed, should I add its IP address after it has joined the cluster, and after:
- nodetool drain
- restart cassandra?

I deactivated the future repairs happening in the cluster while this node is joining. When you add a node, is it better to stop the repair process?

Thank you very much Erick! Best, Sergio
Re: New seed node in the cluster immediately UN without passing for UJ state
> Should I do something to fix it or leave it as is?

It depends on what your intentions are. I would use the "replace" method to build it correctly. At a high level:

- remove the IP from its own seeds list
- delete the contents of data, commitlog and saved_caches
- add the replace flag in cassandra-env.sh (-Dcassandra.replace_address=its_own_ip)
- start C*

That should allow the node to "replace itself" in the ring and prevent expensive reshuffling/rebalancing of tokens. Cheers!
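Concretely, the replace flag is a single JVM option in cassandra-env.sh; the address below is a placeholder for the node's own former IP, and the line should be removed once the node has rejoined:

```shell
# cassandra-env.sh -- placeholder IP; the node "replaces itself", so this
# is its own former address. Delete this line after the node rejoins.
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.5"
```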
Re: New seed node in the cluster immediately UN without passing for UJ state
Thanks for your fast reply! No repairs are running!

https://cassandra.apache.org/doc/latest/faq/index.html#does-single-seed-mean-single-point-of-failure

I added the node's own IP and the IPs of the existing seeds, and I started Cassandra. So the right procedure is to not add the new node itself to the seed list - only an already-existing seed node - and then start Cassandra? What should I do? I am running nodetool netstats and the streams are happening from other nodes.

Thanks
Re: New seed node in the cluster immediately UN without passing for UJ state
> I wanted to add a new node in the cluster and it looks to be working fine, but instead of waiting 2-3 hours while streaming something like 100GB of data, it immediately went to the UN (UP and NORMAL) state.

Are you running a repair? I can't see how it's possibly receiving 100GB since it won't bootstrap.
Re: New seed node in the cluster immediately UN without passing for UJ state
Should I do something to fix it or leave it as is?
Re: New seed node in the cluster immediately UN without passing for UJ state
Seeds don't bootstrap, don't list new nodes as seeds.
New seed node in the cluster immediately UN without passing for UJ state
Hi guys! I don't know how, but this is the first time I have seen such behavior. I wanted to add a new node to the cluster and it looks to be working fine, but instead of waiting 2-3 hours while streaming something like 100GB of data, it immediately went to the UN (UP and NORMAL) state.

I saw a bunch of exceptions in the logs:

WARN [MessagingService-Incoming-/10.1.17.126] 2020-02-14 01:08:07,812 IncomingTcpConnection.java:103 - UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId a5af88d0-24f6-11e9-b009-95ed77b72f6e. If a table was just created, this is likely due to the schema not being fully propagated. Please wait for schema agreement on table creation.
        at org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1525) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:850) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:825) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:183) ~[apache-cassandra-3.11.5.jar:3.11.5]
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.5.jar:3.11.5]

but in the end, it is working...

Suggestion?

Thanks,
Sergio
Re: How to elect a normal node to a seed node
> This means that from the client driver perspective, when I define the contact points I can specify any node in the cluster as a contact point, and not necessarily a seed node?

Correct.
Re: How to elect a normal node to a seed node
Seed nodes are special in the sense that other nodes need them for bootstrap (first startup only) and they have a special place in the gossip system. The odds of gossiping to a seed node are higher than to other nodes, which makes them "hubs" of gossip messaging. Also, they do not bootstrap, so they won't stream data in on their first start.

Aside from that, any node can become a seed node at any time. Just update the seed list on all nodes, roll-restart the cluster, and you'll have a new set of seed nodes.

-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
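For reference, the list being updated lives under `seed_provider` in cassandra.yaml (the addresses below are placeholders; keep the list identical on every node, e.g. two seeds per DC as recommended elsewhere in this thread):

```yaml
# cassandra.yaml -- placeholder addresses; use the same list on every node.
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # two seeds per DC, comma-separated inside a single string
      - seeds: "10.0.1.1,10.0.1.2,10.0.2.1,10.0.2.2"
```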
Re: How to elect a normal node to a seed node
So if:
1) I stop a Cassandra node that doesn't have itself in its seeds IP list
2) I change the cassandra.yaml of this node and add it to the seed list
3) I restart the node

it will work completely fine, and this is not even necessary.

This means that from the client driver perspective, when I define the contact points, I can specify any node in the cluster as a contact point, and not necessarily a seed node?

Best,
Sergio
Re: How to elect a normal node to a seed node
I believe seed nodes are not special nodes; it's just that you choose a few nodes from the cluster that help to bootstrap new joining nodes. You can change cassandra.yaml to make any other node a seed node. There's nothing like promotion.

-Arvinder
How to elect a normal node to a seed node
Hi guys! Is there a way to promote a non-seed node to a seed node? If yes, how do you do it? Thanks!
Re: How seed nodes are working and how to upgrade/replace them?
On Tue, 8 Jan 2019 at 18:29, Jeff Jirsa wrote:

> Given Consul's popularity, seems like someone could make an argument that we should be shipping a consul-aware seed provider.

Elasticsearch has a very handy dedicated file-based discovery system: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#file-based-hosts-provider

It's similar to what Cassandra's built-in SimpleSeedProvider does, but it doesn't require keeping the *whole* cassandra.yaml file up to date, and it could probably be simpler to dynamically watch for changes. Ultimately, there are plenty of external applications that could be used to pull in information from your favorite service discovery tool (etcd, Consul, etc.) or configuration management and keep this file up to date, without needing a plugin for every system out there.
Re: How seed nodes are working and how to upgrade/replace them?
On Tue, 8 Jan 2019 at 18:39, Jeff Jirsa wrote: > On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet wrote: > >> Hi Jeff, >> >> thanks for answering to most of my points! >> From the reloadseeds' ticket, I followed to >> https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very >> instructive, although a bit old. >> >> >> On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote: >> >>> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet >>> wrote: >>> > >>> [...] >>> >>> > In essence, in my example that would be: >>> > >>> > - decide that #2 and #3 will be the new seed nodes >>> > - update all the configuration files of all the nodes to write the >>> IP addresses of #2 and #3 >>> > - DON'T restart any node - the new seed configuration will be picked >>> up only if the Cassandra process restarts >>> > >>> > * If I can manage to sort my Cassandra nodes by their age, could it be >>> a strategy to have the seeds set to the 2 oldest nodes in the cluster? >>> (This implies these nodes would change as the cluster's nodes get >>> upgraded/replaced). >>> >>> You could do this, seems like a lot of headache for little benefit. >>> Could be done with simple seed provider and config management >>> (puppet/chef/ansible) laying down new yaml or with your own seed provider >>> >> >> So, just to make it clear: sorting by age isn't a goal in itself, it was >> just an example on how I could get a stable list. >> >> Right now, we have a dedicated group of seed nodes + a dedicated group >> for non-seeds: doing rolling-upgrade of the nodes from the second list is >> relatively painless (although slow) whereas we are facing the issues >> discussed in CASSANDRA-3829 for the first group which are non-seeds nodes >> are not bootstrapping automatically and we need to operate them in a more >> careful way. >> > Rolling upgrade shouldn't need to re-bootstrap. Only replacing a host > should need a new bootstrap. That should be a new host in your list, so it > seems like this should be fairly rare? 
> Sorry, that's internal pidgin; by "rolling upgrade" I meant replacing all the nodes in a rolling fashion. > What I'm really looking for is a way to simplify adding and removing nodes >> into our (small) cluster: I can easily provide a small list of nodes from >> our cluster with our config management tool so that new nodes are >> discovering the rest of the cluster, but the documentation seems to imply >> that seed nodes also have other functions and I'm not sure what problems we >> could face trying to simplify this approach. >> >> Ideally, what I would like to have would be: >> >> * Considering a stable cluster (no new nodes, no nodes leaving), the N >> seeds should be always the same N nodes >> * Adding new nodes should not change that list >> * Stopping/removing one of these N nodes should "promote" another >> (non-seed) node as a seed >> - that would not restart the already running Cassandra nodes but would >> update their configuration files. >> - if a node restart for whatever reason it would pick up this new >> configuration >> >> So: no node would start its life as a seed, only a few already existing >> node would have this status. We would not have to deal with the "a seed >> node doesn't bootstrap" problem and it would make our operation process >> simpler. >> >> >>> > I also have some more general questions about seed nodes and how they >>> work: >>> > >>> > * I understand that seed nodes are used when a node starts and needs >>> to discover the rest of the cluster's nodes. Once the node has joined and >>> the cluster is stable, are seed nodes still playing a role in day to day >>> operations? >>> >>> They’re used probabilistically in gossip to encourage convergence. >>> Mostly useful in large clusters. >>> >> >> How "large" are we speaking here? How many nodes would it start to be >> considered "large"? >> > > ~800-1000 > Alllrriigght, we still have a long way to go :) Jonathan
Re: How seed nodes are working and how to upgrade/replace them?
I've done some gossip simulations in the past and found virtually no difference in the time it takes for messages to propagate in almost any sized cluster. IIRC it always converges by 17 iterations. Thus, I completely agree with Jeff's comment here. If you aren't pushing 800-1000 nodes, it's not even worth bothering with. Just be sure you have seeds in each DC. Something to be aware of - there's only a chance to gossip with a seed. That chance goes down as cluster size increases, meaning seeds have less and less of an impact as the cluster grows. Once you get to 100+ nodes, a given node is very rarely talking to a seed. Just make sure when you start a node it's not in its own seed list and you're good. On Tue, Jan 8, 2019 at 9:39 AM Jeff Jirsa wrote: > > > On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet wrote: > >> Hi Jeff, >> >> thanks for answering to most of my points! >> From the reloadseeds' ticket, I followed to >> https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very >> instructive, although a bit old. >> >> >> On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote: >> >>> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet >>> wrote: >>> > >>> [...] >>> >>> > In essence, in my example that would be: >>> > >>> > - decide that #2 and #3 will be the new seed nodes >>> > - update all the configuration files of all the nodes to write the >>> IP addresses of #2 and #3 >>> > - DON'T restart any node - the new seed configuration will be picked >>> up only if the Cassandra process restarts >>> > >>> > * If I can manage to sort my Cassandra nodes by their age, could it be >>> a strategy to have the seeds set to the 2 oldest nodes in the cluster? >>> (This implies these nodes would change as the cluster's nodes get >>> upgraded/replaced). >>> >>> You could do this, seems like a lot of headache for little benefit. 
>>> Could be done with simple seed provider and config management >>> (puppet/chef/ansible) laying down new yaml or with your own seed provider >>> >> >> So, just to make it clear: sorting by age isn't a goal in itself, it was >> just an example on how I could get a stable list. >> >> Right now, we have a dedicated group of seed nodes + a dedicated group >> for non-seeds: doing rolling-upgrade of the nodes from the second list is >> relatively painless (although slow) whereas we are facing the issues >> discussed in CASSANDRA-3829 for the first group which are non-seeds nodes >> are not bootstrapping automatically and we need to operate them in a more >> careful way. >> >> > Rolling upgrade shouldn't need to re-bootstrap. Only replacing a host > should need a new bootstrap. That should be a new host in your list, so it > seems like this should be fairly rare? > > >> What I'm really looking for is a way to simplify adding and removing >> nodes into our (small) cluster: I can easily provide a small list of nodes >> from our cluster with our config management tool so that new nodes are >> discovering the rest of the cluster, but the documentation seems to imply >> that seed nodes also have other functions and I'm not sure what problems we >> could face trying to simplify this approach. >> >> Ideally, what I would like to have would be: >> >> * Considering a stable cluster (no new nodes, no nodes leaving), the N >> seeds should be always the same N nodes >> * Adding new nodes should not change that list >> * Stopping/removing one of these N nodes should "promote" another >> (non-seed) node as a seed >> - that would not restart the already running Cassandra nodes but would >> update their configuration files. >> - if a node restart for whatever reason it would pick up this new >> configuration >> >> So: no node would start its life as a seed, only a few already existing >> node would have this status. 
We would not have to deal with the "a seed >> node doesn't bootstrap" problem and it would make our operation process >> simpler. >> >> >>> > I also have some more general questions about seed nodes and how they >>> work: >>> > >>> > * I understand that seed nodes are used when a node starts and needs >>> to discover the rest of the cluster's nodes. Once the node has joined and >>> the cluster is stable, are seed nodes still playing a role in day to day >>> operations? >>> >>> They’re used probabilistically in gossip to encourage convergence. >>> Mostly useful in large clusters. >>> >> >> How "large" are we speaking here? How many nodes would it start to be >> considered "large"? >> > > ~800-1000 > > >> Also, about the convergence: is this related to how fast/often the >> cluster topology is changing? (new nodes, leaving nodes, underlying IP >> addresses changing, etc.) >> >> > New nodes, nodes going up/down, and schema propagation. > > >> Thanks for your answers! >> >> Jonathan >> > -- Jon Haddad http://www.rustyrazorblade.com twitter: rustyrazorblade
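Jon's observation about convergence is easy to sanity-check with a toy model. The Python sketch below simulates naive push gossip (a rumour starts at one node, and every informed node pushes to one uniformly random peer per round); this is a deliberate simplification, not Cassandra's actual gossip protocol:

```python
import random

def gossip_rounds(n, rng):
    """Rounds until all n nodes have heard a rumour started at node 0,
    with each informed node pushing to one random peer per round.
    Toy model only: real gossip piggybacks cluster state and gives
    seeds an extra chance of being contacted."""
    informed = {0}
    rounds = 0
    while len(informed) < n:
        rounds += 1
        # every currently-informed node pushes to one random peer
        for _ in range(len(informed)):
            informed.add(rng.randrange(n))
    return rounds

rng = random.Random(42)
for n in (100, 1000):
    print(n, "nodes ->", gossip_rounds(n, rng), "rounds")
```

Since the informed set can at most double each round, n = 1000 needs at least ⌈log2 1000⌉ = 10 rounds, and the coupon-collector tail only adds a handful more, which is consistent with the "converges by ~17 iterations" observation above.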
Re: How seed nodes are working and how to upgrade/replace them?
On Tue, Jan 8, 2019 at 8:19 AM Jonathan Ballet wrote: > Hi Jeff, > > thanks for answering to most of my points! > From the reloadseeds' ticket, I followed to > https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very > instructive, although a bit old. > > > On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote: > >> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote: >> > >> [...] >> >> > In essence, in my example that would be: >> > >> > - decide that #2 and #3 will be the new seed nodes >> > - update all the configuration files of all the nodes to write the IP >> addresses of #2 and #3 >> > - DON'T restart any node - the new seed configuration will be picked >> up only if the Cassandra process restarts >> > >> > * If I can manage to sort my Cassandra nodes by their age, could it be >> a strategy to have the seeds set to the 2 oldest nodes in the cluster? >> (This implies these nodes would change as the cluster's nodes get >> upgraded/replaced). >> >> You could do this, seems like a lot of headache for little benefit. Could >> be done with simple seed provider and config management >> (puppet/chef/ansible) laying down new yaml or with your own seed provider >> > > So, just to make it clear: sorting by age isn't a goal in itself, it was > just an example on how I could get a stable list. > > Right now, we have a dedicated group of seed nodes + a dedicated group for > non-seeds: doing rolling-upgrade of the nodes from the second list is > relatively painless (although slow) whereas we are facing the issues > discussed in CASSANDRA-3829 for the first group which are non-seeds nodes > are not bootstrapping automatically and we need to operate them in a more > careful way. > > Rolling upgrade shouldn't need to re-bootstrap. Only replacing a host should need a new bootstrap. That should be a new host in your list, so it seems like this should be fairly rare? 
> What I'm really looking for is a way to simplify adding and removing nodes > into our (small) cluster: I can easily provide a small list of nodes from > our cluster with our config management tool so that new nodes are > discovering the rest of the cluster, but the documentation seems to imply > that seed nodes also have other functions and I'm not sure what problems we > could face trying to simplify this approach. > > Ideally, what I would like to have would be: > > * Considering a stable cluster (no new nodes, no nodes leaving), the N > seeds should be always the same N nodes > * Adding new nodes should not change that list > * Stopping/removing one of these N nodes should "promote" another > (non-seed) node as a seed > - that would not restart the already running Cassandra nodes but would > update their configuration files. > - if a node restart for whatever reason it would pick up this new > configuration > > So: no node would start its life as a seed, only a few already existing > node would have this status. We would not have to deal with the "a seed > node doesn't bootstrap" problem and it would make our operation process > simpler. > > >> > I also have some more general questions about seed nodes and how they >> work: >> > >> > * I understand that seed nodes are used when a node starts and needs to >> discover the rest of the cluster's nodes. Once the node has joined and the >> cluster is stable, are seed nodes still playing a role in day to day >> operations? >> >> They’re used probabilistically in gossip to encourage convergence. Mostly >> useful in large clusters. >> > > How "large" are we speaking here? How many nodes would it start to be > considered "large"? > ~800-1000 > Also, about the convergence: is this related to how fast/often the cluster > topology is changing? (new nodes, leaving nodes, underlying IP addresses > changing, etc.) > > New nodes, nodes going up/down, and schema propagation. > Thanks for your answers! > > Jonathan >
Re: How seed nodes are working and how to upgrade/replace them?
Given Consul's popularity, seems like someone could make an argument that we should be shipping a consul-aware seed provider. On Tue, Jan 8, 2019 at 7:39 AM Jonathan Ballet wrote: > On Mon, 7 Jan 2019 at 16:51, Oleksandr Shulgin < > oleksandr.shul...@zalando.de> wrote: > >> On Mon, Jan 7, 2019 at 3:37 PM Jonathan Ballet >> wrote: >> >>> >>> I'm working on how we could improve the upgrades of our servers and how >>> to replace them completely (new instance with a new IP address). >>> What I would like to do is to replace the machines holding our current >>> seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular >>> basis: >>> >>> * Is it possible to "promote" any non-seed node as a seed node? >>> >>> * Is it possible to "promote" a new seed node without having to restart >>> all the nodes? >>> In essence, in my example that would be: >>> >>> - decide that #2 and #3 will be the new seed nodes >>> - update all the configuration files of all the nodes to write the IP >>> addresses of #2 and #3 >>> - DON'T restart any node - the new seed configuration will be picked >>> up only if the Cassandra process restarts >>> >> >> You can provide a custom implementation of the seed provider protocol: >> org.apache.cassandra.locator.SeedProvider >> >> We were exploring that approach few years ago with etcd, which I think >> provides capabilities similar to that of Consul: >> https://github.com/a1exsh/cassandra-etcd-seed-provider/blob/master/src/main/java/org/zalando/cassandra/locator/EtcdSeedProvider.java >> > > Hi Alex, > > we were using also a dedicated Consul seed provider but we weren't > confident enough about maintaining our version so we removed it in favor of > something simpler. > Ultimately, we hope(d) that delegating the maintenance of that list to an > external process (like Consul Template), directly updating the > configuration file, is (should be?) 
mostly similar without having to > maintain our own copy, built with the right version of Cassandra, etc. > > Thanks for the info though! > > Jonathan > >
Re: How seed nodes are working and how to upgrade/replace them?
Hi Jeff, thanks for answering to most of my points! >From the reloadseeds' ticket, I followed to https://issues.apache.org/jira/browse/CASSANDRA-3829 which was very instructive, although a bit old. On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa wrote: > > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote: > > > [...] > > > In essence, in my example that would be: > > > > - decide that #2 and #3 will be the new seed nodes > > - update all the configuration files of all the nodes to write the IP > addresses of #2 and #3 > > - DON'T restart any node - the new seed configuration will be picked > up only if the Cassandra process restarts > > > > * If I can manage to sort my Cassandra nodes by their age, could it be a > strategy to have the seeds set to the 2 oldest nodes in the cluster? (This > implies these nodes would change as the cluster's nodes get > upgraded/replaced). > > You could do this, seems like a lot of headache for little benefit. Could > be done with simple seed provider and config management > (puppet/chef/ansible) laying down new yaml or with your own seed provider > So, just to make it clear: sorting by age isn't a goal in itself, it was just an example on how I could get a stable list. Right now, we have a dedicated group of seed nodes + a dedicated group for non-seeds: doing rolling-upgrade of the nodes from the second list is relatively painless (although slow) whereas we are facing the issues discussed in CASSANDRA-3829 for the first group which are non-seeds nodes are not bootstrapping automatically and we need to operate them in a more careful way. What I'm really looking for is a way to simplify adding and removing nodes into our (small) cluster: I can easily provide a small list of nodes from our cluster with our config management tool so that new nodes are discovering the rest of the cluster, but the documentation seems to imply that seed nodes also have other functions and I'm not sure what problems we could face trying to simplify this approach. 
Ideally, what I would like to have would be: * Considering a stable cluster (no new nodes, no nodes leaving), the N seeds should be always the same N nodes * Adding new nodes should not change that list * Stopping/removing one of these N nodes should "promote" another (non-seed) node as a seed - that would not restart the already running Cassandra nodes but would update their configuration files. - if a node restart for whatever reason it would pick up this new configuration So: no node would start its life as a seed, only a few already existing node would have this status. We would not have to deal with the "a seed node doesn't bootstrap" problem and it would make our operation process simpler. > > I also have some more general questions about seed nodes and how they > work: > > > > * I understand that seed nodes are used when a node starts and needs to > discover the rest of the cluster's nodes. Once the node has joined and the > cluster is stable, are seed nodes still playing a role in day to day > operations? > > They’re used probabilistically in gossip to encourage convergence. Mostly > useful in large clusters. > How "large" are we speaking here? How many nodes would it start to be considered "large"? Also, about the convergence: is this related to how fast/often the cluster topology is changing? (new nodes, leaving nodes, underlying IP addresses changing, etc.) Thanks for your answers! Jonathan
Re: How seed nodes are working and how to upgrade/replace them?
On Mon, 7 Jan 2019 at 16:51, Oleksandr Shulgin wrote: > On Mon, Jan 7, 2019 at 3:37 PM Jonathan Ballet wrote: > >> >> I'm working on how we could improve the upgrades of our servers and how >> to replace them completely (new instance with a new IP address). >> What I would like to do is to replace the machines holding our current >> seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular >> basis: >> >> * Is it possible to "promote" any non-seed node as a seed node? >> >> * Is it possible to "promote" a new seed node without having to restart >> all the nodes? >> In essence, in my example that would be: >> >> - decide that #2 and #3 will be the new seed nodes >> - update all the configuration files of all the nodes to write the IP >> addresses of #2 and #3 >> - DON'T restart any node - the new seed configuration will be picked up >> only if the Cassandra process restarts >> > > You can provide a custom implementation of the seed provider protocol: > org.apache.cassandra.locator.SeedProvider > > We were exploring that approach few years ago with etcd, which I think > provides capabilities similar to that of Consul: > https://github.com/a1exsh/cassandra-etcd-seed-provider/blob/master/src/main/java/org/zalando/cassandra/locator/EtcdSeedProvider.java > Hi Alex, we were also using a dedicated Consul seed provider, but we weren't confident enough about maintaining our version, so we removed it in favor of something simpler. Ultimately, we hope(d) that delegating the maintenance of that list to an external process (like Consul Template) directly updating the configuration file is (should be?) mostly equivalent, without having to maintain our own copy built against the right version of Cassandra, etc. Thanks for the info though! Jonathan
Re: How seed nodes are working and how to upgrade/replace them?
> On Jan 7, 2019, at 8:23 AM, Jeff Jirsa wrote: > > > > >> On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote: >> >> Hi, >> >> I'm trying to understand how seed nodes are working, when and how do they >> play a part in a Cassandra cluster, and how they should be managed and >> propagated to other nodes. >> >> I have a cluster of 6 Cassandra nodes (let's call them #1 to #6), on which >> node #1 and #2 are seeds. All the configuration files of all the Cassandra >> nodes are currently configured with: >> >> ``` >> seed_provider: >> - class_name: org.apache.cassandra.locator.SimpleSeedProvider >> parameters: >> - seeds: 'IP #1,IP #2' >> ``` >> >> We are using a service discovery tool (Consul) which automatically registers >> new Cassandra nodes with its dedicated health-check and are able to generate >> new configuration based on the content of the service discovery status (with >> Consul-Template). >> >> >> I'm working on how we could improve the upgrades of our servers and how to >> replace them completely (new instance with a new IP address). >> What I would like to do is to replace the machines holding our current seeds >> (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular basis: >> >> * Is it possible to "promote" any non-seed node as a seed node? >> > > Yes - generally you can make any node a seed if you want > >> * Is it possible to "promote" a new seed node without having to restart all >> the nodes? 
> > nodetool reloadseeds This is apparently in 4.0+ https://issues.apache.org/jira/browse/CASSANDRA-14190 > > There are a few weird edge cases where seeds are reloaded automatically and > we don’t document how or why (it’s a side effect of an error condition in > hosts going up/down, but it’s generally pretty minor unless your seed > provider is broken) > > > (Also true that you could write a seed provider that did this automatically) > > >> In essence, in my example that would be: >> >> - decide that #2 and #3 will be the new seed nodes >> - update all the configuration files of all the nodes to write the IP >> addresses of #2 and #3 >> - DON'T restart any node - the new seed configuration will be picked up >> only if the Cassandra process restarts >> >> * If I can manage to sort my Cassandra nodes by their age, could it be a >> strategy to have the seeds set to the 2 oldest nodes in the cluster? (This >> implies these nodes would change as the cluster's nodes get >> upgraded/replaced). > > You could do this, seems like a lot of headache for little benefit. Could be > done with simple seed provider and config management (puppet/chef/ansible) > laying down new yaml or with your own seed provider > >> >> >> I also have some more general questions about seed nodes and how they work: >> >> * I understand that seed nodes are used when a node starts and needs to >> discover the rest of the cluster's nodes. Once the node has joined and the >> cluster is stable, are seed nodes still playing a role in day to day >> operations? > > They’re used probabilistically in gossip to encourage convergence. Mostly > useful in large clusters. > >> >> * The documentation says multiple times that not all nodes should be seed >> nodes, but I didn't really find any place about the consequences it has to >> have "too many" seed nodes. 
> > Decreases effectiveness of probabilistic gossiping with seed for convergence > >> Also, relatively to the questions I asked above, is there any downsides of >> having changing seed nodes in a cluster? (with the exact same, at some point >> I define #1 and #2 to be seeds, then later #4 and #5, etc.) >> > > No > >> >> Thanks for helping me to understand better how seeds are working! >> >> Jonathan >>
Re: How seed nodes are working and how to upgrade/replace them?
> On Jan 7, 2019, at 6:37 AM, Jonathan Ballet wrote: > > Hi, > > I'm trying to understand how seed nodes are working, when and how do they > play a part in a Cassandra cluster, and how they should be managed and > propagated to other nodes. > > I have a cluster of 6 Cassandra nodes (let's call them #1 to #6), on which > node #1 and #2 are seeds. All the configuration files of all the Cassandra > nodes are currently configured with: > > ``` > seed_provider: > - class_name: org.apache.cassandra.locator.SimpleSeedProvider > parameters: > - seeds: 'IP #1,IP #2' > ``` > > We are using a service discovery tool (Consul) which automatically registers > new Cassandra nodes with its dedicated health-check and are able to generate > new configuration based on the content of the service discovery status (with > Consul-Template). > > > I'm working on how we could improve the upgrades of our servers and how to > replace them completely (new instance with a new IP address). > What I would like to do is to replace the machines holding our current seeds > (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular basis: > > * Is it possible to "promote" any non-seed node as a seed node? > Yes - generally you can make any node a seed if you want > * Is it possible to "promote" a new seed node without having to restart all > the nodes? 
nodetool reloadseeds There are a few weird edge cases where seeds are reloaded automatically and we don’t document how or why (it’s a side effect of an error condition in hosts going up/down, but it’s generally pretty minor unless your seed provider is broken) (Also true that you could write a seed provider that did this automatically) > In essence, in my example that would be: > > - decide that #2 and #3 will be the new seed nodes > - update all the configuration files of all the nodes to write the IP > addresses of #2 and #3 > - DON'T restart any node - the new seed configuration will be picked up > only if the Cassandra process restarts > > * If I can manage to sort my Cassandra nodes by their age, could it be a > strategy to have the seeds set to the 2 oldest nodes in the cluster? (This > implies these nodes would change as the cluster's nodes get > upgraded/replaced). You could do this, seems like a lot of headache for little benefit. Could be done with simple seed provider and config management (puppet/chef/ansible) laying down new yaml or with your own seed provider > > > I also have some more general questions about seed nodes and how they work: > > * I understand that seed nodes are used when a node starts and needs to > discover the rest of the cluster's nodes. Once the node has joined and the > cluster is stable, are seed nodes still playing a role in day to day > operations? They’re used probabilistically in gossip to encourage convergence. Mostly useful in large clusters. > > * The documentation says multiple times that not all nodes should be seed > nodes, but I didn't really find any place about the consequences it has to > have "too many" seed nodes. Decreases effectiveness of probabilistic gossiping with seed for convergence > Also, relatively to the questions I asked above, is there any downsides of > having changing seed nodes in a cluster? (with the exact same, at some point > I define #1 and #2 to be seeds, then later #4 and #5, etc.) 
> No > > Thanks for helping me to understand better how seeds are working! > > Jonathan >
Re: How seed nodes are working and how to upgrade/replace them?
On Mon, Jan 7, 2019 at 3:37 PM Jonathan Ballet wrote: > > I'm working on how we could improve the upgrades of our servers and how to > replace them completely (new instance with a new IP address). > What I would like to do is to replace the machines holding our current > seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular > basis: > > * Is it possible to "promote" any non-seed node as a seed node? > > * Is it possible to "promote" a new seed node without having to restart > all the nodes? > In essence, in my example that would be: > > - decide that #2 and #3 will be the new seed nodes > - update all the configuration files of all the nodes to write the IP > addresses of #2 and #3 > - DON'T restart any node - the new seed configuration will be picked up > only if the Cassandra process restarts > You can provide a custom implementation of the seed provider protocol: org.apache.cassandra.locator.SeedProvider We were exploring that approach a few years ago with etcd, which I think provides capabilities similar to that of Consul: https://github.com/a1exsh/cassandra-etcd-seed-provider/blob/master/src/main/java/org/zalando/cassandra/locator/EtcdSeedProvider.java We are not using this anymore, but for other reasons (namely, being too optimistic about putting a Cassandra cluster into an AWS AutoScaling Group). The SeedProvider itself seemed to work as we expected. Hope this helps, -- Alex
How seed nodes are working and how to upgrade/replace them?
Hi, I'm trying to understand how seed nodes work, when and how they play a part in a Cassandra cluster, and how they should be managed and propagated to other nodes. I have a cluster of 6 Cassandra nodes (let's call them #1 to #6), on which nodes #1 and #2 are seeds. All the configuration files of all the Cassandra nodes are currently configured with: ``` seed_provider: - class_name: org.apache.cassandra.locator.SimpleSeedProvider parameters: - seeds: 'IP #1,IP #2' ``` We are using a service discovery tool (Consul) which automatically registers new Cassandra nodes with its dedicated health-check, and we are able to generate new configuration based on the content of the service discovery status (with Consul-Template). I'm working on how we could improve the upgrades of our servers and how to replace them completely (new instance with a new IP address). What I would like to do is to replace the machines holding our current seeds (#1 and #2 at the moment) in a rolling upgrade fashion, on a regular basis: * Is it possible to "promote" any non-seed node as a seed node? * Is it possible to "promote" a new seed node without having to restart all the nodes? In essence, in my example that would be: - decide that #2 and #3 will be the new seed nodes - update all the configuration files of all the nodes to write the IP addresses of #2 and #3 - DON'T restart any node - the new seed configuration will be picked up only if the Cassandra process restarts * If I can manage to sort my Cassandra nodes by their age, could it be a strategy to have the seeds set to the 2 oldest nodes in the cluster? (This implies these nodes would change as the cluster's nodes get upgraded/replaced). I also have some more general questions about seed nodes and how they work: * I understand that seed nodes are used when a node starts and needs to discover the rest of the cluster's nodes.
Once the node has joined and the cluster is stable, are seed nodes still playing a role in day-to-day operations? * The documentation says multiple times that not all nodes should be seed nodes, but I didn't really find anything about the consequences of having "too many" seed nodes. Also, relative to the questions I asked above, are there any downsides to having changing seed nodes in a cluster? (with the exact same cluster: at some point I define #1 and #2 to be seeds, then later #4 and #5, etc.) Thanks for helping me to understand better how seeds are working! Jonathan
Re: auto_bootstrap for seed node
Setting auto_bootstrap on seed nodes is unnecessary and irrelevant. If the node is a seed it will ignore auto_bootstrap and it *will not* bootstrap. On 28 March 2018 at 15:49, Ali Hubail <ali.hub...@petrolink.com> wrote: > "it seems that we still need to keep bootstrap false?" > > Could you shed some light on what would happen if the auto_bootstrap is > removed (or set to true as the default value) in the seed nodes of the > newly added DC? > > What do you have in the seeds param of the new DC nodes (cassandra.yaml)? > Do you reference the old DC seed nodes there as well? > > *Ali Hubail* > > Email: ali.hub...@petrolink.com | www.petrolink.com > Confidentiality warning: This message and any attachments are intended > only for the persons to whom this message is addressed, are confidential, > and may be privileged. If you are not the intended recipient, you are > hereby notified that any review, retransmission, conversion to hard copy, > copying, modification, circulation or other use of this message and any > attachments is strictly prohibited. If you receive this message in error, > please notify the sender immediately by return email, and delete this > message and any attachments from your system. Petrolink International > Limited its subsidiaries, holding companies and affiliates disclaims all > responsibility from and accepts no liability whatsoever for the > consequences of any unauthorized person acting, or refraining from acting, > on any information contained in this message. For security purposes, staff > training, to assist in resolving complaints and to improve our customer > service, email communications may be monitored and telephone calls may be > recorded. 
Re: auto_bootstrap for seed node
"it seems that we still need to keep bootstrap false?" Could you shed some light on what would happen if the auto_bootstrap is removed (or set to true as the default value) in the seed nodes of the newly added DC? What do you have in the seeds param of the new DC nodes (cassandra.yaml)? Do you reference the old DC seed nodes there as well? Ali Hubail Email: ali.hub...@petrolink.com | www.petrolink.com "Peng Xiao" <2535...@qq.com> 03/28/2018 12:54 AM Please respond to user@cassandra.apache.org To "user" <user@cassandra.apache.org>, cc Subject Re: auto_bootstrap for seed node We followed this https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html , but it does not mention changing bootstrap for seed nodes after the rebuild.
Thanks, Peng Xiao
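The distinction Ali draws (auto_bootstrap for a single new node, rebuild for a whole new DC) can be sketched as a cassandra.yaml fragment for a node joining a new DC; the addresses and DC name below are hypothetical, not from the thread:

```yaml
# Hypothetical cassandra.yaml fragment for a node joining a *new* DC.
# auto_bootstrap: false tells the node not to stream data at startup;
# data is pulled later with `nodetool rebuild -- <existing-dc-name>`.
# Note that seed nodes ignore auto_bootstrap entirely and never bootstrap.
auto_bootstrap: false
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # include seeds from the existing DC so the new node can find the cluster
      - seeds: "10.0.1.10,10.0.1.11,10.1.1.10"
```

Once all nodes of the new DC are up, running `nodetool rebuild` against the existing DC's name on each node streams the data in; flipping auto_bootstrap back to true afterwards only matters for nodes added later.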
Re: auto_bootstrap for seed node
We followed this https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html, but it does not mention changing bootstrap for seed nodes after the rebuild. Thanks, Peng Xiao -- Original -- From: "Ali Hubail"<ali.hub...@petrolink.com>; Date: Wed, Mar 28, 2018 10:48 AM To: "user"<user@cassandra.apache.org>; Subject: Re: auto_bootstrap for seed node You might want to follow DataStax docs on this one: For adding a DC to an existing cluster: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/operations/opsAddDCToCluster.html For adding a new node to an existing cluster: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/operations/opsAddNodeToCluster.html Briefly speaking, adding one node to an existing cluster --> use auto_bootstrap; adding a DC to an existing cluster --> rebuild. You need to check the version of C* that you're running, and make sure you pick the right doc version for that. Most of my colleagues miss very important steps while adding/removing nodes/clusters, but if they stick to the docs, they always get it done right. Hope this helps Ali Hubail
Re: auto_bootstrap for seed node
You might want to follow DataStax docs on this one: For adding a DC to an existing cluster: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/operations/opsAddDCToCluster.html For adding a new node to an existing cluster: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/operations/opsAddNodeToCluster.html Briefly speaking, adding one node to an existing cluster --> use auto_bootstrap; adding a DC to an existing cluster --> rebuild. You need to check the version of C* that you're running, and make sure you pick the right doc version for that. Most of my colleagues miss very important steps while adding/removing nodes/clusters, but if they stick to the docs, they always get it done right. Hope this helps Ali Hubail
auto_bootstrap for seed node
Dear All, For adding a new DC, we need to set auto_bootstrap: false and then run the rebuild; finally we need to change auto_bootstrap: true, but for seed nodes, it seems that we still need to keep bootstrap false? Could anyone please confirm? Thanks, Peng Xiao
Re: cassl 2.1.x seed node update via JMX
What's the "one-year gossip bug" in this context? On Thu, Mar 22, 2018 at 3:26 PM, Carl Mueller <carl.muel...@smartthings.com> wrote: > Thanks. The rolling restart triggers the gossip bug so that's a no go. > We're going to migrate off the cluster. Thanks!
Re: cassl 2.1.x seed node update via JMX
Thanks. The rolling restart triggers the gossip bug so that's a no go. We're going to migrate off the cluster. Thanks! On Thu, Mar 22, 2018 at 5:04 PM, Nate McCall <n...@thelastpickle.com> wrote: > This capability was *just* added in CASSANDRA-14190 and only in trunk. > > Previously (as described in the ticket above), the seed node list is only > updated when doing a shadow round, removing an endpoint or restarting (look > for callers of o.a.c.gms.Gossiper#buildSeedsList() if you're curious). > > A rolling restart is the usual SOP for that.
Re: cassl 2.1.x seed node update via JMX
This capability was *just* added in CASSANDRA-14190 and only in trunk. Previously (as described in the ticket above), the seed node list is only updated when doing a shadow round, removing an endpoint or restarting (look for callers of o.a.c.gms.Gossiper#buildSeedsList() if you're curious). A rolling restart is the usual SOP for that. On Fri, Mar 23, 2018 at 9:54 AM, Carl Mueller <carl.muel...@smartthings.com> wrote: > We have a cluster that is subject to the one-year gossip bug. > > We'd like to update the seed node list via JMX without restart, since our > foolishly single-seed-node in this forsaken cluster is being autoculled in > AWS. > > Is this possible? It is not marked volatile in the Config of the source > code, so I doubt it. > -- - Nate McCall Wellington, NZ @zznate CTO Apache Cassandra Consulting http://www.thelastpickle.com
cassl 2.1.x seed node update via JMX
We have a cluster that is subject to the one-year gossip bug. We'd like to update the seed node list via JMX without restart, since our foolishly single-seed-node in this forsaken cluster is being autoculled in AWS. Is this possible? It is not marked volatile in the Config of the source code, so I doubt it.
Re: Seed nodes of DC2 creating own versions of system keyspaces
On Tue, Mar 6, 2018 at 8:28 PM, Jeff Jirsa <jji...@gmail.com> wrote: > > Sorry, I wasn't as precise as I should have been: > > In 3.0 and newer, a bootstrapping node will wait until it has schema > before it bootstraps. HOWEVER, we make the system_auth/system_distributed, > etc. keyspaces as a node starts up, before it requests the schema from the > rest of the cluster. > > You will see some schema exchanges go through the cluster as new 3.0 nodes > come online, but it's a no-op schema change. > Well, this I also see from the code, but it doesn't answer the question of "why". :) Is this again because of the very first seed node corner case? Will it hang indefinitely waiting for schema from other nodes if it would try? -- Alex
Re: Seed nodes of DC2 creating own versions of system keyspaces
On Tue, Mar 6, 2018 at 9:50 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote: > We are seeing this on 3.0.15, but if it's no longer the case with newer > versions, then fine. Sorry, I wasn't as precise as I should have been: In 3.0 and newer, a bootstrapping node will wait until it has schema before it bootstraps. HOWEVER, we make the system_auth/system_distributed, etc. keyspaces as a node starts up, before it requests the schema from the rest of the cluster. You will see some schema exchanges go through the cluster as new 3.0 nodes come online, but it's a no-op schema change.
Re: Seed nodes of DC2 creating own versions of system keyspaces
On 6 Mar 2018 16:55, "Jeff Jirsa" <jji...@gmail.com> wrote: > They do in 3.0 and newer, the built-in keyspaces still get auto created > before that happens We are seeing this on 3.0.15, but if it's no longer the case with newer versions, then fine. Thanks, -- Alex
Re: Seed nodes of DC2 creating own versions of system keyspaces
-- Jeff Jirsa > On Mar 6, 2018, at 12:32 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> > wrote: > > OK. Any specific reason why non-bootstrapping nodes don't wait for schema > propagation before joining the ring? > They do in 3.0 and newer, the built-in keyspaces still get auto created before that happens
Re: Seed nodes of DC2 creating own versions of system keyspaces
On 5 Mar 2018 16:13, "Jeff Jirsa" <jji...@gmail.com> wrote: They’re written with timestamp=0 to ensure they’re created at least once, but if you’ve ever issued an ALTER to the table or keyspace, your modified version will win through normal schema reconciliation process. OK. Any specific reason why non-bootstrapping nodes don't wait for schema propagation before joining the ring? -- Alex
Re: Seed nodes of DC2 creating own versions of system keyspaces
> On Mar 5, 2018, at 6:40 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> > wrote: > > Hi, > > We were deploying a second DC today with 3 seed nodes (30 nodes in total) and > we have noticed that all seed nodes reported the following: > > INFO 10:20:50 Create new Keyspace: KeyspaceMetadata{name=system_traces, > params=KeyspaceParams{durable_writes=true, > replication=ReplicationParams{class=org.apache.cassandra.locator.SimpleStrategy, > replication_factor=2}}, ... > > followed by similar lines for system_distributed and system_auth. Is this to > be expected? They’re written with timestamp=0 to ensure they’re created at least once, but if you’ve ever issued an ALTER to the table or keyspace, your modified version will win through normal schema reconciliation process. > > Cassandra version is 3.0.15. The DC2 was added to NTS replication setting > for all of the non-local keyspaces in advance, even before starting any of > the new nodes. The schema versions reported by `nodetool describecluster' > are consistent accross DCs, that is: all nodes are on the same version. > > All new nodes use auto_bootstrap=true (in order for > allocate_tokens_for_keyspace=mydata_ks to take effect), the seeds ignore this > setting and report it. The non-seed nodes didn't try to create the system > keyspaces on their own. > > I would expect that even if we don't add the DC2 in advance, the new nodes > should be able to learn about existing system keyspaces and wouldn't try to > create their own. Ultimately we will run `nodetool rebuild' on every node in > DC2, but I would like to understand why this schema disagreement initially? > > Thanks, > -- > Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 > 127-59-707 >
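Jeff's timestamp=0 trick works because schema mutations reconcile by last-write-wins on timestamp: anything written at timestamp 0 loses to any real user write. A toy sketch of that rule follows; this is not Cassandra's actual code, and the names and dict shape are invented for illustration:

```python
# Toy last-write-wins reconciliation, as used conceptually for schema rows.
# Not Cassandra's actual implementation; names and types are made up.
def reconcile(a: dict, b: dict) -> dict:
    """Return whichever version carries the higher write timestamp."""
    return a if a["ts"] >= b["ts"] else b

# The built-in keyspace definition is written with timestamp 0 ...
builtin = {"ts": 0, "replication": "SimpleStrategy, rf=2"}
# ... so any user ALTER, written with a real microsecond timestamp, wins.
altered = {"ts": 1_520_000_000_000_000, "replication": "NetworkTopologyStrategy"}

print(reconcile(builtin, altered)["replication"])  # NetworkTopologyStrategy
```

This is why the auto-created keyspaces are harmless: if you have ever ALTERed them, your version always outranks the timestamp-0 re-creation.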
Seed nodes of DC2 creating own versions of system keyspaces
Hi, We were deploying a second DC today with 3 seed nodes (30 nodes in total) and we have noticed that all seed nodes reported the following: INFO 10:20:50 Create new Keyspace: KeyspaceMetadata{name=system_traces, params=KeyspaceParams{durable_writes=true, replication=ReplicationParams{class=org.apache.cassandra.locator.SimpleStrategy, replication_factor=2}}, ... followed by similar lines for system_distributed and system_auth. Is this to be expected? Cassandra version is 3.0.15. The DC2 was added to the NTS replication setting for all of the non-local keyspaces in advance, even before starting any of the new nodes. The schema versions reported by `nodetool describecluster' are consistent across DCs, that is: all nodes are on the same version. All new nodes use auto_bootstrap=true (in order for allocate_tokens_for_keyspace=mydata_ks to take effect); the seeds ignore this setting and report it. The non-seed nodes didn't try to create the system keyspaces on their own. I would expect that even if we don't add the DC2 in advance, the new nodes should be able to learn about existing system keyspaces and wouldn't try to create their own. Ultimately we will run `nodetool rebuild' on every node in DC2, but I would like to understand why this schema disagreement initially? Thanks, -- Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707
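The even token ownership that allocate_tokens_for_keyspace aims for is easy to sanity-check: with replication factor RF spread over N nodes in a DC, `nodetool status` should show each node owning roughly RF/N of that DC's data. A quick sketch of the arithmetic; the RF and node counts here are example values, not taken from the thread:

```python
# Rough expected "Owns" percentage per node in one DC, assuming an even
# token distribution. Example numbers only; plug in your own RF / node count.
def expected_owns_pct(rf: int, nodes_in_dc: int) -> float:
    """Each node replicates rf/nodes of the DC's token ranges, on average."""
    return 100.0 * rf / nodes_in_dc

print(round(expected_owns_pct(3, 30), 1))  # 30 nodes, RF=3 -> 10.0 (%)
```

Large, persistent deviations from that figure usually point at uneven token allocation rather than normal variance.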
RE: On a 12-node Cluster, Starting C* on a Seed Node Increases Read Latency from 150ms to 1.5 sec.
I understand you use Apache Cassandra 2.2.8. :) - Yes, it was a typo. In Apache Cassandra 2.2.8, this triggers incremental repairs I believe - Yes, the default as of 2.2, and using primary range, which means repair runs on every node in the cluster. Did you replace the node in-place? - Yes. We removed it from its seed provider list; otherwise, it won't bootstrap. You should be able to have nodes going down, or being fairly slow … - When we stopped C* on this node, read performance recovered well. Once started, and now with no repairs running at all, latency increased back to over 1.5 secs. This affected the node (in AZ 1) and the other 8 nodes (4 in AZ 2 and 4 in AZ 3). That is, it slowed down the other 2 AZs. The application reads with CL=LOCAL_QUORUM. This behavior I do not understand; there is no streaming. My coworker Alexander wrote about this a few months ago - We have been looking into Reaper for the past 2 months. Work in progress. And thank you for the thorough response.
Re: On a 12-node Cluster, Starting C* on a Seed Node Increases Read Latency from 150ms to 1.5 sec.
Hello, This is a 2.8.8. cluster That's an exotic version! I understand you use Apache Cassandra 2.2.8. :) This single node was a seed node and it was running a ‘repair -pr’ at the time In Apache Cassandra 2.2.8, this triggers incremental repairs I believe, and they are relatively (some would say completely) broken. Let's say they caused a lot of troubles in many cases. If I am wrong and you are not running incremental repairs (the default in your version off the top of my head) then your node might not have enough resources available to handle both the repair and the standard load. It might be something to check. Consequences of incremental repairs are: - Keeping SSTables split between repaired and not repaired tables, increasing the number of SSTables - Anti-compaction (splits SSTables) is used to keep them grouped. This induces a lot of performance downsides such as (but not only): - inefficient tombstone eviction - More disk hits for the same queries - More compaction work Machines are then performing very poorly. My coworker Alexander wrote about this a few months ago, it might be of interest: http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html If repairs are a pain point, you might be interested in checking http://cassandra-reaper.io/, which aims at making this operation easier and more efficient. I would say the fact this node is a seed node did not impact here, it is a coincidence due to the fact you picked a seed for the repair. Seed nodes are mostly working as any other node, except during bootstrap. So we decided to bootstrap it. I am not sure what happens when bootstrapping a seed node. I always removed it from the seed list first. Did you replace the node in-place? I guess if you had no warning and have no consistency issues, it's all good. All we were able to see is that the seed node in question was different in that it had 5000 sstables while all others had around 2300. After bootstrap, seed node sstables reduced to 2500. I would say this is fairly common (even more when using vnodes) as streaming of the data from all the other nodes is fast and compaction might take a while to catch up. Why would starting C* on a single seed node affect the cluster this bad? That's a fair question. It depends on factors such as the client configuration, the replication factor, the consistency level used. If the node is involved in some reads, then the average latency will go up. You should be able to have nodes going down, or being fairly slow, and use the right nodes if the client is recent enough and well configured. Is it gossip? It might be, there were issues, but I believe in previous versions and / or on bigger clusters. I would dig for a 'repair' issue first, it seems more probable to me. I hope this helped, C*heers, --- Alain Rodriguez - @arodream - al...@thelastpickle.com France / Spain The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com 2018-03-02 14:42 GMT+00:00 Fd Habash <fmhab...@gmail.com>: > This is a 2.8.8. cluster with three AWS AZs, each with 4 nodes. > > > > Few days ago we noticed a single node’s read latency reaching 1.5 secs > there was 8 others with read latencies going up near 900 ms. > > > > This single node was a seed node and it was running a ‘repair -pr’ at the > time. We intervened as follows … > > > >- Stopping compactions during repair did not improve latency. >- Killing repair brought down latency to 200 ms on the seed node and >the other 8. >- Restarting C* on the seed node increased latency again back to near >1.5 secs on the seed and other 8. At this point, there was no repair >running and compactions were running. We left them alone. > > > > At this point, we saw that putting the seed node back in the cluster > consistently worsened latencies on seed and 8 nodes = 9 out of the 12 nodes > in the cluster. > > > > So we decided to bootstrap it.
During the bootstrapping and afterwards, > latencies remained near 200 ms which is what we wanted for now. > > > > All we were able to see is that the seed node in question was different in > that it had 5000 sstables while all others had around 2300. After > bootstrap, seed node sstables reduced to 2500. > > > > Why would starting C* on a single seed node affect the cluster this bad? > Again, no repair just 4 compactions that run routinely on it as well all > others. Is it gossip? What other plausible explanations are there? > > > > > Thank you > > >
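Following Alain's advice above, the way to sidestep the 2.2.x incremental-repair default is to ask for a full, primary-range repair explicitly. A command sketch (the keyspace name "my_ks" is hypothetical; this only runs on an actual Cassandra node):

```shell
# On 2.2.x, plain "nodetool repair" runs an incremental repair by default.
# -full forces the pre-2.2 full-repair behaviour; -pr limits the repair to
# this node's primary token ranges so ranges are not repaired repeatedly
# when the command is run on every node in turn.
nodetool repair -full -pr my_ks
```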
On a 12-node Cluster, Starting C* on a Seed Node Increases Read Latency from 150ms to 1.5 sec.
This is a 2.8.8. cluster with three AWS AZs, each with 4 nodes. A few days ago we noticed a single node’s read latency reaching 1.5 secs; there were 8 others with read latencies going up near 900 ms.

This single node was a seed node and it was running a ‘repair -pr’ at the time. We intervened as follows …

• Stopping compactions during repair did not improve latency.
• Killing repair brought down latency to 200 ms on the seed node and the other 8.
• Restarting C* on the seed node increased latency again back to near 1.5 secs on the seed and other 8. At this point, there was no repair running and compactions were running. We left them alone.

At this point, we saw that putting the seed node back in the cluster consistently worsened latencies on seed and 8 nodes = 9 out of the 12 nodes in the cluster. So we decided to bootstrap it. During the bootstrapping and afterwards, latencies remained near 200 ms, which is what we wanted for now.

All we were able to see is that the seed node in question was different in that it had 5000 sstables while all others had around 2300. After bootstrap, seed node sstables reduced to 2500.

Why would starting C* on a single seed node affect the cluster this bad? Again, no repair, just 4 compactions that run routinely on it as well as on all others. Is it gossip? What other plausible explanations are there?

Thank you
Re: Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))
On Mon, Feb 26, 2018 at 7:05 PM, Jeff Jirsa wrote:

> I'll happily click the re-open button (you could have, too), but I'm not sure what the 'right' fix is. Feel free to move discussion to 5836.

Thanks, Jeff. Somehow, I don't see any control elements to change the issue status, even though I'm logged in, so I assume only project members / devs can do that.

--
Alex
Re: Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))
That ticket was before I was really active contributing, but I tend to agree with your assessment: clearly there's a pain point there, and we can do better than the status quo. The problem (as Jonathan notes) is that it's a complicated subsystem, and the "obvious" fix probably isn't as obvious as it seems.

I'll happily click the re-open button (you could have, too), but I'm not sure what the 'right' fix is. Feel free to move discussion to 5836.

On Mon, Feb 26, 2018 at 12:51 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:

> On Fri, Feb 23, 2018 at 7:35 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
>> It comes up from time to time. Rob Coli spent years arguing that this behavior was confusing ( https://issues.apache.org/jira/browse/CASSANDRA-5836 ), especially in the "I'm replacing a failed seed" sense. It also comes up when you're adding the first few hosts to a new DC (where they're new, but they're definitely going to be the seeds for the new DC).
>
> Jeff,
>
> I find the response on this ticket quite terrible: a number of independent reports of significant problems caused by this behavior don't justify the "Won't Fix" status, IMO.
>
> We were also hit by this one time, when the expected location of the data directory changed in our Docker image. We were performing a rolling update of the cluster, and the first two nodes that we updated happened to be seeds. They started happily with a blank data directory and were serving read requests. Ouch. We only realized there was a problem when the next node that we updated failed to start, and the only reason we noticed is that it *did* try to bootstrap and failed.
>
> People tend to repeat "seed nodes are not different from non-seeds", and it's true from the perspective of a client application. The same people would repeat "seeds don't bootstrap" as some kind of magical incantation, so seeds *are* different, and in a subtle way for the operator. But I don't believe that this difference is justified. When creating a brand new cluster there is no practical difference between using auto_bootstrap=true or false, because there is no data or clients, so the seed nodes behave exactly the same way as non-seeds. When adding a new DC you are supposed to set auto_bootstrap=false explicitly, so again no difference.
>
> Where it matters, however, is node behavior in *unexpected* circumstances. If seed nodes were truly not different from non-seeds in this regard, there would be fewer surprises, because of the total node uniformity within the cluster.
>
> Therefore, I argue that the ticket should be reopened.
>
> Regards,
> --
> Alex
Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))
On Fri, Feb 23, 2018 at 7:35 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> It comes up from time to time. Rob Coli spent years arguing that this behavior was confusing ( https://issues.apache.org/jira/browse/CASSANDRA-5836 ), especially in the "I'm replacing a failed seed" sense. It also comes up when you're adding the first few hosts to a new DC (where they're new, but they're definitely going to be the seeds for the new DC).

Jeff,

I find the response on this ticket quite terrible: a number of independent reports of significant problems caused by this behavior don't justify the "Won't Fix" status, IMO.

We were also hit by this one time, when the expected location of the data directory changed in our Docker image. We were performing a rolling update of the cluster, and the first two nodes that we updated happened to be seeds. They started happily with a blank data directory and were serving read requests. Ouch. We only realized there was a problem when the next node that we updated failed to start, and the only reason we noticed is that it *did* try to bootstrap and failed.

People tend to repeat "seed nodes are not different from non-seeds", and it's true from the perspective of a client application. The same people would repeat "seeds don't bootstrap" as some kind of magical incantation, so seeds *are* different, and in a subtle way for the operator. But I don't believe that this difference is justified. When creating a brand new cluster there is no practical difference between using auto_bootstrap=true or false, because there is no data or clients, so the seed nodes behave exactly the same way as non-seeds. When adding a new DC you are supposed to set auto_bootstrap=false explicitly, so again no difference.

Where it matters, however, is node behavior in *unexpected* circumstances. If seed nodes were truly not different from non-seeds in this regard, there would be fewer surprises, because of the total node uniformity within the cluster.

Therefore, I argue that the ticket should be reopened.

Regards,
--
Alex
Re: Replacing a Seed Node
On Thu, Aug 3, 2017 at 3:00 PM, Fd Habash <fmhab...@gmail.com> wrote:

> Hi all …
>
> I know there are plenty of docs on how to replace a seed node, but some steps are contradictory, e.g. the need to remove the node from the seed list for the entire cluster.
>
> My cluster has 6 nodes with 3 seeds running C* 2.8. One seed node was terminated by AWS.

Hi,

First of all -- are you using instance storage or EBS? If the latter: is it attached with a setting to delete the volume on instance termination? In other words: do you still have the data files from that node?

If you still have that EBS volume, you can start a replacement instance with that volume attached and the same private IP address (unless it was taken by another EC2 instance in the meantime). This would be the preferred way, since the node just comes UP again without bootstrapping and only needs to replay hints or be repaired (if it was down longer than max_hint_window, which is 3 hours by default).

> I came up with this procedure. Did I miss anything …
>
> 1. Remove the node (decomm or removenode) based on its current status
> 2. Remove the node from its own seed list
>    a. No need to remove it from other nodes. My cluster has 3 seeds
> 3. Restart C* with auto_bootstrap = true
> 4. Once auto-bootstrap is done, re-add the node as a seed in its own cassandra.yaml again
> 5. Restart C* on this node
> 6. No need to restart other nodes in the cluster

You won't be able to decommission if the node is not up. At the same time, you can avoid changing the topology twice (first to remove the dead node, then to bootstrap a new one) by using -Dcassandra.replace_address=172.31.xx.yyy, i.e. the address of that dead node. If your Cassandra version supports it, use replace_address_first_boot instead. This should bootstrap the node by streaming exactly the data your dead seed node was responsible for previously.

After this is done, you still need to do a rolling restart of all nodes, updating their seed list. You should remove the IP address of the dead seed and add the address of any currently healthy node, not necessarily the freshly bootstrapped one: consider balancing Availability Zones, so that you have a seed node in each AZ.

Regards,
--
Alex
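For reference, the JVM flag Alex mentions is typically set in cassandra-env.sh on the replacement node only; this is a config-file sketch (the address placeholder is carried over from the message above, and the flag must be removed again once the node has bootstrapped):

```shell
# cassandra-env.sh on the REPLACEMENT node only.
# replace_address_first_boot (where supported) is safer than replace_address:
# it is ignored automatically on any start after the first successful boot.
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=172.31.xx.yyy"
```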
Replacing a Seed Node
Hi all …

I know there are plenty of docs on how to replace a seed node, but some steps are contradictory, e.g. the need to remove the node from the seed list for the entire cluster.

My cluster has 6 nodes with 3 seeds running C* 2.8. One seed node was terminated by AWS. I came up with this procedure. Did I miss anything …

1) Remove the node (decomm or removenode) based on its current status
2) Remove the node from its own seed list
   a. No need to remove it from other nodes. My cluster has 3 seeds
3) Restart C* with auto_bootstrap = true
4) Once auto-bootstrap is done, re-add the node as a seed in its own cassandra.yaml again
5) Restart C* on this node
6) No need to restart other nodes in the cluster

Thank you
Re: first node in a cluster - should it be a seed node
No; any node can be a seed node. But to start the cluster, and to add nodes, you need some node acting as a seed. Make sure to include at least one node from each DC as a seed. DO NOT make all nodes seeds.

> On Jul 14, 2017, at 10:08 AM, Vikram Goyal G <vikram.g.go...@ericsson.com> wrote:
>
> Hello,
>
> Can you please comment on whether the first node in a cluster must be a seed node? Is it mandatory or not? How will it behave?
>
> Regards,
> Vikram
first node in a cluster - should it be a seed node
Hello,

Can you please comment on whether the first node in a cluster must be a seed node? Is it mandatory or not? How will it behave?

Regards,
Vikram
Re: Seed gossip version will not connect with that version
A restart could help with hung gossip, but check your network as a possible root cause for this.

Sent from my iPhone

> On Jul 5, 2017, at 7:23 AM, Jean Carlo <jean.jeancar...@gmail.com> wrote:
>
> Hello
>
> I have repairs that hang because of this problem:
>
> WARN [MessagingService-Outgoing-/10.0.0.143] 2017-07-04 10:29:50,076 OutboundTcpConnection.java:416 - Seed gossip version is -2147483648; will not connect with that version
> INFO [HANDSHAKE-/10.0.0.143] 2017-07-04 10:29:50,076 OutboundTcpConnection.java:496 - Cannot handshake version with /10.0.0.143
> INFO [HANDSHAKE-/10.0.0.143] 2017-07-04 10:29:50,090 OutboundTcpConnection.java:487 - Handshaking version with /10.0.0.143
>
> Is it enough to restart Cassandra on both nodes to solve this?
>
> Best regards
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
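For what it's worth, the odd number in the WARN line above is not corruption: -2147483648 is Java's Integer.MIN_VALUE, which (as far as I can tell from the Cassandra source) is used as the "messaging version not yet known" sentinel before a handshake with the peer completes. So the message means the node never learned the seed's gossip version, not that the seed advertised a bogus one. A quick sanity check of the constant:

```python
# -2147483648 is the smallest 32-bit signed integer, i.e. Java's
# Integer.MIN_VALUE. Cassandra logs it when no messaging version has
# been negotiated with the peer yet.
java_int_min = -(2 ** 31)
print(java_int_min)  # -2147483648
```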
Seed gossip version will not connect with that version
Hello

I have repairs that hang because of this problem:

WARN [MessagingService-Outgoing-/10.0.0.143] 2017-07-04 10:29:50,076 OutboundTcpConnection.java:416 - Seed gossip version is -2147483648; will not connect with that version
INFO [HANDSHAKE-/10.0.0.143] 2017-07-04 10:29:50,076 OutboundTcpConnection.java:496 - Cannot handshake version with /10.0.0.143
INFO [HANDSHAKE-/10.0.0.143] 2017-07-04 10:29:50,090 OutboundTcpConnection.java:487 - Handshaking version with /10.0.0.143

Is it enough to restart Cassandra on both nodes to solve this?

Best regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay
stress tool: random seed
Hi,

I'm populating the database with a YAML profile. Each time I run the stress tool I get the same rows, i.e. the same data is generated and no new rows appear. Is there an option to generate new data on each run? I would like to test a growing database, but don't want to insert all the data each time. I found a kind of workaround by changing the partition key distribution boundaries, but is there a better way?

Regards, Vlad
Re: Seed nodes as part of cluster
Awesome, thanks for the clarification. So why can't new nodes connect to ANY seed node IP that is returned by DNS? Why must the IPs be “hardcoded”?

— Roman

> On May 1, 2017, at 2:11 PM, daemeon reiydelle <daeme...@gmail.com> wrote:
>
> Caps below for emphasis, not shouting ;{)
>
> Seed nodes are IDENTICAL to all other nodes, or you will wish otherwise. Folks get confused because of terminology. I refer to this stuff as "the seed node service of a normal node". ANY NODE IS ABLE TO ACT AS A SEED NODE BY DEFINITION. Only the nodes listed as seeds in the config will be contacted, however.
>
> The seed "function" is only used by new nodes when they FIRST join the cluster for the FIRST time, then never used again (once a node joins the cluster it is using different protocols, a separate list of nodes, etc.).
>
> ...
>
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872
>
> On Mon, May 1, 2017 at 2:05 PM, Roman Naumenko <ro...@sproutling.com> wrote:
>> So they are like any other “data” node… but special?
>>
>> I’m so freaking confused by this seed node design.
>>
>> — Roman
>>
>>> On May 1, 2017, at 1:37 PM, vasu gunja <vasu.no...@gmail.com> wrote:
>>>
>>> A seed will contain metadata + actual data too.
>>>
>>> On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko <ro...@sproutling.com> wrote:
>>>> Hi,
>>>>
>>>> I’d like to confirm that seed nodes don't contain any data. Is that correct?
>>>>
>>>> Can the instances for seed nodes be a smaller size than for data nodes?
>>>>
>>>> Thank you
>>>> Roman
Re: Seed nodes as part of cluster
Caps below for emphasis, not shouting ;{)

Seed nodes are IDENTICAL to all other nodes, or you will wish otherwise. Folks get confused because of terminology. I refer to this stuff as "the seed node service of a normal node". ANY NODE IS ABLE TO ACT AS A SEED NODE BY DEFINITION. Only the nodes listed as seeds in the config will be contacted, however.

The seed "function" is only used by new nodes when they FIRST join the cluster for the FIRST time, then never used again (once a node joins the cluster it is using different protocols, a separate list of nodes, etc.).

...

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, May 1, 2017 at 2:05 PM, Roman Naumenko <ro...@sproutling.com> wrote:

> So they are like any other “data” node… but special?
>
> I’m so freaking confused by this seed node design.
>
> — Roman
>
> On May 1, 2017, at 1:37 PM, vasu gunja <vasu.no...@gmail.com> wrote:
>
> A seed will contain metadata + actual data too.
>
> On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko <ro...@sproutling.com> wrote:
>
>> Hi,
>>
>> I’d like to confirm that seed nodes don't contain any data. Is that correct?
>>
>> Can the instances for seed nodes be a smaller size than for data nodes?
>>
>> Thank you
>> Roman
Re: Seed nodes as part of cluster
So they are like any other “data” node… but special?

I’m so freaking confused by this seed node design.

— Roman

> On May 1, 2017, at 1:37 PM, vasu gunja <vasu.no...@gmail.com> wrote:
>
> A seed will contain metadata + actual data too.
>
> On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko <ro...@sproutling.com> wrote:
>
> Hi,
>
> I’d like to confirm that seed nodes don't contain any data. Is that correct?
>
> Can the instances for seed nodes be a smaller size than for data nodes?
>
> Thank you
> Roman
Re: Seed nodes as part of cluster
A seed will contain metadata + actual data too.

On Mon, May 1, 2017 at 3:34 PM, Roman Naumenko <ro...@sproutling.com> wrote:

> Hi,
>
> I’d like to confirm that seed nodes don't contain any data. Is that correct?
>
> Can the instances for seed nodes be a smaller size than for data nodes?
>
> Thank you
> Roman
Seed nodes as part of cluster
Hi,

I’d like to confirm that seed nodes don't contain any data. Is that correct?

Can the instances for seed nodes be a smaller size than for data nodes?

Thank you
Roman

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
Seed Private / Public Broadcast IP
Hi All,

I am currently running a multi-region setup in AWS: a single cluster across two datacenters in different regions. In order to communicate cross-region in AWS, I have my broadcast_address set to the public IPs and my listen_address set to each instance's private IP. I believe that this is the recommended setup, and everything works great.

Now I want to expand my cluster to include my company's office as a third datacenter. I have VPN tunnels established to both AWS datacenters, and I need to exclusively use private IP addresses to communicate from our office to AWS. If I connect via an AWS instance's public IP, my traffic gets NATed through my office firewall, which then cannot connect; and I cannot give the local instances public IPs.

On my new nodes, I've tried setting the seeds entry in cassandra.yaml to the private IPs of the seeds in AWS. Cassandra can initially connect to the seed nodes via the private IP, but then the seeds provide my local instance with their broadcast_address - the public IP - and this causes problems.

Is there any way to change that behavior, such that my new, local nodes ignore the broadcast_address provided to them? How else might I accomplish the above? Outside of configuring the two AWS regions to connect via private IP, which is no small task, I don't see any workaround.

Any help is most appreciated.

Thanks,
Asher
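For reference, the AWS side of the setup described above boils down to a cassandra.yaml sketch like the following (addresses are hypothetical, using documentation ranges). One avenue worth verifying: if I remember correctly, cassandra-rackdc.properties supports a prefer_local=true flag (with the GossipingPropertyFileSnitch family), which makes nodes in the same DC talk over the private listen_address instead of the broadcast address; whether that also covers the office-to-AWS path would need testing.

```yaml
# cassandra.yaml on an AWS node (addresses hypothetical):
listen_address: 10.0.1.15        # instance private IP, used within the region
broadcast_address: 203.0.113.10  # public/elastic IP, handed to remote peers
```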
Re: if seed is diff on diff nodes, any problem ?
As per my understanding, using the same 2 seed nodes per DC is the way to go. As long as you are not creating two isolated sets of nodes in your cluster, nodes may refer to each other in a way that still lets everyone learn about everyone else.

Anuj

Sent from Yahoo Mail on Android

From: Chris Mawata chris.maw...@gmail.com
Date: Sun, 26 Jul, 2015 at 6:52 am
Subject: Re: if seed is diff on diff nodes, any problem ?

I think you could end up with partitioning, where you have two cliques of nodes that gossip to each other but not to nodes outside the clique.

On Jul 25, 2015 3:29 PM, rock zhang r...@alohar.com wrote:

Hi All, I have 6 nodes; most of them are using node1 as seed, but I just found out 2 nodes are using node3 as seed, yet everything looks fine. Does that mean the seed node does not have to be the same on all nodes?

Thanks
Rock
Re: if seed is diff on diff nodes, any problem ?
Seeds are used in two different ways:

1) When joining the ring, the joining node knows NOTHING about the cluster, so it uses a seed list to discover the cluster. Once discovered, it saves the peers to disk, so on subsequent starts it will find/reconnect to other nodes beyond just the explicitly listed seeds. This is the most important usage of seeds, and the one where, if you fail to choose seeds in the right way, you can end up with two distinct clusters. In your case, since node3 was probably in the ring and behaving, when nodes connected to node3 on their first start, they probably joined the ring fine.

2) During gossip, you can assume that each node gossips with 1 seed, 1 healthy node, and 1 dead node (if one exists). If you have a small, fixed seed list (such as 2 nodes per datacenter), you can guarantee that schema/topology changes are propagated very quickly, because all nodes will talk to those seeds right away. If you have randomized seeds, schema changes may take quite some time to propagate, but normal operation (reads/writes) will continue to be perfectly fine.

In short: you're probably fine; change the seeds in cassandra.yaml so they're picked up on the next start, but it's probably not worth a rolling restart if everything's behaving as intended.

- Jeff

On 7/25/15, 12:28 PM, rock zhang r...@alohar.com wrote:

Hi All, I have 6 nodes; most of them are using node1 as seed, but I just found out 2 nodes are using node3 as seed, yet everything looks fine. Does that mean the seed node does not have to be the same on all nodes?

Thanks
Rock
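Concretely, the "small, fixed seed list" Jeff recommends means every node's cassandra.yaml carries the same entry. A config fragment sketch (the two addresses are hypothetical stand-ins for node1 and node3):

```yaml
# cassandra.yaml -- identical on every node in the cluster:
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1,10.0.0.3"
```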
if seed is diff on diff nodes, any problem ?
Hi All,

I have 6 nodes; most of them are using node1 as seed, but I just found out 2 nodes are using node3 as seed, yet everything looks fine. Does that mean the seed node does not have to be the same on all nodes?

Thanks
Rock
Re: if seed is diff on diff nodes, any problem ?
I think you could end up with partitioning, where you have two cliques of nodes that gossip to each other but not to nodes outside the clique.

On Jul 25, 2015 3:29 PM, rock zhang r...@alohar.com wrote:

Hi All, I have 6 nodes; most of them are using node1 as seed, but I just found out 2 nodes are using node3 as seed, yet everything looks fine. Does that mean the seed node does not have to be the same on all nodes?

Thanks
Rock
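Chris's clique concern can be pictured as a plain graph-reachability question: if two groups of nodes only ever learn about peers through disjoint seed lists, discovery never bridges them. A toy illustration (pure Python, not Cassandra code; the node layout is hypothetical and deliberately split, unlike Rock's cluster, where node3 had already joined via node1):

```python
# Toy model: a node discovers exactly the peers reachable through its seed.
# Two disjoint seed "islands" yield two isolated cliques.
def discover(start, peers_of):
    """Return every node reachable from `start` via the peer graph."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(peers_of.get(node, []))
    return seen

# node1 and node3 act as seeds; each non-seed points at exactly one of them.
peers_of = {
    "node1": ["node2", "node4"], "node2": ["node1"], "node4": ["node1"],
    "node3": ["node5", "node6"], "node5": ["node3"], "node6": ["node3"],
}

print(sorted(discover("node2", peers_of)))  # ['node1', 'node2', 'node4']
```

node2's clique never reaches node3's group, which is exactly the two-clusters failure mode Jeff describes when seeds are chosen badly.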
RE: Seed gossip version error
Hi Amlan,

We have the same problem with Cassandra 2.1.5. I have no hints (yet) to follow. Did you find the root of this problem?

Thanks.

Regards,
Dominique

[@@ THALES GROUP INTERNAL @@]

From: Amlan Roy [mailto:amlan@cleartrip.com]
Sent: Wednesday, July 1, 2015 12:46
To: user@cassandra.apache.org
Subject: Seed gossip version error

Hi,

I have a cluster running version 2.1.7. Two of the machines went down and they are not joining the cluster even after a restart. I see the following WARN message in system.log on all the nodes:

system.log:WARN [MessagingService-Outgoing-cassandra2.cleartrip.com/172.18.3.32] 2015-07-01 13:00:41,878 OutboundTcpConnection.java:414 - Seed gossip version is -2147483648; will not connect with that version

Please let me know if you have faced the same problem.

Regards,
Amlan
RE: Seed gossip version error
Thanks for your reply. Yes, I am sure all nodes are running the same version. On second thought, I think my gossip problem is due to intense GC activity, leaving the nodes unable even to complete a gossip handshake!

Regards,
Dominique

[@@ THALES GROUP INTERNAL @@]

From: Carlos Rolo [mailto:r...@pythian.com]
Sent: Tuesday, July 21, 2015 18:33
To: user@cassandra.apache.org
Subject: Re: Seed gossip version error

That error should only occur when you have a mismatch between the seed version and the new node version. Are you sure all your nodes are running the same version?

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Jul 21, 2015 at 5:37 PM, DE VITO Dominique <dominique.dev...@thalesgroup.com> wrote:

Hi Amlan,

We have the same problem with Cassandra 2.1.5. I have no hints (yet) to follow. Did you find the root of this problem?

Thanks.

Regards,
Dominique

From: Amlan Roy [mailto:amlan@cleartrip.com]
Sent: Wednesday, July 1, 2015 12:46
To: user@cassandra.apache.org
Subject: Seed gossip version error

Hi,

I have a cluster running version 2.1.7. Two of the machines went down and they are not joining the cluster even after a restart. I see the following WARN message in system.log on all the nodes:

system.log:WARN [MessagingService-Outgoing-cassandra2.cleartrip.com/172.18.3.32] 2015-07-01 13:00:41,878 OutboundTcpConnection.java:414 - Seed gossip version is -2147483648; will not connect with that version

Please let me know if you have faced the same problem.

Regards,
Amlan
Re: Seed gossip version error
That error should only occur when you have a mismatch between the seed version and the new node version. Are you sure all your nodes are running the same version?

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Jul 21, 2015 at 5:37 PM, DE VITO Dominique <dominique.dev...@thalesgroup.com> wrote:

Hi Amlan,

We have the same problem with Cassandra 2.1.5. I have no hints (yet) to follow. Did you find the root of this problem?

Thanks.

Regards,
Dominique

From: Amlan Roy [mailto:amlan@cleartrip.com]
Sent: Wednesday, July 1, 2015 12:46
To: user@cassandra.apache.org
Subject: Seed gossip version error

Hi,

I have a cluster running version 2.1.7. Two of the machines went down and they are not joining the cluster even after a restart. I see the following WARN message in system.log on all the nodes:

system.log:WARN [MessagingService-Outgoing-cassandra2.cleartrip.com/172.18.3.32] 2015-07-01 13:00:41,878 OutboundTcpConnection.java:414 - Seed gossip version is -2147483648; will not connect with that version

Please let me know if you have faced the same problem.

Regards,
Amlan
Seed gossip version error
Hi,

I have a cluster running version 2.1.7. Two of the machines went down and they are not joining the cluster even after a restart. I see the following WARN message in system.log on all the nodes:

system.log:WARN [MessagingService-Outgoing-cassandra2.cleartrip.com/172.18.3.32] 2015-07-01 13:00:41,878 OutboundTcpConnection.java:414 - Seed gossip version is -2147483648; will not connect with that version

Please let me know if you have faced the same problem.

Regards,
Amlan
Re: Seed Node OOM
Hi,

Is your OOM on the heap or in native memory? Since 2.1 puts a lot of things in native memory, I would say that it is almost always bad to give 6 GB out of 8 to the heap (unless you have a very small data set), since in the 2 GB remaining you have to keep bloom filters, indexes and more, plus page caching if there is any free space left.

If the OOM is in the heap, is there any sign of pressure in the logs (ParNew / CMS)? Also, did you activate GC logging to troubleshoot this, either manually or through a third-party application?

+1 with Sebastien for the hprof analysis. Rob might also be right in pointing at the memory leak.

Hope this will help.

C*heers,

Alain

2015-06-15 19:39 GMT+02:00 Robert Coli rc...@eventbrite.com:

On Sat, Jun 13, 2015 at 4:39 AM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote:

We're using Cassandra, recently migrated to 2.1.6, and we're experiencing constant OOMs in one of our clusters.

Maybe this memory leak? https://issues.apache.org/jira/browse/CASSANDRA-9549

=Rob
Re: Seed Node OOM
On Sat, Jun 13, 2015 at 4:39 AM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote: We're using Cassandra, recently migrated to 2.1.6, and we're experiencing constant OOMs in one of our clusters. Maybe this memory leak? https://issues.apache.org/jira/browse/CASSANDRA-9549 =Rob
Re: Seed Node OOM
The commitlog size is likely a red herring. In 2.0 we had 1 GB of commitlogs by default; in 2.1 we have 8 GB of commitlogs by default. This is configurable in the yaml.

Not sure what's causing the OOM. Did it generate an hprof file you can analyze?

On Jun 13, 2015 7:42 AM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote:

Sorry, I completely forgot to mention it in the original message: we have a rather large commitlog directory (it is usually rather small), 8 GB of commitlogs. Draining and flushing didn't help.

On Sat, Jun 13, 2015 at 1:39 PM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote:

Hi,

We're using Cassandra, recently migrated to 2.1.6, and we're experiencing constant OOMs in one of our clusters. It's a rather small cluster: 3 nodes, EC2 xlarge: 2 CPUs, 8 GB RAM, set up with the DataStax AMI. Configs (yaml and env.sh) are rather default: we've changed only concurrent compactions to 2 (although we tried 1, too), and tried setting HEAP and NEW to different values, ranging from 4G/200M to 6G/200M.

Write load is rather small: 200-300 small payloads (4 varchar fields as a primary key, 2 varchar fields and a couple of long/double fields), plus some larger (1-2 KB) payloads at a rate of 10-20 messages per second. We do a lot of range scans, but they are rather quick.

It kind of started overnight. Compaction is taking a long time. The other two nodes in the cluster behave absolutely normally: no hinted handoffs, normal heap sizes. There were no write bursts, no tables added, no indexes changed.

Has anyone experienced something similar? Any pointers?

--
alex p
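The yaml knob being referred to, if I remember the name correctly, is commitlog_total_space_in_mb (the default went from 1024 MB in 2.0 to 8192 MB in 2.1, matching the sizes mentioned above). A config fragment sketch with a hypothetical value:

```yaml
# cassandra.yaml -- cap total commit log disk usage; when the cap is hit,
# Cassandra flushes the oldest dirty memtables so segments can be recycled.
commitlog_total_space_in_mb: 2048
```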
Re: Seed Node OOM
Sorry, I completely forgot to mention it in the original message: we have a rather large commitlog directory (which is usually rather small), 8G of commitlogs. Draining and flushing didn't help. On Sat, Jun 13, 2015 at 1:39 PM, Oleksandr Petrov oleksandr.pet...@gmail.com wrote: Hi, We're using Cassandra, recently migrated to 2.1.6, and we're experiencing constant OOMs in one of our clusters. It's a rather small cluster: 3 nodes, EC2 xlarge: 2 CPUs, 8GB RAM, set up with the DataStax AMI. Configs (yaml and env.sh) are rather default: we've changed only concurrent compactions to 2 (although tried 1, too), and tried setting HEAP and NEW to different values, ranging from 4G/200M to 6G/200M. Write load is rather small: 200-300 small payloads (4 varchar fields as a primary key, 2 varchar fields and a couple of long/double fields), plus some larger (1-2kb) payloads at a rate of 10-20 messages per second. We do a lot of range scans, but they are rather quick. It kind of started overnight. Compaction is taking a long time. The other two nodes in the cluster behave absolutely normally: no hinted handoffs, normal heap sizes. There were no write bursts, no tables added, no indexes changed. Anyone experienced something similar? Maybe any pointers? -- alex p -- alex p
Seed Node OOM
Hi, We're using Cassandra, recently migrated to 2.1.6, and we're experiencing constant OOMs in one of our clusters. It's a rather small cluster: 3 nodes, EC2 xlarge: 2 CPUs, 8GB RAM, set up with the DataStax AMI. Configs (yaml and env.sh) are rather default: we've changed only concurrent compactions to 2 (although tried 1, too), and tried setting HEAP and NEW to different values, ranging from 4G/200M to 6G/200M. Write load is rather small: 200-300 small payloads (4 varchar fields as a primary key, 2 varchar fields and a couple of long/double fields), plus some larger (1-2kb) payloads at a rate of 10-20 messages per second. We do a lot of range scans, but they are rather quick. It kind of started overnight. Compaction is taking a long time. The other two nodes in the cluster behave absolutely normally: no hinted handoffs, normal heap sizes. There were no write bursts, no tables added, no indexes changed. Anyone experienced something similar? Maybe any pointers? -- alex p
Re: Seed Node
On Thu, Mar 19, 2015 at 3:56 PM, jean paul researche...@gmail.com wrote: Please,i have a question a bout the seed node.. as i read it is the bootstrap node, each new node joins the seed node that's it? if it leaves the cluster, how can a new node joins the rest of the group ? What a seed is within Cassandra is apparently formally undefined. To answer your question, one just picks a random other node and uses that as the seed. The only thing a new node gets from the seed is discovery of the rest of the cluster. For more detail (and some handwaving): https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032 =Rob
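For anyone wondering where this is configured: the seed list is just a set of contact points in cassandra.yaml, and any reachable node can serve there. A sketch with placeholder addresses:

```yaml
# cassandra.yaml: seeds are only contact points for discovery.
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # Comma-separated list; if one seed is down, a joining node
      # just needs any one of them to be reachable.
      - seeds: "10.0.0.11,10.0.0.12"
```

If a seed leaves the cluster permanently, updating this list on the remaining nodes (and on any future joiners) is all that is needed.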
Seed Node
Hello All, Please, I have a question about the seed node. As I read, it is the bootstrap node, and each new node joins via the seed node, is that right? If it leaves the cluster, how can a new node join the rest of the group? Thanks a lot for the answer. Best Regards.
Re: Is there harm from having all the nodes in the seed list?
On Tue, Sep 23, 2014 at 10:31 AM, Donald Smith donald.sm...@audiencescience.com wrote: Is there any harm from having all the nodes listed in the seeds list in cassandra.yaml? Yes, seed nodes cannot bootstrap. https://issues.apache.org/jira/browse/CASSANDRA-5836 See comments there for details on how this actually doesn't make any sense. The correct solution is almost certainly to have a dynamic seed provider, which is why DSE and Priam both do that. But in practice it mostly doesn't matter except in the annoying yet common CASSANDRA-5836 case. =Rob
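As a rough illustration of what "dynamic" means here: the core of a DNS-backed seed provider is a lookup that returns every A record behind a name, which is exactly what the stock SimpleSeedProvider skips. A minimal Python sketch (not Cassandra code; the function name is made up for this example):

```python
import socket

def resolve_seeds(hostname, port=7000):
    """Return every IPv4 address behind a DNS name, de-duplicated, in order.

    Illustrative only: this is the lookup a DNS-backed seed provider
    would perform at startup, and which Cassandra's default
    SimpleSeedProvider does not perform.
    """
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    seen, ips = set(), []
    for family, socktype, proto, canonname, sockaddr in infos:
        ip = sockaddr[0]  # for AF_INET, sockaddr is (ip, port)
        if ip not in seen:
            seen.add(ip)
            ips.append(ip)
    return ips
```

Against a round-robin DNS name carrying several A records this returns all of them; against localhost it normally yields just the loopback address.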
Is there harm from having all the nodes in the seed list?
Is there any harm from having all the nodes listed in the seeds list in cassandra.yaml? Donald A. Smith | Senior Software Engineer P: 425.201.3900 x 3866 C: (206) 819-5965 F: (646) 443-2333 dona...@audiencescience.com
Re: Is there harm from having all the nodes in the seed list?
Well, having all nodes in the seed list does not compromise the correctness of the gossip protocol. However, there will be extra network traffic when nodes are starting, because a starting node will ping all of the listed nodes for topology discovery, AFAIK. On Tue, Sep 23, 2014 at 7:31 PM, Donald Smith donald.sm...@audiencescience.com wrote: Is there any harm from having all the nodes listed in the seeds list in cassandra.yaml? Donald A. Smith | Senior Software Engineer P: 425.201.3900 x 3866 C: (206) 819-5965 F: (646) 443-2333 dona...@audiencescience.com
Re: Rebuilding a cassandra seed node with the same tokens and same IP address
On Fri, Aug 29, 2014 at 7:09 PM, Donald Smith donald.sm...@audiencescience.com wrote: But the node is a seed node and cassandra won't bootstrap seed nodes. Perhaps removing that node's address from the seeds list on the other nodes (and on that node) will be sufficient. That's what Replacing a Dead Seed Node http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html suggests. Perhaps I can remove the ip address from the seeds list on all nodes in the cluster, restart all the nodes, and then restart the bad node with auto_bootstrap=true. Just temporarily remove it from its own seed list and use replace_address with auto_bootstrap=true. You need replace_address to bootstrap the node into the range it already owns. The fact that you don't have to remove it from the other nodes' seed lists suggests that there is something fundamentally confused about the "seed nodes cannot bootstrap" implementation detail. https://issues.apache.org/jira/browse/CASSANDRA-5836?focusedCommentId=13727032&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13727032 =Rob
Rebuilding a cassandra seed node with the same tokens and same IP address
One of our nodes is getting an increasing number of pending compactions due, we think, to https://issues.apache.org/jira/browse/CASSANDRA-7145 , which is fixed in future version 2.0.11. (We had the same error a month ago, but at that time we were in pre-production and could just clean the disks on all the nodes and restart. Now we want to be cleverer.) To overcome the issue we figure we should just rebuild the node using the same token range, to avoid unneeded data reshuffling. So we figure we should (1) find the tokens in use on that node via nodetool ring, (2) stop cassandra on that node, (3) delete the data directory, (4) use the tokens saved in step (1) as the initial_token list, and (5) restart the node. But the node is a seed node and cassandra won't bootstrap seed nodes. Perhaps removing that node's address from the seeds list on the other nodes (and on that node) will be sufficient. That's what Replacing a Dead Seed Node http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_seed_node.html suggests. Perhaps I can remove the ip address from the seeds list on all nodes in the cluster, restart all the nodes, and then restart the bad node with auto_bootstrap=true. I want to use the same IP address, and so I don't think I can follow the instructions at http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_replace_node_t.html, because it assumes the IP address of the dead node and the new node differ. If I just start it up it will start serving traffic and read requests will fail. It wouldn't be the end of the world (the production use isn't critical yet). Should we use nodetool rebuild $LOCAL_DC? (Though I think that's mostly for adding a data center.) Should I add it back in and do nodetool repair? I'm afraid that would be too slow. Again, I don't want to REMOVE the node from the cluster: that would cause reshuffling of token ranges and data. I want to use the same token range. Any suggestions? Thanks, Don
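For what it's worth, Don's steps combined with Rob's replace_address suggestion come out to something like the following sketch (the IP is a placeholder; check the flag name against your exact Cassandra version before relying on it):

```shell
# 1. Save the node's tokens while they are still listed; the token is
#    the last column of nodetool ring output (placeholder IP 10.0.0.5):
nodetool ring | awk '/10.0.0.5/ {print $NF}' | paste -sd, -

# 2. On 10.0.0.5: stop Cassandra, wipe the data directory, and remove
#    10.0.0.5 from its own seeds entry in cassandra.yaml (other nodes'
#    seed lists can stay as they are).

# 3. Restart with the replace flag so the node re-bootstraps into the
#    range it already owned, keeping the same IP (e.g. via
#    cassandra-env.sh), leaving auto_bootstrap at its default of true:
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.5"
```

Step 1's saved tokens are only needed as a cross-check here, since replace_address re-acquires the dead node's ranges by address.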
1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete
I am trying to bring up a 6-node cluster in AWS: 3 seed nodes and 3 non-seed nodes, one of each in each availability zone, on 1.2.15, and my non-seed nodes never join the cluster. If I run 1.2.14 everything works fine. We are not using vnodes and all of the initial_token values are assigned based on the Murmur3 calculations. This isn't a data migration from a previous version; it is a completely clean cluster which I am starting from scratch. The seed nodes come up and join the cluster just fine, but none of my non-seed nodes are joining. In the logs I am seeing the following from one of my non-seed nodes. Note the repeated last lines, which never go away.

INFO 15:58:54,729 Handshaking version with /10.0.12.13
INFO 15:58:55,724 Handshaking version with /10.0.32.126
INFO 15:58:56,726 Handshaking version with /10.0.22.230
INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957). calculation took 19ms for 38 columns
INFO 15:58:57,679 Handshaking version with /10.0.22.206
INFO 15:58:57,726 Handshaking version with /10.0.22.230
INFO 15:58:58,728 Handshaking version with /10.0.12.13
INFO 15:58:59,730 Handshaking version with /10.0.12.103
INFO 15:59:06,090 Handshaking version with /10.0.32.126
INFO 15:59:23,932 JOINING: waiting for schema information to complete
INFO 15:59:24,932 JOINING: waiting for schema information to complete
INFO 15:59:25,933 JOINING: waiting for schema information to complete
INFO 15:59:26,933 JOINING: waiting for schema information to complete
INFO 15:59:27,934 JOINING: waiting for schema information to complete
INFO 15:59:28,934 JOINING: waiting for schema information to complete
INFO 15:59:29,935 JOINING: waiting for schema information to complete
INFO 15:59:30,935 JOINING: waiting for schema information to complete

So I suspect it is some sort of bootstrapping issue. I checked CHANGES.txt and noticed this for 1.2.15: "Move handling of migration event source to solve bootstrap race (CASSANDRA-6648)". I looked at 6648 and, based on some of the comments, there seems to be a lack of confidence around this problem. Has anyone else seen it? -- John Pyeatt Singlewire Software, LLC www.singlewire.com -- 608.661.1184 john.pye...@singlewire.com
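When nodes hang in "JOINING: waiting for schema information", a quick first check is whether the live nodes agree on a schema version at all. A sketch, run from any already-joined node:

```shell
# The "Schema versions:" section should list exactly one UUID shared
# by all live nodes; more than one means schema disagreement, and a
# joining node with no version has not received the schema yet.
nodetool describecluster
```

If the joining node never appears under any schema version, the bootstrap race discussed below (CASSANDRA-6685) is a likely candidate.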
Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete
If you don't have a schema, you are probably hitting this: https://issues.apache.org/jira/browse/CASSANDRA-6685 On Tue, Feb 11, 2014 at 8:22 AM, John Pyeatt john.pye...@singlewire.com wrote: I am trying to bring up a 6-node cluster in AWS: 3 seed nodes and 3 non-seed nodes, one of each in each availability zone, on 1.2.15, and my non-seed nodes never join the cluster. If I run 1.2.14 everything works fine. We are not using vnodes and all of the initial_token values are assigned based on the Murmur3 calculations. This isn't a data migration from a previous version; it is a completely clean cluster which I am starting from scratch. The seed nodes come up and join the cluster just fine, but none of my non-seed nodes are joining. In the logs I am seeing the following from one of my non-seed nodes. Note the repeated last lines, which never go away.

INFO 15:58:54,729 Handshaking version with /10.0.12.13
INFO 15:58:55,724 Handshaking version with /10.0.32.126
INFO 15:58:56,726 Handshaking version with /10.0.22.230
INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957). calculation took 19ms for 38 columns
INFO 15:58:57,679 Handshaking version with /10.0.22.206
INFO 15:58:57,726 Handshaking version with /10.0.22.230
INFO 15:58:58,728 Handshaking version with /10.0.12.13
INFO 15:58:59,730 Handshaking version with /10.0.12.103
INFO 15:59:06,090 Handshaking version with /10.0.32.126
INFO 15:59:23,932 JOINING: waiting for schema information to complete
INFO 15:59:24,932 JOINING: waiting for schema information to complete
INFO 15:59:25,933 JOINING: waiting for schema information to complete
INFO 15:59:26,933 JOINING: waiting for schema information to complete
INFO 15:59:27,934 JOINING: waiting for schema information to complete
INFO 15:59:28,934 JOINING: waiting for schema information to complete
INFO 15:59:29,935 JOINING: waiting for schema information to complete
INFO 15:59:30,935 JOINING: waiting for schema information to complete

So I suspect it is some sort of bootstrapping issue. I checked CHANGES.txt and noticed this for 1.2.15: "Move handling of migration event source to solve bootstrap race (CASSANDRA-6648)". I looked at 6648 and, based on some of the comments, there seems to be a lack of confidence around this problem. Has anyone else seen it? -- John Pyeatt Singlewire Software, LLC www.singlewire.com -- 608.661.1184 john.pye...@singlewire.com
Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete
On 02/11/2014 10:34 AM, sankalp kohli wrote: If you don't have a schema, you are probably hitting this https://issues.apache.org/jira/browse/CASSANDRA-6685 Looks like #6685 was committed to the cassandra-1.2 branch yesterday. SNAPSHOT artifacts can be grabbed for the latest build of each branch, if anyone's watching for something to be committed. I just finished setting this up in Jenkins yesterday. http://cassci.datastax.com/job/cassandra-1.2/lastSuccessfulBuild/ http://cassci.datastax.com/job/cassandra-2.0/lastSuccessfulBuild/ http://cassci.datastax.com/job/trunk/lastSuccessfulBuild/ (Insert "these are unreleased snapshots" disclaimer here.) -- Kind regards, Michael
Query on Seed node
Hi, I have a 4-node Cassandra cluster with one node marked as a seed node. When I checked the data directory of the seed node, it has the two /keyspace/columnfamily folders, but the SSTable db files are not there; the folder is empty. The db files are present on the remaining nodes. I want to know why db files are not created on the seed node, and what will happen if all the nodes in a cluster are marked as seed nodes? Aravindan Thangavelu Tata Consultancy Services Mailto: aravinda...@tcs.com Website: http://www.tcs.com Experience certainty. IT Services Business Solutions Consulting