Re: Adding new DC
Thanks Erick!

On Thu, Jul 22, 2021 at 2:29 PM Erick Ramirez wrote:
> I wouldn't use either of the steps you outlined. Neither of them is
> correct.
>
> Follow the procedure documented here instead --
> https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddDCToCluster.html
>
> Cheers!

-- Shaurya Gupta
Re: Adding new DC
I wouldn't use either of the steps you outlined. Neither of them is correct.

Follow the procedure documented here instead --
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddDCToCluster.html

Cheers!
Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hi,

We have received another report of this issue, and this time we were able to identify the bug and fix it. Today's release of the driver (version 3.16.1) contains the fix. The JIRA issue is CSHARP-943 [1].

Thanks,
João Reis

[1] https://datastax-oss.atlassian.net/browse/CSHARP-943

Gediminas Blazys wrote on Monday, 18/05/2020 at 07:16:
> Hey,
>
> Apologies for the late reply João.
>
> We really, really appreciate your interest and likewise we could not
> reproduce this issue anywhere else but in production where it occurred,
> which is slightly undesirable. As we could not afford to keep the DC in
> this state we have removed it from our cluster. I’m afraid we cannot
> provide you with the info you’ve requested.
>
> Gediminas
RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hey,

Apologies for the late reply João.

We really, really appreciate your interest and likewise we could not reproduce this issue anywhere else but in production where it occurred, which is slightly undesirable. As we could not afford to keep the DC in this state we have removed it from our cluster. I’m afraid we cannot provide you with the info you’ve requested.

Gediminas

From: João Reis
Sent: Tuesday, May 12, 2020 19:58
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect

Unfortunately I'm not able to reproduce this.

Would it be possible for you to run a couple of queries and give us the results? The queries are "SELECT * FROM system.peers" and "SELECT * FROM system_schema.keyspaces". You should run both of these queries on any node that the driver uses to set up the control connection when that error occurs. To determine the node you can look for this driver log message: "Connection established to [NODE_ADDRESS] using protocol version [VERSION]."

It should be easier to reproduce the issue with the results of those queries.

Thanks,
João Reis
Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Unfortunately I'm not able to reproduce this.

Would it be possible for you to run a couple of queries and give us the results? The queries are "SELECT * FROM system.peers" and "SELECT * FROM system_schema.keyspaces". You should run both of these queries on any node that the driver uses to set up the control connection when that error occurs. To determine the node you can look for this driver log message: "Connection established to [NODE_ADDRESS] using protocol version [VERSION]."

It should be easier to reproduce the issue with the results of those queries.

Thanks,
João Reis

Gediminas Blazys wrote on Friday, 8/05/2020 at 08:27:
> Hello,
>
> Thanks for looking into this. As far as the time for token map calculation
> goes, we are considering reducing the number of vnodes for future DCs.
> However, in the mean time we were able to deploy another DC8 (testing the
> hypothesis that this may be isolated to DC7 only) and the deployment
> worked.
> [...]
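To find which node the control connection used, the driver log can be searched for the message quoted above. The following is a minimal Python sketch, assuming the log line follows that wording exactly; the sample lines, the address 10.0.0.5 and the helper name `control_connection_nodes` are hypothetical and only there for illustration.

```python
import re

# Pattern built from the log message quoted in the thread:
# "Connection established to [NODE_ADDRESS] using protocol version [VERSION]."
CONNECTED = re.compile(
    r"Connection established to (?P<node>\S+) "
    r"using protocol version (?P<version>\S+?)\.?$"
)


def control_connection_nodes(log_lines):
    """Return (node, protocol_version) pairs found in the given log lines."""
    found = []
    for line in log_lines:
        match = CONNECTED.search(line.strip())
        if match:
            found.append((match.group("node"), match.group("version")))
    return found


# Hypothetical sample lines; real driver output may carry extra prefixes.
sample_log = [
    "ControlConnection: Updating keyspaces metadata",
    "ControlConnection: Connection established to 10.0.0.5:9042 using protocol version 4.",
]
print(control_connection_nodes(sample_log))
```

The two queries can then be run with cqlsh against the address this returns.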
RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hello,

Thanks for looking into this. As far as the time for token map calculation goes, we are considering reducing the number of vnodes for future DCs. However, in the meantime we were able to deploy another DC, DC8 (testing the hypothesis that this may be isolated to DC7 only), and the deployment worked. DC8 is part of the cluster now and is currently being rebuilt; we did not notice login issues with this expansion. So the topology now is this:

DC1 - 18 nodes - 256 vnodes - working
DC2 - 18 nodes - 256 vnodes - working
DC3 - 18 nodes - 256 vnodes - working
DC4 - 18 nodes - 256 vnodes - working
DC5 - 18 nodes - 256 vnodes - working
DC6 - 60 nodes - 256 vnodes - working
DC7 - 60 nodes - 256 vnodes - once added to replication, clients can't connect to any DC
DC8 - 60 nodes - 256 vnodes - rebuilding at the moment, including this DC into replication did not cause login issues

The major difference between DC7 and the other DCs is that in DC7 we only have two racks, while in other locations we use three. The replication factor, however, remains the same for all keyspaces – 3 for all user defined keyspaces. Maybe this is something that could cause issues with duplicates? It's theoretical, but Cassandra, having to place two replicas on the same rack, may have placed both the primary and a backup replica on the same node. Hence a duplicate...

Gediminas

From: João Reis
Sent: Thursday, May 7, 2020 19:22
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect

Hi,

I don't believe that the peers entry is responsible for that exception. Looking at the driver code, I can't even think of a scenario where that exception would be thrown... I will run some tests in the next couple of days to try and figure something out.

One thing that is certain from those log messages is that the tokenmap computation is very slow (20 seconds). With 100+ nodes and 256 vnodes per node, we should expect the token map computation to be a bit slower but 20 seconds is definitely too much. I've opened CSHARP-901 to track this. [1]

João Reis

[1] https://datastax-oss.atlassian.net/browse/CSHARP-901
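The two-rack hypothesis above can be explored with a toy model. The sketch below is a deliberately simplified rack-aware placement written for illustration only: it is not Cassandra's or the driver's code, and the node names, rack names and the function `place_replicas` are hypothetical. It walks the ring, prefers one replica per rack, and falls back to previously skipped nodes once every rack is represented.

```python
def place_replicas(ring_walk, rf):
    """Pick rf distinct nodes from a clockwise ring walk, spreading over racks first.

    ring_walk: list of (node, rack) tuples in the order tokens are visited.
    Simplified model for illustration, not Cassandra's implementation.
    """
    all_racks = {rack for _, rack in ring_walk}
    replicas, seen_racks, skipped = [], set(), []
    for node, rack in ring_walk:
        if len(replicas) >= rf:
            break
        if node in replicas or node in skipped:
            continue  # with vnodes the same node appears many times on the ring
        if rack not in seen_racks:
            replicas.append(node)
            seen_racks.add(rack)
            if seen_racks == all_racks:
                # Every rack is represented: fall back to nodes skipped earlier.
                while skipped and len(replicas) < rf:
                    replicas.append(skipped.pop(0))
        elif seen_racks == all_racks:
            replicas.append(node)
        else:
            skipped.append(node)
    return replicas


# Hypothetical two-rack DC with replication factor 3, as described for DC7.
walk = [("n1", "rack1"), ("n2", "rack1"), ("n1", "rack1"),
        ("n3", "rack2"), ("n4", "rack2")]
replicas = place_replicas(walk, rf=3)
print(replicas)
```

In this model two replicas end up sharing a rack but never a node. Whether the server or the driver behaved that way in this incident is not something the thread establishes.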
Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hi,

I don't believe that the peers entry is responsible for that exception. Looking at the driver code, I can't even think of a scenario where that exception would be thrown... I will run some tests in the next couple of days to try and figure something out.

One thing that is certain from those log messages is that the tokenmap computation is very slow (20 seconds). With 100+ nodes and 256 vnodes per node, we should expect the token map computation to be a bit slower but 20 seconds is definitely too much. I've opened CSHARP-901 to track this. [1]

João Reis

[1] https://datastax-oss.atlassian.net/browse/CSHARP-901

Gediminas Blazys wrote on Monday, 4/05/2020 at 11:13:
> Hello again,
>
> Looking into system.peers we found that some nodes contain entries about
> themselves with null values. Not sure if this could be an issue, maybe
> someone saw something similar? This state is there before including the
> funky DC into replication.
>
> peer data_center host_id preferred_ip rack release_version rpc_address schema_version tokens
> null null 192.168.104.111 null null null null null
>
> Have a wonderful day 😊
>
> Gediminas
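To put the token map timing in context, the size of the ring can be worked out from figures given elsewhere in the thread (node counts per DC, 256 vnodes per node, and the log line reporting 7 keyspaces and 210 hosts). The last figure below rests on an assumption that replicas are resolved once per token per keyspace, which the thread does not confirm.

```python
# Node counts per DC as listed in the thread when the failure occurred (DC1-DC7).
nodes_per_dc = {
    "DC1": 18, "DC2": 18, "DC3": 18, "DC4": 18, "DC5": 18,
    "DC6": 60, "DC7": 60,
}
vnodes_per_node = 256
keyspaces = 7  # from the driver log: "TokenMap for 7 keyspaces and 210 hosts"

total_nodes = sum(nodes_per_dc.values())
total_tokens = total_nodes * vnodes_per_node
# Assumption: one replica lookup per token per keyspace.
token_keyspace_pairs = total_tokens * keyspaces

print(total_nodes, total_tokens, token_keyspace_pairs)
```

That is 210 nodes and 53,760 tokens, or 376,320 token and keyspace combinations under the stated assumption.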
RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hello again,

Looking into system.peers we found that some nodes contain entries about themselves with null values. Not sure if this could be an issue, maybe someone saw something similar? This state is there before including the funky DC into replication.

peer data_center host_id preferred_ip rack release_version rpc_address schema_version tokens
null null 192.168.104.111 null null null null null

Have a wonderful day 😊

Gediminas

From: Gediminas Blazys
Sent: Monday, May 4, 2020 10:09
To: user@cassandra.apache.org
Subject: RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect

Hello,

Thanks for the reply.

Following your advice we took a look at system.local for seed nodes and compared that data with nodetool ring. Both sources contain the same tokens for these specific hosts. Will continue looking into system.peers.

[...]
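Checking many nodes for such rows by eye is tedious, so a small filter can help once the system.peers rows have been exported. This is a minimal sketch, assuming each row is available as a Python dict keyed by column name; the sample rows, the choice of required columns and the helper name `incomplete_peer_rows` are hypothetical.

```python
def incomplete_peer_rows(rows, required=("data_center", "host_id", "rack", "tokens")):
    """Return the rows in which any required column is missing or None."""
    return [row for row in rows if any(row.get(col) is None for col in required)]


# Hypothetical rows standing in for exported system.peers output.
rows = [
    {"peer": "10.0.0.1", "data_center": "DC1", "host_id": "h1",
     "rack": "r1", "tokens": ["1", "2"]},
    {"peer": "10.0.0.2", "data_center": None, "host_id": None,
     "rack": None, "tokens": None},
]
suspect = incomplete_peer_rows(rows)
print([row["peer"] for row in suspect])
```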
RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect
Hello, Thanks for the reply. Following your advice we took a look at system.local for seed nodes and compared that data with nodetool ring. Both sources contain the same tokens for these specific hosts. Will continue looking into system.peers. We have enabled more verbosity on the C# driver and this is the message that we get now: ControlConnection: 05/03/2020 14:28:42.346 +03:00 : Updating keyspaces metadata ControlConnection: 05/03/2020 14:28:42.377 +03:00 : Rebuilding token map ControlConnection: 05/03/2020 14:29:03.837 +03:00 : Finished building TokenMap for 7 keyspaces and 210 hosts. It took 19403 milliseconds. ControlConnection: 05/03/2020 14:29:03.901 +03:00 ALARMA: ENDPOINT: <>:9042 EXCEPTION: System.ArgumentException: The source argument contains duplicate keys. at System.Collections.Concurrent.ConcurrentDictionary`2.InitializeFromCollection(IEnumerable`1 collection) at System.Collections.Concurrent.ConcurrentDictionary`2..ctor(IEnumerable`1 collection, IEqualityComparer`1 comparer) at System.Collections.Concurrent.ConcurrentDictionary`2..ctor(IEnumerable`1 collection) at Cassandra.TokenMap..ctor(TokenFactory factory, IReadOnlyDictionary`2 tokenToHostsByKeyspace, List`1 ring, IReadOnlyDictionary`2 primaryReplicas, IReadOnlyDictionary`2 keyspaceTokensCache, IReadOnlyDictionary`2 datacenters, Int32 numberOfHostsWithTokens) at Cassandra.TokenMap.Build(String partitioner, ICollection`1 hosts, ICollection`1 keyspaces) at Cassandra.Metadata.d__59.MoveNext() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult() at Cassandra.Connections.ControlConnection.d__44.MoveNext() The error occurs on Cassandra.TokenMap. 
We are analyzing objects that the driver initializes during the token map creation but we are yet to find that dictionary with duplicated keys. Just to note, once this new DC is added to replication python driver is unable to establish a connection either. cqlsh though, seems to be ok. It is hard to say for sure, but for now at least, this issue seems to be pointing to Cassandra. Gediminas From: Jorge Bay Gondra Sent: Thursday, April 30, 2020 11:45 To: user@cassandra.apache.org Subject: [EXTERNAL] Re: Adding new DC results in clients failing to connect Hi, You can enable logging at driver to see what's happening under the hood: https://docs.datastax.com/en/developer/csharp-driver/3.14/faq/#how-can-i-enable-logging-in-the-driver<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.datastax.com%2Fen%2Fdeveloper%2Fcsharp-driver%2F3.14%2Ffaq%2F%23how-can-i-enable-logging-in-the-driver&data=02%7C01%7CGediminas.Blazys%40microsoft.com%7Ca2e21ad89d9543a1882f08d7ece2db21%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637238332167202413&sdata=l0gUaUmMLzva4FInwoJf%2FNJvL%2FSffYrf2JWwicONVoM%3D&reserved=0> With logging information, it should be easy to track the issue down. Can you query system.local and system.peers on a seed node / contact point to see if all the node list / token info is expected. You can compare it to nodetool ring info. Not directly related: 256 vnodes is probably more than you want. Thanks, Jorge On Thu, Apr 30, 2020 at 9:48 AM Gediminas Blazys mailto:gediminas.bla...@microsoft.com.invalid>> wrote: Hello, We have run into a very interesting issue and maybe some of you have encountered it or just have an idea where to look. We are working towards adding new dcs into our cluster, here's the current topology: DC1 - 18 nodes DC2 - 18 nodes DC3 - 18 nodes DC4 - 18 nodes DC5 - 18 nodes Recently we introduced a new DC6 (60 nodes) into our cluster. The joining and rebuilding of DC6 went smoothly, clients are using it without issue. 
This is how it looked after joining DC6:

DC1 - 18 nodes
DC2 - 18 nodes
DC3 - 18 nodes
DC4 - 18 nodes
DC5 - 18 nodes
DC6 - 60 nodes

Next we wanted to add another DC7 (also 60 nodes) making it a total of 210 nodes in the cluster, and while joining new nodes went smoothly, once we changed the replication of user defined keyspaces to include DC7, no clients were able to connect to Cassandra (regardless of which DC is being addressed). They would throw an exception that I have provided at the end of the email.

Cassandra version 3.11.4.
C# driver version 3.12.0. Also tested with 3.14.0. We use dc round robin policy and update ring metadata for connecting clients.
Amount of vnodes per node: 256

The stack trace starts with an exception 'The source argument contains duplicate keys.'. Maybe you know what kind of data is in this dictionary? What data can be duplicated here?

Clients are unable to connect until the moment we
Re: Adding new DC results in clients failing to connect
Hi, You can enable logging at driver to see what's happening under the hood: https://docs.datastax.com/en/developer/csharp-driver/3.14/faq/#how-can-i-enable-logging-in-the-driver With logging information, it should be easy to track the issue down. Can you query system.local and system.peers on a seed node / contact point to see if all the node list / token info is expected. You can compare it to nodetool ring info. Not directly related: 256 vnodes is probably more than you want. Thanks, Jorge On Thu, Apr 30, 2020 at 9:48 AM Gediminas Blazys wrote: > Hello, > > > > We have run into a very interesting issue and maybe some of you have > encountered it or just have an idea where to look. > > > > We are working towards adding new dcs into our cluster, here's the current > topology: > > DC1 - 18 nodes > > DC2 - 18 nodes > > DC3 - 18 nodes > > DC4 - 18 nodes > > DC5 - 18 nodes > > > > Recently we introduced a new DC6 (60 nodes) into our cluster. The joining > and rebuilding of DC6 went smoothly, clients are using it without issue. > This is how it looked after joining DC6: > > DC1 - 18 nodes > > DC2 - 18 nodes > > DC3 - 18 nodes > > DC4 - 18 nodes > > DC5 - 18 nodes > > DC6 - 60 nodes > > > > Next we wanted to add another DC7 (also 60 nodes) making it a total of 210 > nodes in the cluster, and while joining new nodes went smoothly, once we > changed the replication of user defined keyspaces to include DC7, no > clients were able to connect to Cassandra (regardless of which DC is being > addressed). They would throw an exception that I have provided at the end > of the email. > > > > Cassandra version 3.11.4. > > C# driver version 3.12.0. Also tested with 3.14.0. We use dc round robin > policy and update ring metadata for connecting clients. > > Amount of vnodes per node: 256 > > > > The stack trace starts with an exception 'The source argument contains > duplicate keys.'. Maybe you know what kind of data is in this dictionary? > What data can be duplicated here? 
> Clients are unable to connect until the moment we remove DC7 from replication. Once replication is adjusted to exclude DC7, clients can connect normally.
>
> Cassandra.NoHostAvailableException: All hosts tried for query failed (tried <>:9042: ArgumentException 'The source argument contains duplicate keys.') 2020/04/29 10:19:27.51410636
> at Cassandra.Connections.ControlConnection.d__39.MoveNext() 2020/04/29 10:19:27.51410636
> --- End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Connections.ControlConnection.d__36.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Tasks.TaskHelper.d__10.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.d__50.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.ClusterLifecycleManager.d__3.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.d__47`1.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/29 10:19:27.51410636
> System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() 2020/04/29 10:19:27.51410636
> System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) 2020/04/29 10:19:27.51410636
> Cassandra.Cluster.d__46.MoveNext() 2020/04/29 10:19:27.51410636
> End of stack trace from previous location where exception was thrown --- 2020/04/
Re: Adding new DC with different version of Cassandra
I agree with Jeff here: It's not recommended to do that but it should still be fine :).

Something that might be slightly safer (even though 3.11.0 is buggy as mentioned above...) could be to add a 3.11.0 cluster. Do the streaming with 3.11.0, upgrade the new DC only, switch clients over, terminate old DC.

Here I talked about it a bit: https://thelastpickle.com/blog/2019/02/26/data-center-switch.html#use-cases. There is also other information you might find useful, as this post is a runbook that details the actions needed for a Data Center switch. It looks like a good fit :).

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, Jul 1, 2019 at 21:38, Rahul Reddy wrote:

> Thanks Jeff,
>
> We want to migrate to Apache 3.11.3 once entire cluster in apache we eventually decommission datastax DC
>
> On Mon, Jul 1, 2019, 9:31 AM Jeff Jirsa wrote:
>
>> Should be fine, but you probably want to upgrade anyway, there were a few really important bugs fixed since 3.11.0
>>
>> > On Jul 1, 2019, at 3:25 AM, Rahul Reddy wrote:
>> >
>> > Hello All,
>> >
>> > We have datastax Cassandra cluster which uses 3.11.0 and we want to add new DC with apache Cassandra 3.11.3. we tried doing the same and data got streamed to new DC. Since we are able to stream data any other issues we need to consider. Is it because of same type of sstables used in both the cases it let me add new DC?
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
Re: Adding new DC with different version of Cassandra
Thanks Jeff,

We want to migrate to Apache Cassandra 3.11.3. Once the entire cluster is on Apache Cassandra, we will eventually decommission the DataStax DC.

On Mon, Jul 1, 2019, 9:31 AM Jeff Jirsa wrote:

> Should be fine, but you probably want to upgrade anyway, there were a few really important bugs fixed since 3.11.0
>
> > On Jul 1, 2019, at 3:25 AM, Rahul Reddy wrote:
> >
> > Hello All,
> >
> > We have datastax Cassandra cluster which uses 3.11.0 and we want to add new DC with apache Cassandra 3.11.3. we tried doing the same and data got streamed to new DC. Since we are able to stream data any other issues we need to consider. Is it because of same type of sstables used in both the cases it let me add new DC?
Re: Adding new DC with different version of Cassandra
Should be fine, but you probably want to upgrade anyway; there were a few really important bugs fixed since 3.11.0.

> On Jul 1, 2019, at 3:25 AM, Rahul Reddy wrote:
>
> Hello All,
>
> We have datastax Cassandra cluster which uses 3.11.0 and we want to add new DC with apache Cassandra 3.11.3. we tried doing the same and data got streamed to new DC. Since we are able to stream data any other issues we need to consider. Is it because of same type of sstables used in both the cases it let me add new DC?
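The two versions discussed here, 3.11.0 and 3.11.3, differ only in their patch component. As a rough illustration, the Python sketch below parses version strings and reports whether two nodes are on the same major.minor release line. Treating a shared release line as a signal that mixing versions is lower risk is an assumption made for this example, not a rule stated in the thread, and both function names are hypothetical.

```python
def parse_version(version):
    """Split a dotted version string such as '3.11.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def same_release_line(first, second):
    """Return True when both versions share their major and minor components."""
    return parse_version(first)[:2] == parse_version(second)[:2]


if __name__ == "__main__":
    # The pair from this thread, followed by a made-up pair for contrast.
    for pair in [("3.11.0", "3.11.3"), ("2.1.20", "3.0.16")]:
        print(pair, "same release line:", same_release_line(*pair))
```

A check like this could serve as a guard in a provisioning script before a new node is configured to join, but it is no substitute for the upgrade notes of the specific releases involved.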
RE: [EXTERNAL] RE: Adding new DC?
Kunal,

Also to check: You should use the same list of seeds, probably two in each data center if you will have five nodes in each, in all the yaml files. All the seed node addresses from all the data centers should be listed in each yaml file where it says “-seeds:”. I’m not sure from your previous replies if you’re doing that.

Let us know your results.

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com]
Sent: Monday, March 12, 2018 7:14 PM
To: 'user@cassandra.apache.org'
Subject: RE: [EXTERNAL] RE: Adding new DC?

Kunal,

Sorry for asking you things you already answered. You provided a lot of good information and you know what you’re doing. It’s going to be something really simple to figure out. While I read through the thread more closely, I’m guessing we are right on top of it, so could I ask you:

Please read through https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configMultiNetworks.html as it probably has the answer.

One of the things it says specifically is:

Additional cassandra.yaml configuration for non-EC2 implementations

If multiple network interfaces are used in a non-EC2 implementation, enable the listen_on_broadcast_address option.

listen_on_broadcast_address: true

In non-EC2 environments, the public address to private address routing is not automatically enabled. Enabling listen_on_broadcast_address allows DSE to listen on both listen_address and broadcast_address with two network interfaces.

Please consider that especially, and be sure everything else it mentions is done.

You said you changed the broadcast_rpc_address in one of the instances in GCE and saw a change. Did you update the other nodes in GCE? And then restart each one (in a rolling manner)?

Did you restart each node in each datacenter, starting with the seed nodes, since you last updated a yaml file?

Could the client in your application be causing the problem?
Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:43 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? Yes, that's correct. The customer wants us to migrate the cassandra setup in their AWS account. Thanks, Kunal On 13 March 2018 at 04:56, Kenneth Brotman wrote: I didn’t understand something. Are you saying you are using one data center on Google and one on Amazon? Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:24 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? On 13 March 2018 at 03:28, Kenneth Brotman wrote: You can’t migrate and upgrade at the same time perhaps but you could do one and then the other so as to end up on new version. I’m guessing it’s an error in the yaml file or a port not open. Is there any good reason for a production cluster to still be on version 2.1x? I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.120 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. 
Kunal Kenneth Brotman From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] Sent: Monday, March 12, 2018 11:36 AM To: user@cassandra.apache.org Subject: RE: [EXTERNAL] RE: Adding new DC? You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent:
RE: [EXTERNAL] RE: Adding new DC?
Kunal,

While we are looking into all this, I feel compelled to ask you to check your security configurations now that you are using public addresses to communicate inter-node across data centers. Are you sure you are using best practices?

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com]
Sent: Monday, March 12, 2018 7:14 PM
To: 'user@cassandra.apache.org'
Subject: RE: [EXTERNAL] RE: Adding new DC?

Kunal,

Sorry for asking you things you already answered. You provided a lot of good information and you know what you’re doing. It’s going to be something really simple to figure out. While I read through the thread more closely, I’m guessing we are right on top of it, so could I ask you:

Please read through https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configMultiNetworks.html as it probably has the answer.

One of the things it says specifically is:

Additional cassandra.yaml configuration for non-EC2 implementations

If multiple network interfaces are used in a non-EC2 implementation, enable the listen_on_broadcast_address option.

listen_on_broadcast_address: true

In non-EC2 environments, the public address to private address routing is not automatically enabled. Enabling listen_on_broadcast_address allows DSE to listen on both listen_address and broadcast_address with two network interfaces.

Please consider that especially, and be sure everything else it mentions is done.

You said you changed the broadcast_rpc_address in one of the instances in GCE and saw a change. Did you update the other nodes in GCE? And then restart each one (in a rolling manner)?

Did you restart each node in each datacenter, starting with the seed nodes, since you last updated a yaml file?

Could the client in your application be causing the problem?

Kenneth Brotman

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Monday, March 12, 2018 4:43 PM
To: user@cassandra.apache.org
Cc: Nikhil Soman
Subject: Re: [EXTERNAL] RE: Adding new DC?
Yes, that's correct. The customer wants us to migrate the cassandra setup in their AWS account. Thanks, Kunal On 13 March 2018 at 04:56, Kenneth Brotman wrote: I didn’t understand something. Are you saying you are using one data center on Google and one on Amazon? Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:24 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? On 13 March 2018 at 03:28, Kenneth Brotman wrote: You can’t migrate and upgrade at the same time perhaps but you could do one and then the other so as to end up on new version. I’m guessing it’s an error in the yaml file or a port not open. Is there any good reason for a production cluster to still be on version 2.1x? I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.120 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. Kunal Kenneth Brotman From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] Sent: Monday, March 12, 2018 11:36 AM To: user@cassandra.apache.org Subject: RE: [EXTERNAL] RE: Adding new DC? You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. 
As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenne
RE: [EXTERNAL] RE: Adding new DC?
Kunal,

Sorry for asking you things you already answered. You provided a lot of good information and you know what you’re doing. It’s going to be something really simple to figure out. While I read through the thread more closely, I’m guessing we are right on top of it, so could I ask you:

Please read through https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configMultiNetworks.html as it probably has the answer.

One of the things it says specifically is:

Additional cassandra.yaml configuration for non-EC2 implementations

If multiple network interfaces are used in a non-EC2 implementation, enable the listen_on_broadcast_address option.

listen_on_broadcast_address: true

In non-EC2 environments, the public address to private address routing is not automatically enabled. Enabling listen_on_broadcast_address allows DSE to listen on both listen_address and broadcast_address with two network interfaces.

Please consider that especially, and be sure everything else it mentions is done.

You said you changed the broadcast_rpc_address in one of the instances in GCE and saw a change. Did you update the other nodes in GCE? And then restart each one (in a rolling manner)?

Did you restart each node in each datacenter, starting with the seed nodes, since you last updated a yaml file?

Could the client in your application be causing the problem?

Kenneth Brotman

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Monday, March 12, 2018 4:43 PM
To: user@cassandra.apache.org
Cc: Nikhil Soman
Subject: Re: [EXTERNAL] RE: Adding new DC?

Yes, that's correct. The customer wants us to migrate the cassandra setup in their AWS account.

Thanks,
Kunal

On 13 March 2018 at 04:56, Kenneth Brotman wrote:

I didn’t understand something. Are you saying you are using one data center on Google and one on Amazon?
Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:24 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? On 13 March 2018 at 03:28, Kenneth Brotman wrote: You can’t migrate and upgrade at the same time perhaps but you could do one and then the other so as to end up on new version. I’m guessing it’s an error in the yaml file or a port not open. Is there any good reason for a production cluster to still be on version 2.1x? I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.120 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. Kunal Kenneth Brotman From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] Sent: Monday, March 12, 2018 11:36 AM To: user@cassandra.apache.org Subject: RE: [EXTERNAL] RE: Adding new DC? You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? 
Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenneth, Replies inline below. On 12-Mar-2018 3:40 AM, "Kenneth Brotman" wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wonder why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is p
Re: [EXTERNAL] RE: Adding new DC?
Yes, that's correct. The customer wants us to migrate the cassandra setup in their AWS account. Thanks, Kunal On 13 March 2018 at 04:56, Kenneth Brotman wrote: > I didn’t understand something. Are you saying you are using one data > center on Google and one on Amazon? > > > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Monday, March 12, 2018 4:24 PM > *To:* user@cassandra.apache.org > *Cc:* Nikhil Soman > *Subject:* Re: [EXTERNAL] RE: Adding new DC? > > > > > > On 13 March 2018 at 03:28, Kenneth Brotman > wrote: > > You can’t migrate and upgrade at the same time perhaps but you could do > one and then the other so as to end up on new version. I’m guessing it’s > an error in the yaml file or a port not open. Is there any good reason for > a production cluster to still be on version 2.1x? > > > > I'm not trying to migrate AND upgrade at the same time. However, the apt > repo shows only 2.120 as the available version. > > This is the output from the new node in AWS > > > > ubuntu@ip-10-0-43-213:*~*$ apt-cache policy cassandra > cassandra: > Installed: 2.1.20 > Candidate: 2.1.20 > Version table: > *** 2.1.20 500 >500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 > Packages >100 /var/lib/dpkg/status > > Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node > into GCE nodes. > > As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE > firewall for the public IP of the AWS instance. > > > > I mentioned earlier - there are some differences in the column types - for > example, date (>= 2.2) vs. timestamp (2.1.x) > > The application has not been updated yet. > > Hence sticking to 2.1.x for now. > > > > And, so far, 2.1.x has been serving the purpose. > > Kunal > > > > > > Kenneth Brotman > > > > *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com] > *Sent:* Monday, March 12, 2018 11:36 AM > *To:* user@cassandra.apache.org > *Subject:* RE: [EXTERNAL] RE: Adding new DC? 
> > > > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. > > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com > ] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. > I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? > > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 2:32 PM > *To:* user@cassandra.apache.org > *Subject:* Adding new DC? 
> Hi all,
>
> We currently have a cluster in GCE for one of the customers.
> They want it to be migrated to AWS.
>
> I have set up one node in AWS to join the cluster by following:
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 04:54, Kenneth Brotman wrote:

> Kunal,
>
> Please provide the following setting from the yaml files you are using:
>
> seeds:

In GCE:
seeds: "10.142.14.27"

In AWS (new node being added):
seeds: "35.196.96.247,35.227.127.245,35.196.241.232"
(these are the public IP addresses of 3 nodes from GCE)

I have verified that I am able to do cqlsh from the AWS instance to all 3 ip addresses.

> listen_address:

We use the listen_interface setting instead of listen_address.

In GCE:
listen_interface: eth0 (running ubuntu 14.04 LTS)

In AWS:
listen_interface: ens3 (running ubuntu 16.04 LTS)

> broadcast_address:

I tried setting broadcast_address to one instance in GCE:
broadcast_address: 35.196.96.247

In AWS:
broadcast_address: 13.127.89.251 (this is the public/elastic IP of the node in AWS)

> rpc_address:

Like listen_address, we use rpc_interface.

In GCE:
rpc_interface: eth0

In AWS:
rpc_interface: ens3

> endpoint_snitch:

In both setups, we currently use GossipingPropertyFileSnitch.
The cassandra-rackdc.properties files from both setups:

GCE:
dc=DC1
rack=RAC1

AWS:
dc=DC2
rack=RAC1

> auto_bootstrap:

When the google cloud instances started up, we hadn't set this explicitly - so, they started off with default value (auto_bootstrap: true)
However, as outlined in the datastax doc for adding new dc, I had added 'auto_bootstrap: false' to the google cloud instances (not restarted the service as per the doc).

In the AWS instance, I had added 'auto_bootstrap: false' - the doc says we need to do "nodetool rebuild" and hence no automatic bootstrapping.
But, haven't gotten to that step yet.

Thanks,
Kunal

> Kenneth Brotman
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Monday, March 12, 2018 4:13 PM
> *To:* user@cassandra.apache.org
> *Cc:* Nikhil Soman
> *Subject:* Re: [EXTERNAL] RE: Adding new DC?
> > > > > > On 13 March 2018 at 00:06, Durity, Sean R > wrote: > > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > > > > I'm not trying to upgrade as of now - first priority is the migration. > > We can look at version upgrade later on. > > > > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. > > > > I just tried setting the broadcast_address in one of the instances in GCE > to its public IP and restarted the service. > > However, it now shows all other nodes (in GCE) as DN in nodetool status > output and the other nodes also report this node as DN with its > internal/private IP address. > > > > I also tried setting the broadcast_rpc_address to the internal/private IP > address - still the same. > > > > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > > > > No. of nodes: 5 > > RF: 3 > > Data size (as reported by the load factor in nodetool status output): > ~30GB per node > > > > Thanks, > Kunal > > > > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. 
> I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? > > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk. > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > K
RE: [EXTERNAL] RE: Adding new DC?
I didn’t understand something. Are you saying you are using one data center on Google and one on Amazon? Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:24 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? On 13 March 2018 at 03:28, Kenneth Brotman wrote: You can’t migrate and upgrade at the same time perhaps but you could do one and then the other so as to end up on new version. I’m guessing it’s an error in the yaml file or a port not open. Is there any good reason for a production cluster to still be on version 2.1x? I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.1.20 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. Kunal Kenneth Brotman From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] Sent: Monday, March 12, 2018 11:36 AM To: user@cassandra.apache.org Subject: RE: [EXTERNAL] RE: Adding new DC? You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. 
Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenneth, Replies inline below. On 12-Mar-2018 3:40 AM, "Kenneth Brotman" wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wonder why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is production setup - so, cannot take risk Are both data centers in the same region on AWS? Can you provide yaml file for us to see? No, they are in different regions - GCE setup is in us-east while AWS setup is in Asia-south (Mumbai) Thanks, Kunal Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 2:32 PM To: user@cassandra.apache.org Subject: Adding new DC? Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. 
I have set up one node in AWS to join into the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. When I start cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides doesn't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show old DC (dc=D
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 03:28, Kenneth Brotman wrote: > You can’t migrate and upgrade at the same time perhaps but you could do > one and then the other so as to end up on new version. I’m guessing it’s > an error in the yaml file or a port not open. Is there any good reason for > a production cluster to still be on version 2.1x? > I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.1.20 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. Kunal > > Kenneth Brotman > > > > *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com] > *Sent:* Monday, March 12, 2018 11:36 AM > *To:* user@cassandra.apache.org > *Subject:* RE: [EXTERNAL] RE: Adding new DC? > > > > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. 
> > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com > ] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. > I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? > > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk. > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 2:32 PM > *To:* user@cassandra.apache.org > *Subject:* Adding new DC? > > > > Hi all, > > > > We currently have a cluster in GCE for one of the customers. > > They want it to be migrated to AWS. 
> > > > I have setup one node in AWS to join into the cluster by following: > > https://docs.datastax.com/en/cassandra/2.1/cassandra/ > operations/ops_add_dc_to_cluster_t.html > <https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.datastax.com_en_cassandra_2.1_cassandra_operations_ops-5Fadd-5Fdc-5Fto-5Fcluster-5Ft.html&d=DwMFaQ&c=MtgQEAMQGqekjTjiAhkudQ&r=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ&m=4s2PNt4_Ty1RVe_0dQ4sn-jQTjmz-Wmxnf2OS4URoYo&s=pfA6Jkn2UwG7AISlAM3OJ1OzQpghd_nVJj-KnYLCvBk&e=> > > > > Will add more nodes once the first one joins successfully. > > > > The node in AWS has an elastic IP - which is white-listed for ports > 7000-7001, 7199, 9042 in GCE firewall. > > > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. > > > > When I start cassandra service on the AWS instance, I see the version > handshake msgs in the logs trying to connect to the public IPs of the GCE > nodes: > > OutboundTcpConnection.
RE: [EXTERNAL] RE: Adding new DC?
Kunal, Please provide the following setting from the yaml files you are using: seeds: listen_address: broadcast_address: rpc_address: endpoint_snitch: auto_bootstrap: Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Monday, March 12, 2018 4:13 PM To: user@cassandra.apache.org Cc: Nikhil Soman Subject: Re: [EXTERNAL] RE: Adding new DC? On 13 March 2018 at 00:06, Durity, Sean R wrote: You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. I'm not trying to upgrade as of now - first priority is the migration. We can look at version upgrade later on. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. I just tried setting the broadcast_address in one of the instances in GCE to its public IP and restarted the service. However, it now shows all other nodes (in GCE) as DN in nodetool status output and the other nodes also report this node as DN with its internal/private IP address. I also tried setting the broadcast_rpc_address to the internal/private IP address - still the same. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? No. of nodes: 5 RF: 3 Data size (as reported by the load factor in nodetool status output): ~30GB per node Thanks, Kunal Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenneth, Replies inline below. 
On 12-Mar-2018 3:40 AM, "Kenneth Brotman" wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wondering why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is production setup - so, cannot take risk. Are both data centers in the same region on AWS? Can you provide yaml file for us to see? No, they are in different regions - GCE setup is in us-east while AWS setup is in Asia-south (Mumbai) Thanks, Kunal Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 2:32 PM To: user@cassandra.apache.org Subject: Adding new DC? Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. I have set up one node in AWS to join into the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. 
When I start cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides don't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show old DC (dc=DC1). In cassandra.yaml file, I'm only using listen_interface and rpc_interface settings - no explicit IP addresses used - so, ends up using the internal private IP ranges. Do I need to explicitly add the broadcast_address? for both side? Would that require restarting of cassandra service on GCE side? Or is it possible to change that setting on-the-fly without a restart? I would prefer a non-restart option. PS: The cassandra version running in GCE is 2.1.18 while the new node setup in AWS is running 2.1.20 - just in case if that's relevant Thanks, Kunal _
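The settings requested at the top of this message could be pulled out of a cassandra.yaml with a short script before posting them. What follows is only a minimal sketch: `extract_settings` and `REQUESTED` are hypothetical names, and the line-based matching is a simplification that assumes each setting sits on a single line (a real YAML parser would be more robust).

```python
import re

# The six settings requested in the message above.
REQUESTED = ("seeds", "listen_address", "broadcast_address",
             "rpc_address", "endpoint_snitch", "auto_bootstrap")

# Matches "key: value", optionally preceded by a list dash (as in "- seeds:").
_LINE = re.compile(r"^\s*(?:-\s*)?([A-Za-z_]+)\s*:\s*(.*?)\s*$")


def extract_settings(yaml_text):
    """Return {setting: value or None} for the requested settings.

    A line-based sketch, not a YAML parser: commented-out lines are
    ignored, and a setting that is absent or blank is reported as None.
    """
    found = dict.fromkeys(REQUESTED)
    for line in yaml_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = _LINE.match(line)
        if match and match.group(1) in found:
            # Drop a trailing inline comment and surrounding quotes.
            value = match.group(2).split(" #", 1)[0].strip().strip("\"'")
            found[match.group(1)] = value or None
    return found
```

A setting reported as None is either commented out or left blank. For a node configured only through listen_interface and rpc_interface, as described in this thread, listen_address, broadcast_address and rpc_address would all come back as None.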
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 00:06, Durity, Sean R wrote: > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > I'm not trying to upgrade as of now - first priority is the migration. We can look at version upgrade later on. > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. > I just tried setting the broadcast_address in one of the instances in GCE to its public IP and restarted the service. However, it now shows all other nodes (in GCE) as DN in nodetool status output and the other nodes also report this node as DN with its internal/private IP address. I also tried setting the broadcast_rpc_address to the internal/private IP address - still the same. > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > No. of nodes: 5 RF: 3 Data size (as reported by the load factor in nodetool status output): ~30GB per node Thanks, Kunal > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. > I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? 
> > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk. > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 2:32 PM > *To:* user@cassandra.apache.org > *Subject:* Adding new DC? > > > > Hi all, > > > > We currently have a cluster in GCE for one of the customers. > > They want it to be migrated to AWS. > > > > I have setup one node in AWS to join into the cluster by following: > > https://docs.datastax.com/en/cassandra/2.1/cassandra/ > operations/ops_add_dc_to_cluster_t.html > <https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.datastax.com_en_cassandra_2.1_cassandra_operations_ops-5Fadd-5Fdc-5Fto-5Fcluster-5Ft.html&d=DwMFaQ&c=MtgQEAMQGqekjTjiAhkudQ&r=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ&m=4s2PNt4_Ty1RVe_0dQ4sn-jQTjmz-Wmxnf2OS4URoYo&s=pfA6Jkn2UwG7AISlAM3OJ1OzQpghd_nVJj-KnYLCvBk&e=> > > > > Will add more nodes once the first one joins successfully. > > > > The node in AWS has an elastic IP - which is white-listed for ports > 7000-7001, 7199, 9042 in GCE firewall. > > > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. > > > > When I start cassandra service on the AWS instance, I see the version > handshake msgs in the logs trying to connect to the public IPs of the GCE > nodes: > > OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx > > However, nodetool status output on both sides don't show the other side at > all. 
That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS > setup doesn't show old DC (dc=DC1). > > > > In cassandra.yaml file, I'm only using listen_interface and rpc_interface > settings - no explicit IP addresses used - so, ends up using the internal > private IP ranges. > > > > Do I need to explicitly add the broadcast_address? for both side? > > Would that require restarting of cassandra service on GCE side? Or is it > possible to change that setting on-the-fly without a restart? > > > > I would prefer a non-restart option. > > > > PS: The cassandra ve
RE: [EXTERNAL] RE: Adding new DC?
You can’t migrate and upgrade at the same time perhaps but you could do one and then the other so as to end up on new version. I’m guessing it’s an error in the yaml file or a port not open. Is there any good reason for a production cluster to still be on version 2.1x? Kenneth Brotman From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] Sent: Monday, March 12, 2018 11:36 AM To: user@cassandra.apache.org Subject: RE: [EXTERNAL] RE: Adding new DC? You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data). The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenneth, Replies inline below. On 12-Mar-2018 3:40 AM, "Kenneth Brotman" wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wonder why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is production setup - so, cannot take risk. Are both data centers in the same region on AWS? Can you provide yaml file for us to see? 
No, they are in different regions - GCE setup is in us-east while AWS setup is in Asia-south (Mumbai) Thanks, Kunal Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 2:32 PM To: user@cassandra.apache.org Subject: Adding new DC? Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. I have setup one node in AWS to join into the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html <https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.datastax.com_en_cassandra_2.1_cassandra_operations_ops-5Fadd-5Fdc-5Fto-5Fcluster-5Ft.html&d=DwMFaQ&c=MtgQEAMQGqekjTjiAhkudQ&r=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ&m=4s2PNt4_Ty1RVe_0dQ4sn-jQTjmz-Wmxnf2OS4URoYo&s=pfA6Jkn2UwG7AISlAM3OJ1OzQpghd_nVJj-KnYLCvBk&e=> Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. When I start cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides don't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show old DC (dc=DC1). In cassandra.yaml file, I'm only using listen_interface and rpc_interface settings - no explicit IP addresses used - so, ends up using the internal private IP ranges. Do I need to explicitly add the broadcast_address? for both side? Would that require restarting of cassandra service on GCE side? Or is it possible to change that setting on-the-fly without a restart? I would prefer a non-restart option. 
PS: The cassandra version running in GCE is 2.1.18 while the new node setup in AWS is running 2.1.20 - just in case if that's relevant Thanks, Kunal _ The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attach
RE: [EXTERNAL] RE: Adding new DC?
You cannot migrate and upgrade at the same time across major versions. Streaming is (usually) not compatible between versions. As to the migration question, I would expect that you may need to put the external-facing ip addresses in several places in the cassandra.yaml file. And, yes, it would require a restart. Why is a non-restart more desirable? Most Cassandra changes require a restart, but you can do a rolling restart and not impact your application. This is fairly normal admin work and can/should be automated. How large is the cluster to migrate (# of nodes and size of data)? The preferred method might depend on how much data needs to move. Is any application outage acceptable? Sean Durity lord of the (C*) rings (Staff Systems Engineer – Cassandra) From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 10:20 PM To: user@cassandra.apache.org Subject: [EXTERNAL] RE: Adding new DC? Hi Kenneth, Replies inline below. On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wondering why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is production setup - so, cannot take risk. Are both data centers in the same region on AWS? Can you provide yaml file for us to see? No, they are in different regions - GCE setup is in us-east while AWS setup is in Asia-south (Mumbai) Thanks, Kunal Kenneth Brotman From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] Sent: Sunday, March 11, 2018 2:32 PM To: user@cassandra.apache.org Subject: Adding new DC? Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. 
I have setup one node in AWS to join into the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. When I start cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides don't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show old DC (dc=DC1). In cassandra.yaml file, I'm only using listen_interface and rpc_interface settings - no explicit IP addresses used - so, ends up using the internal private IP ranges. Do I need to explicitly add the broadcast_address? for both side? Would that require restarting of cassandra service on GCE side? Or is it possible to change that setting on-the-fly without a restart? I would prefer a non-restart option. PS: The cassandra version running in GCE is 2.1.18 while the new node setup in AWS is running 2.1.20 - just in case if that's relevant Thanks, Kunal
Re: Adding new DC?
How did you distribute your seed nodes across whole cluster? -- Rahul Singh rahul.si...@anant.us Anant Corporation On Mar 12, 2018, 5:12 AM -0400, Oleksandr Shulgin , wrote: > > On Sun, Mar 11, 2018 at 10:31 PM, Kunal Gangakhedkar > > wrote: > > > Hi all, > > > > > > We currently have a cluster in GCE for one of the customers. > > > They want it to be migrated to AWS. > > > > > > I have setup one node in AWS to join into the cluster by following: > > > https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html > > > > > > Will add more nodes once the first one joins successfully. > > > > > > The node in AWS has an elastic IP - which is white-listed for ports > > > 7000-7001, 7199, 9042 in GCE firewall. > > > > > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > > > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. > > > > > > When I start cassandra service on the AWS instance, I see the version > > > handshake msgs in the logs trying to connect to the public IPs of the GCE > > > nodes: > > > OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx > > > > > > However, nodetool status output on both sides don't show the other side > > > at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the > > > AWS setup doesn't show old DC (dc=DC1). > > > > > > In cassandra.yaml file, I'm only using listen_interface and rpc_interface > > > settings - no explicit IP addresses used - so, ends up using the internal > > > private IP ranges. > > > > > > Do I need to explicitly add the broadcast_address? > > > > On the AWS side you could use EC2MultiRegionSnitch: it will assign the > > appropriate address (Elastic IP) to this, as well as set DC and rack from > > the EC2 Availability Zone. > > > > > for both side? > > > > I would expect that you have to specify proper broadcast_address on the GCE > > side as well. > > > > > Would that require restarting of cassandra service on GCE side? 
Or is it > > > possible to change that setting on-the-fly without a restart? > > > > A restart is required AFAIK. > > > > -- > > Alex > >
Re: Adding new DC?
On Sun, Mar 11, 2018 at 10:31 PM, Kunal Gangakhedkar < kgangakhed...@gmail.com> wrote: > Hi all, > > We currently have a cluster in GCE for one of the customers. > They want it to be migrated to AWS. > > I have setup one node in AWS to join into the cluster by following: > https://docs.datastax.com/en/cassandra/2.1/cassandra/ > operations/ops_add_dc_to_cluster_t.html > > Will add more nodes once the first one joins successfully. > > The node in AWS has an elastic IP - which is white-listed for ports > 7000-7001, 7199, 9042 in GCE firewall. > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. > > When I start cassandra service on the AWS instance, I see the version > handshake msgs in the logs trying to connect to the public IPs of the GCE > nodes: > OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx > > However, nodetool status output on both sides don't show the other side at > all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS > setup doesn't show old DC (dc=DC1). > > In cassandra.yaml file, I'm only using listen_interface and rpc_interface > settings - no explicit IP addresses used - so, ends up using the internal > private IP ranges. > > Do I need to explicitly add the broadcast_address? > On the AWS side you could use EC2MultiRegionSnitch: it will assign the appropriate address (Elastic IP) to this, as well as set DC and rack from the EC2 Availability Zone. > for both side? > I would expect that you have to specify proper broadcast_address on the GCE side as well. > Would that require restarting of cassandra service on GCE side? Or is it > possible to change that setting on-the-fly without a restart? > A restart is required AFAIK. -- Alex
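The point about broadcast_address can be illustrated with a small check. When broadcast_address is unset, Cassandra advertises listen_address to its peers, so a node bound only to an internal interface advertises a private address that the other cloud cannot route to. The sketch below is an assumption-laden illustration: the function names are hypothetical, 10.0.43.213 is taken from the AWS hostname quoted elsewhere in this thread, and 8.8.8.8 merely stands in for some public IP.

```python
import ipaddress


def effective_broadcast(listen_address, broadcast_address=None):
    """Address a node advertises to its peers.

    Falls back to listen_address when broadcast_address is unset,
    mirroring Cassandra's documented default.
    """
    return broadcast_address or listen_address


def advertises_public_address(listen_address, broadcast_address=None):
    """True if the advertised address is outside the private ranges."""
    advertised = ipaddress.ip_address(
        effective_broadcast(listen_address, broadcast_address))
    return not advertised.is_private


# Only an internal address configured: peers in the other DC cannot reach it.
print(advertises_public_address("10.0.43.213"))
# Same node with a public broadcast_address set (placeholder IP).
print(advertises_public_address("10.0.43.213", "8.8.8.8"))
```

A public advertised address is necessary but not sufficient: the firewall rules on both sides still have to allow the inter-node ports, and the change only takes effect after the restart mentioned above.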
RE: Adding new DC?
Hi Kenneth,

Replies inline below.

On 12-Mar-2018 3:40 AM, "Kenneth Brotman" wrote:

> Hi Kunal,
>
> That version of Cassandra is too far before me so I’ll let others answer.
> I was wondering why you wouldn’t want to end up on 3.0.x if you’re going
> through all the trouble of migrating anyway?

Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is a production setup, so we cannot take the risk.

> Are both data centers in the same region on AWS? Can you provide yaml
> file for us to see?

No, they are in different regions - the GCE setup is in us-east while the AWS setup is in Asia-south (Mumbai).

Thanks,
Kunal

> Kenneth Brotman
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 2:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Adding new DC?
>
> Hi all,
>
> We currently have a cluster in GCE for one of the customers.
> They want it to be migrated to AWS.
>
> I have set up one node in AWS to join the cluster by following:
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
>
> Will add more nodes once the first one joins successfully.
>
> The node in AWS has an elastic IP - which is white-listed for ports
> 7000-7001, 7199, 9042 in the GCE firewall.
>
> The snitch is set to GossipingPropertyFileSnitch. The GCE setup has
> dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2.
>
> When I start the cassandra service on the AWS instance, I see the version
> handshake msgs in the logs trying to connect to the public IPs of the GCE
> nodes:
>     OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx
>
> However, nodetool status output on both sides doesn't show the other side at
> all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS
> setup doesn't show the old DC (dc=DC1).
>
> In the cassandra.yaml file, I'm only using the listen_interface and rpc_interface
> settings - no explicit IP addresses used - so it ends up using the internal
> private IP ranges.
>
> Do I need to explicitly add the broadcast_address? For both sides?
> Would that require restarting the cassandra service on the GCE side? Or is it
> possible to change that setting on-the-fly without a restart?
>
> I would prefer a non-restart option.
>
> PS: The cassandra version running in GCE is 2.1.18 while the new node
> set up in AWS is running 2.1.20 - just in case that's relevant.
>
> Thanks,
> Kunal
RE: Adding new DC?
Hi Kunal,

That version of Cassandra is too far before me so I’ll let others answer. I was wondering why you wouldn’t want to end up on 3.0.x if you’re going through all the trouble of migrating anyway?

Are both data centers in the same region on AWS? Can you provide yaml file for us to see?

Kenneth Brotman

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Sunday, March 11, 2018 2:32 PM
To: user@cassandra.apache.org
Subject: Adding new DC?

Hi all,

We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS.

I have set up one node in AWS to join the cluster by following:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

Will add more nodes once the first one joins successfully.

The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in the GCE firewall.

The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2.

When I start the cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes:
    OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx

However, nodetool status output on both sides doesn't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show the old DC (dc=DC1).

In the cassandra.yaml file, I'm only using the listen_interface and rpc_interface settings - no explicit IP addresses used - so it ends up using the internal private IP ranges.

Do I need to explicitly add the broadcast_address? For both sides?
Would that require restarting the cassandra service on the GCE side? Or is it possible to change that setting on-the-fly without a restart?

I would prefer a non-restart option.

PS: The cassandra version running in GCE is 2.1.18 while the new node set up in AWS is running 2.1.20 - just in case that's relevant.

Thanks,
Kunal