Hi Alain,

Thanks a lot for helping out!
Some of the basic keyspace / cluster info you requested:

# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE system_distributed.repair_history (
    keyspace_name text,
    columnfamily_name text,
    id timeuuid,
    coordinator inet,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    parent_id timeuuid,
    participants set<inet>,
    range_begin text,
    range_end text,
    started_at timestamp,
    status text,
    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.parent_repair_history (
    parent_id timeuuid PRIMARY KEY,
    columnfamily_names set<text>,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    keyspace_name text,
    requested_ranges set<text>,
    started_at timestamp,
    successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.repair_history (
    keyspace_name text,
    columnfamily_name text,
    id timeuuid,
    coordinator inet,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    parent_id timeuuid,
    participants set<inet>,
    range_begin text,
    range_end text,
    started_at timestamp,
    status text,
    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.parent_repair_history (
    parent_id timeuuid PRIMARY KEY,
    columnfamily_names set<text>,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    keyspace_name text,
    requested_ranges set<text>,
    started_at timestamp,
    successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

# nodetool status

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     ?     6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     ?     f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     ?     d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     ?     1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     ?     47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     ?     8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     ?     bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     ?     6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     ?     b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     ?     898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     ?     0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     ?     16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     ?     2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.7.99     18,76 MB   256     ?     d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135    16,04 MB   256     ?     463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229    17,36 MB   256     ?     9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5      14,01 MB   256     ?     ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4      14,93 MB   256     ?     122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10     16,77 MB   256     ?     bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15     14,95 MB   256     ?     668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140    17,38 MB   256     ?     7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113    19,14 MB   256     ?     46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118    16,7 MB    256     ?     9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248    17,29 MB   256     ?     35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24     16,55 MB   256     ?     5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189    16,63 MB   256     ?     be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124    20,37 MB   256     ?     638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60     24,57 MB   256     ?     cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     ?     1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     ?     3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     ?     b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     ?     54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     ?     9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     ?     ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     ?     a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     ?     500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     ?     b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     ?     ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     ?     9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     ?     b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     ?     1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     ?     ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     ?     f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     ?     feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     ?     47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     ?     bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     ?     1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

# nodetool status system_distributed

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     6,2%              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     6,8%              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     6,5%              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     6,1%              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     6,0%              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     6,1%              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     6,3%              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     6,0%              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     7,0%              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     6,6%              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     6,2%              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     6,5%              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     6,3%              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.7.99     18,76 MB   256     6,3%              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135    16,04 MB   256     6,1%              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229    17,36 MB   256     5,9%              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5      14,01 MB   256     6,2%              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4      14,93 MB   256     6,4%              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10     16,77 MB   256     6,4%              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15     14,95 MB   256     6,1%              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140    17,38 MB   256     6,7%              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113    19,14 MB   256     6,8%              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118    16,7 MB    256     6,7%              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248    17,29 MB   256     6,9%              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24     16,55 MB   256     6,8%              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189    16,63 MB   256     6,2%              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124    20,37 MB   256     6,3%              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60     24,57 MB   256     6,4%              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1

Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     6,4%              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     6,3%              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     6,2%              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     6,4%              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     6,2%              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     6,0%              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     6,4%              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     6,6%              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     6,2%              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     6,2%              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     6,5%              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     6,6%              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     7,1%              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     6,2%              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     6,4%              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     6,1%              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     6,4%              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     6,3%              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     6,7%              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

DC1 and DC3 are the old data centers. DC2 is the new one being added (as seen from the data loads).

For the snitch we are using GossipingPropertyFileSnitch and a cassandra-rackdc.properties with config such as:

dc=DC1
rack=RAC1

Just noticed that we also have cassandra-topology.properties present on the nodes, but it's up-to-date with all the nodes from the 3 data centers.

I was wondering whether the replication settings for the system_distributed keyspace might need a change, but haven't found any documentation pointing to that yet.

Best regards,
Timo

On 22 September 2016 at 18:00, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> It could be a bug.
>
> Yet I am not very aware of this system_distributed keyspace, but from what
> I see, it is using a simple strategy:
>
> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
> cqlsh $(hostname -I | awk '{print $1}')
>
> CREATE KEYSPACE system_distributed WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>
> Let's first check some stuff. Could you share the output of:
>
>
>    - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>    [ip_address_of_the_server]
>    - nodetool status
>    - nodetool status system_distributed
>    - Let us know about the snitch you are using and the corresponding
>    configuration.
>
>
> I am trying to make sure the command you used is expected to work, given
> your setup.
>
> My guess is this you might need to alter this keyspace accordingly to your
> cluster setup.
>
> Just guessing, hope that helps.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <timo.aho...@gmail.com>:
>
>> Hi,
>>
>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>> are adding a third data center before decommissioning one of the earlier
>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>> cluster (not set to bootstrap, as documented in
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>>
>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>> rebuild -- DC1), we get the following error:
>>
>> Unable to find sufficient sources for streaming range
>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>
>> The same error occurs which ever of the 2 existing DCs we try to rebuild
>> from.
>>
>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>> cron.
>>
>> Any advice on how to get the rebuild started?
>>
>> Best regards,
>> Timo
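
For reference, the keyspace change suggested in the quoted reply could be sketched roughly as below. This is only a sketch under assumptions: the DC names are taken from the nodetool status output, the replication factor of 3 for every DC (including the new DC2) is assumed rather than confirmed, and whether the change clears the rebuild error has not been tested here. The script only builds and prints the statement; piping it to cqlsh is left commented out.

```shell
#!/bin/sh
# Sketch only: DC names come from the nodetool status output in this thread.
# The per-DC replication factor of 3 is an assumption; adjust it to the cluster.
RF=3

# Build an ALTER KEYSPACE statement that switches system_distributed from
# SimpleStrategy to NetworkTopologyStrategy with one entry per data center.
STMT="ALTER KEYSPACE system_distributed WITH replication = {'class': 'NetworkTopologyStrategy'"
for dc in DC1 DC2 DC3; do
    STMT="$STMT, '$dc': '$RF'"
done
STMT="$STMT};"

# Print the statement for review; pipe it to cqlsh only once it looks right.
echo "$STMT"
# echo "$STMT" | cqlsh
```

As with any replication change, the keyspace would presumably need a repair afterwards so that the data matches the new settings.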