I'm trying to understand READ load in Cassandra across a multi-datacenter cluster (specifically, why a read seems to be hitting more than one DC) and hope someone can help.

From what I'm seeing here, a READ with consistency LOCAL_ONE seems to be hitting all 3 datacenters, rather than just the one I'm connected to. I see 'Read 1001 live and 0 tombstoned cells' from EACH of the 3 DCs in the trace, which seems wrong. I have tried every consistency level, with the same result. The same happens from my C# code via the DataStax driver (where I first noticed the issue).

Can someone please shed some light on what is occurring? Specifically, I don't want a query on one DC going anywhere near the other 2 as a rule, because in production these DCs will be across slower links.

Query (NOTE: whilst this uses a kairosdb table, I'm just playing with queries against it, as it has 100k columns in this key for testing):

cqlsh:kairosdb> consistency local_one
Consistency level set to LOCAL_ONE.
cqlsh:kairosdb> select * from data_points where key = 0x6d61726c796e2e746573742e74656d70340000000145b514a400726f6f6d3d6f66666963653a limit 1000;

... Some returned data rows were listed here, which I've removed ...
Query Response Trace:

activity | timestamp | source | source_elapsed
---------+-----------+--------+---------------
execute_cql3_query | 07:18:12,692 | 192.168.25.111 | 0
Message received from /192.168.25.111 | 07:18:00,706 | 192.168.25.131 | 50
Executing single-partition query on data_points | 07:18:00,707 | 192.168.25.131 | 760
Acquiring sstable references | 07:18:00,707 | 192.168.25.131 | 814
Merging memtable tombstones | 07:18:00,707 | 192.168.25.131 | 924
Bloom filter allows skipping sstable 191 | 07:18:00,707 | 192.168.25.131 | 1050
Bloom filter allows skipping sstable 190 | 07:18:00,707 | 192.168.25.131 | 1166
Key cache hit for sstable 189 | 07:18:00,707 | 192.168.25.131 | 1275
Seeking to partition beginning in data file | 07:18:00,707 | 192.168.25.131 | 1293
Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 07:18:00,708 | 192.168.25.131 | 2173
Merging data from memtables and 1 sstables | 07:18:00,708 | 192.168.25.131 | 2195
Read 1001 live and 0 tombstoned cells | 07:18:00,709 | 192.168.25.131 | 3259
Enqueuing response to /192.168.25.111 | 07:18:00,710 | 192.168.25.131 | 4006
Sending message to /192.168.25.111 | 07:18:00,710 | 192.168.25.131 | 4210
Parsing select * from data_points where key = 0x6d61726c796e2e746573742e74656d70340000000145b514a400726f6f6d3d6f66666963653a limit 1000; | 07:18:12,692 | 192.168.25.111 | 52
Preparing statement | 07:18:12,692 | 192.168.25.111 | 257
Sending message to /192.168.25.121 | 07:18:12,693 | 192.168.25.111 | 1099
Sending message to /192.168.25.131 | 07:18:12,693 | 192.168.25.111 | 1254
Executing single-partition query on data_points | 07:18:12,693 | 192.168.25.111 | 1269
Acquiring sstable references | 07:18:12,693 | 192.168.25.111 | 1284
Merging memtable tombstones | 07:18:12,694 | 192.168.25.111 | 1315
Key cache hit for sstable 205 | 07:18:12,694 | 192.168.25.111 | 1592
Seeking to partition beginning in data file | 07:18:12,694 | 192.168.25.111 | 1606
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 07:18:12,695 | 192.168.25.111 | 2423
Merging data from memtables and 1 sstables | 07:18:12,695 | 192.168.25.111 | 2498
Read 1001 live and 0 tombstoned cells | 07:18:12,695 | 192.168.25.111 | 3167
Message received from /192.168.25.121 | 07:18:12,697 | 192.168.25.111 | null
Processing response from /192.168.25.121 | 07:18:12,697 | 192.168.25.111 | null
Message received from /192.168.25.131 | 07:18:12,699 | 192.168.25.111 | null
Processing response from /192.168.25.131 | 07:18:12,699 | 192.168.25.111 | null
Message received from /192.168.25.111 | 07:19:49,432 | 192.168.25.121 | 68
Executing single-partition query on data_points | 07:19:49,433 | 192.168.25.121 | 824
Acquiring sstable references | 07:19:49,433 | 192.168.25.121 | 840
Merging memtable tombstones | 07:19:49,433 | 192.168.25.121 | 898
Bloom filter allows skipping sstable 193 | 07:19:49,433 | 192.168.25.121 | 983
Key cache hit for sstable 192 | 07:19:49,433 | 192.168.25.121 | 1055
Seeking to partition beginning in data file | 07:19:49,433 | 192.168.25.121 | 1073
Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 07:19:49,434 | 192.168.25.121 | 1803
Merging data from memtables and 1 sstables | 07:19:49,434 | 192.168.25.121 | 1839
Read 1001 live and 0 tombstoned cells | 07:19:49,434 | 192.168.25.121 | 2518
Enqueuing response to /192.168.25.111 | 07:19:49,435 | 192.168.25.121 | 3026
Sending message to /192.168.25.111 | 07:19:49,435 | 192.168.25.121 | 3128
Request complete | 07:18:12,696 | 192.168.25.111 | 4387

Other Stats about the cluster:

[root@cdev101 conf]# nodetool status
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
UN  192.168.25.131  80.67 MB  256     34.2%  6ec61643-17d4-4a2e-8c44-57e08687a957  RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
UN  192.168.25.121  79.46 MB  256     30.6%  976626fb-ea80-405b-abb0-eae703b0074d  RAC1
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load      Tokens  Owns   Host ID                               Rack
UN  192.168.25.111  61.82 MB  256     35.2%  9475e2da-d926-42d0-83fb-0188d0f8f438  RAC1

cqlsh> describe keyspace kairosdb

CREATE KEYSPACE kairosdb WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC2': '1',
  'DC3': '1',
  'DC1': '1'
};

USE kairosdb;

CREATE TABLE data_points (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='NONE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
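
To make the cross-DC pattern in the trace easier to see, here is a rough Python sketch that tallies the 'Read N live ...' rows per datacenter. The node-to-DC mapping is copied from the nodetool status output and the rows are copied by hand from the trace; the function name live_cells_read_per_dc is just something I made up for illustration, not part of any Cassandra tooling.

```python
import re
from collections import defaultdict

# Node-to-datacenter mapping, copied from the nodetool status output.
NODE_DC = {
    "192.168.25.111": "DC1",
    "192.168.25.121": "DC2",
    "192.168.25.131": "DC3",
}

# A handful of (activity, source) pairs copied by hand from the trace.
TRACE_ROWS = [
    ("Read 1001 live and 0 tombstoned cells", "192.168.25.131"),
    ("Sending message to /192.168.25.121", "192.168.25.111"),
    ("Sending message to /192.168.25.131", "192.168.25.111"),
    ("Read 1001 live and 0 tombstoned cells", "192.168.25.111"),
    ("Read 1001 live and 0 tombstoned cells", "192.168.25.121"),
]

READ_RE = re.compile(r"Read (\d+) live and (\d+) tombstoned cells")


def live_cells_read_per_dc(rows, node_dc):
    """Sum the live cells reported by 'Read ...' trace rows, grouped by DC."""
    totals = defaultdict(int)
    for activity, source in rows:
        match = READ_RE.match(activity)
        if match:
            totals[node_dc[source]] += int(match.group(1))
    return dict(totals)


if __name__ == "__main__":
    totals = live_cells_read_per_dc(TRACE_ROWS, NODE_DC)
    for dc in sorted(totals):
        print(dc, totals[dc])
```

With the rows above, every one of the three DCs shows up with a full data read, which is the behaviour I'm asking about.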