Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)
I'll second that - we had some weird inconsistent reads for a long time that we finally tracked to a small number of clients with significant clock skew. Make very sure all your client (not just C*) machines have tightly-synced clocks. On Fri, Oct 12, 2018 at 7:40 PM maitrayee shah wrote: > We have seen inconsistent reads if the clocks on the nodes are not in sync. > > > Thank you > > Sent from my iPhone > > On Oct 12, 2018, at 1:50 PM, Naik, Ninad wrote: > > Hello, > > We're seeing inconsistent data while doing reads on Cassandra. Here are > the details: > > It's a wide column table. The columns can be added by multiple > machines, and read by multiple machines. The time between writes and reads > is in minutes, but sometimes can be in seconds. Writes happen every 2 > minutes. > > Now, while reading we're seeing the following cases of inconsistent reads: > >- One column was added. If a read was done after the column was added >(20 secs to 2 minutes after the write), Cassandra returns no data. As if >the key doesn't exist. If the application retries, it gets the data. >- A few columns exist for a row key. And a new column 'n' was added. >Again, a read happens a few minutes after the write. This time, only the >latest column 'n' is returned. In this case the app doesn't know that the >data is incomplete so it doesn't retry. If we manually retry, we see all >the columns. >- A few columns exist for a row key. And a new column 'n' is added. >When a read happens after the write, all columns but 'n' are returned. > > Here's what we've verified: > >- Both writes and reads are using 'LOCAL_QUORUM' consistency level. >- The replication is within the local data center. No remote data center >is involved in the read or write. >- During the inconsistent reads, none of the nodes are undergoing GC >pauses. >- There are no errors in the Cassandra logs. >- Reads always happen after the writes. 
> > A few other details: Cassandra version: 2.1.9 DataStax Java driver > version: 2.1.10.2 Replication Factor: 3 > > We don't see this problem in lower environments. We have seen this happen > once or twice last year, but for the last few days it's been happening quite > frequently. On average, 2 inconsistent reads every minute. > > Here's what the table definition looks like: > > CREATE TABLE "MY_TABLE" ( > key text, > sub_key text, > value text, > PRIMARY KEY ((key), sub_key) > ) WITH > bloom_filter_fp_chance=0.01 AND > caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND > comment='' AND > dclocal_read_repair_chance=0.10 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > default_time_to_live=0 AND > speculative_retry='ALWAYS' AND > memtable_flush_period_in_ms=0 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > > Please point us in the right direction. Thanks ! > > > > The information contained in this e-mail message and any attachments may > be privileged and confidential. If the reader of this message is not the > intended recipient or an agent responsible for delivering it to the > intended recipient, you are hereby notified that any review, dissemination, > distribution or copying of this communication is strictly prohibited. If > you have received this communication in error, please notify the sender > immediately by replying to this e-mail and delete the message and any > attachments from your computer. > >
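[Editor's note] The clock-skew explanation above can be illustrated with a short sketch. Cassandra resolves conflicting writes per cell by last-write-wins on the write timestamp, and with the 2.1-era drivers that timestamp frequently comes from the client machine's clock. This is a minimal pure-Python model of that rule (not driver code; all names are illustrative):

```python
# Minimal last-write-wins (LWW) cell, modeling how Cassandra picks the
# winning value for a cell: the highest write timestamp wins.
class LwwCell:
    def __init__(self):
        self.value = None
        self.timestamp = -1

    def write(self, value, timestamp):
        # A write carrying an older timestamp is silently ignored,
        # even if it arrives later in wall-clock time.
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp

cell = LwwCell()

# Client A's clock is 30s fast; it writes first (real time t=100s).
cell.write("from-A", 130_000_000)

# Client B writes later in real time (t=110s) with a correct clock,
# so its timestamp is *lower* than A's skewed one.
cell.write("from-B", 110_000_000)

print(cell.value)  # -> from-A  (B's newer write is invisible)
```

This is why a write that "happened after" can appear to be missing even at LOCAL_QUORUM: quorum guarantees enough replicas are consulted, not that wall-clock order matches timestamp order.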
Re: Tracing in cassandra
Yes, with range queries it's timing out. One question: the WHERE condition is on the primary key rather than the clustering key. On Friday, October 12, 2018, Nitan Kainth wrote: > Did it still timeout? > > Sent from my iPhone > > On Oct 12, 2018, at 1:11 PM, Abdul Patel wrote: > > With limit 11 this is the query.. > Select * from table where status=0 and token(user_id) >=token(126838) and > token(user_id) <= token > On Friday, October 12, 2018, Abdul Patel wrote: > >> Let me try with limit 11 ..we have an 18-node cluster ..no nodes down.. >> >> On Friday, October 12, 2018, Nitan Kainth wrote: >> >>> Try query with partition key selection in where clause. But time for >>> limit 11 shouldn’t fail. Are all nodes up? Do you see any corruption in any >>> sstable? >>> >>> Sent from my iPhone >>> >>> On Oct 12, 2018, at 11:40 AM, Abdul Patel wrote: >>> >>> Sean, >>> >>> here it is : >>> CREATE TABLE Keyspace.tblname ( >>> user_id bigint, >>> session_id text, >>> application_guid text, >>> last_access_time timestamp, >>> login_time timestamp, >>> status int, >>> terminated_by text, >>> update_time timestamp, >>> PRIMARY KEY (user_id, session_id) >>> ) WITH CLUSTERING ORDER BY (session_id ASC) >>> >>> also they see timeouts with limit 11 as well, so is it better to remove >>> the limit option ? or what's best to query such a schema? >>> >>> On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R < >>> sean_r_dur...@homedepot.com> wrote: >>> Cross-partition = multiple partitions Simple example: Create table customer ( Customerid int, Name text, Lastvisit date, Phone text, Primary key (customerid) ); Query Select customerid from customer limit 5000; The query is asking for 5000 different partitions to be selected across the cluster. This is a very EXPENSIVE query for Cassandra, especially as the number of nodes goes up. Typically, you want to query a single partition. Read timeouts are usually caused by queries that are selecting many partitions or a very large partition. 
That is why a schema for the involved table could help. Sean Durity *From:* Abdul Patel *Sent:* Friday, October 12, 2018 10:04 AM *To:* user@cassandra.apache.org *Subject:* [EXTERNAL] Re: Tracing in cassandra Could you elaborate on cross-partition query? On Friday, October 12, 2018, Durity, Sean R < sean_r_dur...@homedepot.com> wrote: I suspect you are doing a cross-partition query, which will not scale well (as you can see). What is the schema for the table involved? Sean Durity *From:* Abdul Patel *Sent:* Thursday, October 11, 2018 5:54 PM *To:* a...@instaclustr.com *Cc:* user@cassandra.apache.org *Subject:* [EXTERNAL] Re: Tracing in cassandra Query : SELECT * FROM keyspace.tablename WHERE user_id = 390797583 LIMIT 5000; -Error: ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bd7c0-cd9e-11e8-8e99-15807bff4dfd | Parsing SELECT * FROM keyspace.tablename WHERE user_id = 390797583 LIMIT 5000; | 10.54.145.32 | 4020 | Native-Transport-Requests-3 e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bfed0-cd9e-11e8-8e99-15807bff4dfd | Preparing statement | 10.54.145.32 | 5065 | Native-Transport-Requests-3 e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c25e0-cd9e-11e8-8e99-15807bff4dfd | Executing single-partition query on roles | 10.54.145.32 | 6171 | ReadStage-2 e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf0-cd9e-11e8-8e99-15807bff4dfd | Acquiring sstable references | 10.54.145.32 | 6362 | ReadStage-2 e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf1-cd9e-11e8-8e99-15807bff4dfd | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.54.145.32 | 6641 | ReadStage-2 e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf2-cd9e-11e8-8e99-15807bff4dfd | Key cache hit for sstable 346 | 10.54.145.32 | 6955 | ReadStage-2
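[Editor's note] The token-range approach attempted above (`token(user_id) >= ... AND token(user_id) <= ...`) only scales if the full token space is split into many small, bounded sub-ranges, one query per range — roughly what bulk readers like the Spark connector do. A hypothetical pure-Python sketch of just the range arithmetic (not driver code; the split count and names are illustrative):

```python
# Split the Murmur3Partitioner token space into n contiguous,
# non-overlapping sub-ranges. Each (start, end] pair would back one
# bounded query, e.g.:
#   SELECT ... WHERE token(user_id) > ? AND token(user_id) <= ?
MIN_TOKEN = -(2 ** 63)       # Murmur3Partitioner minimum token
MAX_TOKEN = 2 ** 63 - 1      # Murmur3Partitioner maximum token

def token_subranges(n):
    span = (MAX_TOKEN - MIN_TOKEN) // n
    ranges = []
    start = MIN_TOKEN
    for i in range(n):
        # Last range absorbs the integer-division remainder.
        end = MAX_TOKEN if i == n - 1 else start + span
        ranges.append((start, end))
        start = end
    return ranges

ranges = token_subranges(8)
print(len(ranges))  # -> 8
```

Issuing one giant range query across the whole token space, as in the original attempt, forces the coordinator to gather from every node at once, which is exactly the pattern that times out.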
Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)
We have seen inconsistent reads if the clocks on the nodes are not in sync. Thank you Sent from my iPhone > On Oct 12, 2018, at 1:50 PM, Naik, Ninad wrote: > > Hello, > > We're seeing inconsistent data while doing reads on Cassandra. Here are the > details: > > It's a wide column table. The columns can be added by multiple machines, > and read by multiple machines. The time between writes and reads is in > minutes, but sometimes can be in seconds. Writes happen every 2 minutes. > > Now, while reading we're seeing the following cases of inconsistent reads: > > One column was added. If a read was done after the column was added (20 secs > to 2 minutes after the write), Cassandra returns no data. As if the key > doesn't exist. If the application retries, it gets the data. > A few columns exist for a row key. And a new column 'n' was added. Again, a > read happens a few minutes after the write. This time, only the latest column > 'n' is returned. In this case the app doesn't know that the data is > incomplete so it doesn't retry. If we manually retry, we see all the columns. > A few columns exist for a row key. And a new column 'n' is added. When a read > happens after the write, all columns but 'n' are returned. > Here's what we've verified: > > Both writes and reads are using 'LOCAL_QUORUM' consistency level. > The replication is within the local data center. No remote data center is > involved in the read or write. > During the inconsistent reads, none of the nodes are undergoing GC pauses. > There are no errors in the Cassandra logs. > Reads always happen after the writes. > A few other details: Cassandra version: 2.1.9 DataStax Java driver version: > 2.1.10.2 Replication Factor: 3 > > We don't see this problem in lower environments. We have seen this happen > once or twice last year, but for the last few days it's been happening quite > frequently. On average, 2 inconsistent reads every minute. 
> > Here's what the table definition looks like: > > CREATE TABLE "MY_TABLE" ( > key text, > sub_key text, > value text, > PRIMARY KEY ((key), sub_key) > ) WITH > bloom_filter_fp_chance=0.01 AND > caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND > comment='' AND > dclocal_read_repair_chance=0.10 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > default_time_to_live=0 AND > speculative_retry='ALWAYS' AND > memtable_flush_period_in_ms=0 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > Please point us in the right direction. Thanks !
Cassandra: Inconsistent data on reads (LOCAL_QUORUM)
Hello, We're seeing inconsistent data while doing reads on Cassandra. Here are the details: It's a wide column table. The columns can be added by multiple machines, and read by multiple machines. The time between writes and reads is in minutes, but sometimes can be in seconds. Writes happen every 2 minutes. Now, while reading we're seeing the following cases of inconsistent reads: * One column was added. If a read was done after the column was added (20 secs to 2 minutes after the write), Cassandra returns no data. As if the key doesn't exist. If the application retries, it gets the data. * A few columns exist for a row key. And a new column 'n' was added. Again, a read happens a few minutes after the write. This time, only the latest column 'n' is returned. In this case the app doesn't know that the data is incomplete so it doesn't retry. If we manually retry, we see all the columns. * A few columns exist for a row key. And a new column 'n' is added. When a read happens after the write, all columns but 'n' are returned. Here's what we've verified: * Both writes and reads are using 'LOCAL_QUORUM' consistency level. * The replication is within the local data center. No remote data center is involved in the read or write. * During the inconsistent reads, none of the nodes are undergoing GC pauses. * There are no errors in the Cassandra logs. * Reads always happen after the writes. A few other details: Cassandra version: 2.1.9 DataStax Java driver version: 2.1.10.2 Replication Factor: 3 We don't see this problem in lower environments. We have seen this happen once or twice last year, but for the last few days it's been happening quite frequently. On average, 2 inconsistent reads every minute. 
Here's what the table definition looks like: CREATE TABLE "MY_TABLE" ( key text, sub_key text, value text, PRIMARY KEY ((key), sub_key) ) WITH bloom_filter_fp_chance=0.01 AND caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND read_repair_chance=0.00 AND default_time_to_live=0 AND speculative_retry='ALWAYS' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; Please point us in the right direction. Thanks !
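[Editor's note] For reference, the consistency arithmetic behind this setup: with RF=3, a LOCAL_QUORUM is floor(3/2)+1 = 2, and since write replicas + read replicas (2+2) exceeds RF (3), every quorum read must overlap at least one replica that acknowledged the write. So genuinely missing data at LOCAL_QUORUM usually points at write timestamps (e.g. client clock skew), not at replication lag. A small sketch of that arithmetic:

```python
def quorum(rf):
    # A quorum is a strict majority of replicas: floor(rf/2) + 1.
    return rf // 2 + 1

def overlap_guaranteed(write_replicas, read_replicas, rf):
    # Reads are guaranteed to see the latest acknowledged write
    # when W + R > RF (the replica sets must intersect).
    return write_replicas + read_replicas > rf

rf = 3
w = r = quorum(rf)                    # LOCAL_QUORUM on both sides
print(w)                              # -> 2
print(overlap_guaranteed(w, r, rf))   # -> True
```

With QUORUM writes and QUORUM reads both at 2 of 3, the overlap holds; the thread's symptoms are therefore consistent with the clock-skew diagnosis rather than with under-replication.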
Re: Tracing in cassandra
Did it still timeout? Sent from my iPhone > On Oct 12, 2018, at 1:11 PM, Abdul Patel wrote: > > With limit 11 this is query.. > Select * from table where status=0 and tojen(user_id) >=token(126838) and > token(user_id) <= token >> On Friday, October 12, 2018, Abdul Patel wrote: >> Let me try with limit 11 ..we have 18 node cluster ..no nodes down.. >> >>> On Friday, October 12, 2018, Nitan Kainth wrote: >>> Try query with partition key selection in where clause. But time for limit >>> 11 shouldn’t fail. Are all nodes up? Do you see any corruption in ay >>> sstable? >>> >>> Sent from my iPhone >>> On Oct 12, 2018, at 11:40 AM, Abdul Patel wrote: Sean, here it is : CREATE TABLE Keyspave.tblname ( user_id bigint, session_id text, application_guid text, last_access_time timestamp, login_time timestamp, status int, terminated_by text, update_time timestamp, PRIMARY KEY (user_id, session_id) ) WITH CLUSTERING ORDER BY (session_id ASC) also they see timeouts with limit 11 as well, so is it better to remove with limit option ? or whats best to query such schema? > On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R > wrote: > Cross-partition = multiple partitions > > > > Simple example: > > Create table customer ( > > Customerid int, > > Name text, > > Lastvisit date, > > Phone text, > > Primary key (customerid) ); > > > > Query > > Select customerid from customer limit 5000; > > > > The query is asking for 5000 different partitions to be selected across > the cluster. This is a very EXPENSIVE query for Cassandra, especially as > the number of nodes goes up. Typically, you want to query a single > partition. Read timeouts are usually caused by queries that are selecting > many partitions or a very large partition. That is why a schema for the > involved table could help. 
> > > > > > Sean Durity > > > > From: Abdul Patel > Sent: Friday, October 12, 2018 10:04 AM > To: user@cassandra.apache.org > Subject: [EXTERNAL] Re: Tracing in cassandra > > > > Cpuld you elaborate cross partition query? > > On Friday, October 12, 2018, Durity, Sean R > wrote: > > I suspect you are doing a cross-partition query, which will not scale > well (as you can see). What is the schema for the table involved? > > > > > > Sean Durity > > > > From: Abdul Patel > Sent: Thursday, October 11, 2018 5:54 PM > To: a...@instaclustr.com > Cc: user@cassandra.apache.org > Subject: [EXTERNAL] Re: Tracing in cassandra > > > > Query : > > SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT 5000; > > -Error: ReadTimeout: Error from server: code=1200 [Coordinator node timed > out waiting for replica nodes' responses] message="Operation timed out - > received only 0 responses." info={'received_responses': 0, > 'required_responses': 1, 'consistency': 'ONE'} > > > > e70ac650-cd9e-11e8-8e99-15807bff4dfd | > e70bd7c0-cd9e-11e8-8e99-15807bff4dfd | > Parsing SELECT * FROM > keysoace.tablenameWHERE user_id = 390797583 LIMIT 5000; | 10.54.145.32 | > 4020 | Native-Transport-Requests-3 > > e70ac650-cd9e-11e8-8e99-15807bff4dfd | > e70bfed0-cd9e-11e8-8e99-15807bff4dfd | > > Preparing statement | 10.54.145.32 | 5065 | >Native-Transport-Requests-3 > > e70ac650-cd9e-11e8-8e99-15807bff4dfd | > e70c25e0-cd9e-11e8-8e99-15807bff4dfd | > > Executing single-partition query on roles | 10.54.145.32 | 6171 > | ReadStage-2 > > e70ac650-cd9e-11e8-8e99-15807bff4dfd | > e70c4cf0-cd9e-11e8-8e99-15807bff4dfd | > > Acquiring sstable references | 10.54.145.32 | 6362 | > ReadStage-2 > > e70ac650-cd9e-11e8-8e99-15807bff4dfd | > e70c4cf1-cd9e-11e8-8e99-15807bff4dfd | > Skipped 0/2 non-slice-intersecting > sstables, included 0 due to tombstones | 10.54.145.32 |
Re: Tracing in cassandra
With limit 11 this is the query.. Select * from table where status=0 and token(user_id) >=token(126838) and token(user_id) <= token wrote: > Let me try with limit 11 ..we have an 18-node cluster ..no nodes down.. > > On Friday, October 12, 2018, Nitan Kainth wrote: > >> Try query with partition key selection in where clause. But time for >> limit 11 shouldn’t fail. Are all nodes up? Do you see any corruption in any >> sstable? >> >> Sent from my iPhone >> >> On Oct 12, 2018, at 11:40 AM, Abdul Patel wrote: >> >> Sean, >> >> here it is : >> CREATE TABLE Keyspace.tblname ( >> user_id bigint, >> session_id text, >> application_guid text, >> last_access_time timestamp, >> login_time timestamp, >> status int, >> terminated_by text, >> update_time timestamp, >> PRIMARY KEY (user_id, session_id) >> ) WITH CLUSTERING ORDER BY (session_id ASC) >> >> also they see timeouts with limit 11 as well, so is it better to remove >> the limit option ? or what's best to query such a schema? >> >> On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R < >> sean_r_dur...@homedepot.com> wrote: >> >>> Cross-partition = multiple partitions >>> >>> >>> >>> Simple example: >>> >>> Create table customer ( >>> >>> Customerid int, >>> >>> Name text, >>> >>> Lastvisit date, >>> >>> Phone text, >>> >>> Primary key (customerid) ); >>> >>> >>> >>> Query >>> >>> Select customerid from customer limit 5000; >>> >>> >>> >>> The query is asking for 5000 different partitions to be selected across >>> the cluster. This is a very EXPENSIVE query for Cassandra, especially as >>> the number of nodes goes up. Typically, you want to query a single >>> partition. Read timeouts are usually caused by queries that are selecting >>> many partitions or a very large partition. That is why a schema for the >>> involved table could help. 
>>> >>> >>> >>> >>> >>> Sean Durity >>> >>> >>> >>> *From:* Abdul Patel >>> *Sent:* Friday, October 12, 2018 10:04 AM >>> *To:* user@cassandra.apache.org >>> *Subject:* [EXTERNAL] Re: Tracing in cassandra >>> >>> >>> >>> Cpuld you elaborate cross partition query? >>> >>> On Friday, October 12, 2018, Durity, Sean R >>> wrote: >>> >>> I suspect you are doing a cross-partition query, which will not scale >>> well (as you can see). What is the schema for the table involved? >>> >>> >>> >>> >>> >>> Sean Durity >>> >>> >>> >>> *From:* Abdul Patel >>> *Sent:* Thursday, October 11, 2018 5:54 PM >>> *To:* a...@instaclustr.com >>> *Cc:* user@cassandra.apache.org >>> *Subject:* [EXTERNAL] Re: Tracing in cassandra >>> >>> >>> >>> Query : >>> >>> SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT 5000; >>> >>> -Error: ReadTimeout: Error from server: code=1200 [Coordinator node >>> timed out waiting for replica nodes' responses] message="Operation timed >>> out - received only 0 responses." 
info={'received_responses': 0, >>> 'required_responses': 1, 'consistency': 'ONE'} >>> >>> >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bd7c0-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Parsing SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT >>> 5000; | 10.54.145.32 | 4020 | >>> Native-Transport-Requests-3 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bfed0-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Preparing statement >>> | 10.54.145.32 | 5065 | >>> Native-Transport-Requests-3 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c25e0-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Executing >>> single-partition query on roles | 10.54.145.32 | 6171 >>> | ReadStage-2 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf0-cd9e-11e8-8e99-15807bff4dfd >>> | >>>Acquiring >>> sstable references | 10.54.145.32 | 6362 >>> | ReadStage-2 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf1-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | >>> 10.54.145.32 | 6641 | Re >>> adStage-2 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf2-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Key cache hit >>> for sstable 346 | 10.54.145.32 | 6955 >>> | ReadStage-2 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf3-cd9e-11e8-8e99-15807bff4dfd >>> | >>>Bloom filter allows >>> skipping sstable 347 | 10.54.145.32 | 7202 >>> | ReadStage-2 >>> >>> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c7400-cd9e-11e8-8e99-15807bff4dfd >>> | >>> Merged >>> data from memtables and 2 sstables | 10.54.145.32 | 7386 >>> |
Re: Tracing in cassandra
Let me try with limit 11 ..we have 18 node cluster ..no nodes down.. On Friday, October 12, 2018, Nitan Kainth wrote: > Try query with partition key selection in where clause. But time for limit > 11 shouldn’t fail. Are all nodes up? Do you see any corruption in ay > sstable? > > Sent from my iPhone > > On Oct 12, 2018, at 11:40 AM, Abdul Patel wrote: > > Sean, > > here it is : > CREATE TABLE Keyspave.tblname ( > user_id bigint, > session_id text, > application_guid text, > last_access_time timestamp, > login_time timestamp, > status int, > terminated_by text, > update_time timestamp, > PRIMARY KEY (user_id, session_id) > ) WITH CLUSTERING ORDER BY (session_id ASC) > > also they see timeouts with limit 11 as well, so is it better to remove > with limit option ? or whats best to query such schema? > > On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R < > sean_r_dur...@homedepot.com> wrote: > >> Cross-partition = multiple partitions >> >> >> >> Simple example: >> >> Create table customer ( >> >> Customerid int, >> >> Name text, >> >> Lastvisit date, >> >> Phone text, >> >> Primary key (customerid) ); >> >> >> >> Query >> >> Select customerid from customer limit 5000; >> >> >> >> The query is asking for 5000 different partitions to be selected across >> the cluster. This is a very EXPENSIVE query for Cassandra, especially as >> the number of nodes goes up. Typically, you want to query a single >> partition. Read timeouts are usually caused by queries that are selecting >> many partitions or a very large partition. That is why a schema for the >> involved table could help. >> >> >> >> >> >> Sean Durity >> >> >> >> *From:* Abdul Patel >> *Sent:* Friday, October 12, 2018 10:04 AM >> *To:* user@cassandra.apache.org >> *Subject:* [EXTERNAL] Re: Tracing in cassandra >> >> >> >> Cpuld you elaborate cross partition query? 
>> >> On Friday, October 12, 2018, Durity, Sean R >> wrote: >> >> I suspect you are doing a cross-partition query, which will not scale >> well (as you can see). What is the schema for the table involved? >> >> >> >> >> >> Sean Durity >> >> >> >> *From:* Abdul Patel >> *Sent:* Thursday, October 11, 2018 5:54 PM >> *To:* a...@instaclustr.com >> *Cc:* user@cassandra.apache.org >> *Subject:* [EXTERNAL] Re: Tracing in cassandra >> >> >> >> Query : >> >> SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT 5000; >> >> -Error: ReadTimeout: Error from server: code=1200 [Coordinator node >> timed out waiting for replica nodes' responses] message="Operation timed >> out - received only 0 responses." info={'received_responses': 0, >> 'required_responses': 1, 'consistency': 'ONE'} >> >> >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bd7c0-cd9e-11e8-8e99-15807bff4dfd >> | >> Parsing SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT >> 5000; | 10.54.145.32 | 4020 | >> Native-Transport-Requests-3 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bfed0-cd9e-11e8-8e99-15807bff4dfd >> | >> Preparing statement >> | 10.54.145.32 | 5065 | >> Native-Transport-Requests-3 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c25e0-cd9e-11e8-8e99-15807bff4dfd >> | >> Executing >> single-partition query on roles | 10.54.145.32 | 6171 >> | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf0-cd9e-11e8-8e99-15807bff4dfd >> | >>Acquiring >> sstable references | 10.54.145.32 | 6362 >> | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf1-cd9e-11e8-8e99-15807bff4dfd >> | >> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | >> 10.54.145.32 | 6641 | >> ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf2-cd9e-11e8-8e99-15807bff4dfd >> | >> Key cache hit >> for sstable 346 | 10.54.145.32 | 6955 >> | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf3-cd9e-11e8-8e99-15807bff4dfd >> | >>Bloom filter 
allows >> skipping sstable 347 | 10.54.145.32 | 7202 >> | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c7400-cd9e-11e8-8e99-15807bff4dfd >> | >> Merged >> data from memtables and 2 sstables | 10.54.145.32 | 7386 >> | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c7401-cd9e-11e8-8e99-15807bff4dfd >> | >> Read 1 live and 0 >> tombstone cells | 10.54.145.32 | 7519 >> | ReadStage-2 >> >>
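[Editor's note] Sean's point about partition count can be made concrete with a toy model: a cross-partition `SELECT ... LIMIT 5000` has to gather rows from up to 5000 partitions scattered across the cluster, while a query filtered on the partition key touches exactly one. A hypothetical pure-Python sketch (table contents and names invented for illustration):

```python
# Toy "cluster": rows grouped by partition key (user_id), the way
# Cassandra physically stores them. We count how many distinct
# partitions a query must read.
from collections import defaultdict

table = defaultdict(list)   # user_id -> list of (session_id, status)
for user_id in range(10_000):
    table[user_id].append((f"session-{user_id}", user_id % 2))

def partitions_touched_full_scan(limit):
    # "SELECT ... LIMIT n" with no partition key restriction: with one
    # row per partition here, n rows means reading n partitions.
    return min(limit, len(table))

def partitions_touched_by_key(user_id):
    # "WHERE user_id = ?" targets exactly one partition.
    return 1 if user_id in table else 0

print(partitions_touched_full_scan(5000))  # -> 5000
print(partitions_touched_by_key(42))       # -> 1
```

The coordinator's cost for the full scan grows with partition count and node count, which is why the LIMIT 5000 query times out while the trace shows single-partition reads completing in microseconds.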
Re: [EXTERNAL] Re: Tracing in cassandra
Try query with partition key selection in where clause. But time for limit 11 shouldn’t fail. Are all nodes up? Do you see any corruption in any sstable? Sent from my iPhone > On Oct 12, 2018, at 11:40 AM, Abdul Patel wrote: > > Sean, > > here it is : > CREATE TABLE Keyspace.tblname ( > user_id bigint, > session_id text, > application_guid text, > last_access_time timestamp, > login_time timestamp, > status int, > terminated_by text, > update_time timestamp, > PRIMARY KEY (user_id, session_id) > ) WITH CLUSTERING ORDER BY (session_id ASC) > > also they see timeouts with limit 11 as well, so is it better to remove the > limit option ? or what's best to query such a schema? > >> On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R >> wrote: >> Cross-partition = multiple partitions >> >> >> >> Simple example: >> >> Create table customer ( >> >> Customerid int, >> >> Name text, >> >> Lastvisit date, >> >> Phone text, >> >> Primary key (customerid) ); >> >> >> >> Query >> >> Select customerid from customer limit 5000; >> >> >> >> The query is asking for 5000 different partitions to be selected across the >> cluster. This is a very EXPENSIVE query for Cassandra, especially as the >> number of nodes goes up. Typically, you want to query a single partition. >> Read timeouts are usually caused by queries that are selecting many >> partitions or a very large partition. That is why a schema for the involved >> table could help. >> >> >> >> >> >> Sean Durity >> >> >> >> From: Abdul Patel >> Sent: Friday, October 12, 2018 10:04 AM >> To: user@cassandra.apache.org >> Subject: [EXTERNAL] Re: Tracing in cassandra >> >> >> >> Could you elaborate on cross-partition query? >> >> On Friday, October 12, 2018, Durity, Sean R >> wrote: >> >> I suspect you are doing a cross-partition query, which will not scale well >> (as you can see). What is the schema for the table involved? 
>> >> >> >> >> >> Sean Durity >> >> >> >> From: Abdul Patel >> Sent: Thursday, October 11, 2018 5:54 PM >> To: a...@instaclustr.com >> Cc: user@cassandra.apache.org >> Subject: [EXTERNAL] Re: Tracing in cassandra >> >> >> >> Query : >> >> SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT 5000; >> >> -Error: ReadTimeout: Error from server: code=1200 [Coordinator node timed >> out waiting for replica nodes' responses] message="Operation timed out - >> received only 0 responses." info={'received_responses': 0, >> 'required_responses': 1, 'consistency': 'ONE'} >> >> >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bd7c0-cd9e-11e8-8e99-15807bff4dfd >> | >> Parsing SELECT * FROM keysoace.tablenameWHERE user_id = 390797583 LIMIT >> 5000; | 10.54.145.32 | 4020 | >> Native-Transport-Requests-3 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70bfed0-cd9e-11e8-8e99-15807bff4dfd >> | >> Preparing statement | >> 10.54.145.32 | 5065 | Native-Transport-Requests-3 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c25e0-cd9e-11e8-8e99-15807bff4dfd >> | >> Executing single-partition query on roles | >> 10.54.145.32 | 6171 | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf0-cd9e-11e8-8e99-15807bff4dfd >> | >>Acquiring sstable references | >> 10.54.145.32 | 6362 | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf1-cd9e-11e8-8e99-15807bff4dfd >> | >> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | >> 10.54.145.32 | 6641 | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf2-cd9e-11e8-8e99-15807bff4dfd >> | >> Key cache hit for sstable 346 | >> 10.54.145.32 | 6955 | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c4cf3-cd9e-11e8-8e99-15807bff4dfd >> | >>Bloom filter allows skipping sstable 347 | >> 10.54.145.32 | 7202 | ReadStage-2 >> >> e70ac650-cd9e-11e8-8e99-15807bff4dfd | e70c7400-cd9e-11e8-8e99-15807bff4dfd >> |
Re: [EXTERNAL] Re: Tracing in cassandra
Sean, here it is : CREATE TABLE Keyspace.tblname ( user_id bigint, session_id text, application_guid text, last_access_time timestamp, login_time timestamp, status int, terminated_by text, update_time timestamp, PRIMARY KEY (user_id, session_id) ) WITH CLUSTERING ORDER BY (session_id ASC) also they see timeouts with limit 11 as well, so is it better to remove the limit option ? or what's best to query such a schema? On Fri, Oct 12, 2018 at 11:05 AM Durity, Sean R wrote: > Cross-partition = multiple partitions > > > > Simple example: > > Create table customer ( > > Customerid int, > > Name text, > > Lastvisit date, > > Phone text, > > Primary key (customerid) ); > > > > Query > > Select customerid from customer limit 5000; > > > > The query is asking for 5000 different partitions to be selected across > the cluster. This is a very EXPENSIVE query for Cassandra, especially as > the number of nodes goes up. Typically, you want to query a single > partition. Read timeouts are usually caused by queries that are selecting > many partitions or a very large partition. That is why a schema for the > involved table could help. > > > > > > Sean Durity > > > > *From:* Abdul Patel > *Sent:* Friday, October 12, 2018 10:04 AM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] Re: Tracing in cassandra > > > > Could you elaborate on cross-partition query? > > On Friday, October 12, 2018, Durity, Sean R > wrote: > > I suspect you are doing a cross-partition query, which will not scale well > (as you can see). What is the schema for the table involved? 
> >
> > Sean Durity
> >
> > *From:* Abdul Patel
> > *Sent:* Thursday, October 11, 2018 5:54 PM
> > *To:* a...@instaclustr.com
> > *Cc:* user@cassandra.apache.org
> > *Subject:* [EXTERNAL] Re: Tracing in cassandra
> >
> > Query:
> >
> > SELECT * FROM keysoace.tablename WHERE user_id = 390797583 LIMIT 5000;
> >
> > Error: ReadTimeout: Error from server: code=1200 [Coordinator node timed
> > out waiting for replica nodes' responses] message="Operation timed out -
> > received only 0 responses." info={'received_responses': 0,
> > 'required_responses': 1, 'consistency': 'ONE'}
> >
> > Trace for session e70ac650-cd9e-11e8-8e99-15807bff4dfd
> > (activity | source | source_elapsed | thread):
> >
> > Parsing SELECT * FROM keysoace.tablename WHERE user_id = 390797583 LIMIT 5000; | 10.54.145.32 | 4020 | Native-Transport-Requests-3
> > Preparing statement | 10.54.145.32 | 5065 | Native-Transport-Requests-3
> > Executing single-partition query on roles | 10.54.145.32 | 6171 | ReadStage-2
> > Acquiring sstable references | 10.54.145.32 | 6362 | ReadStage-2
> > Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.54.145.32 | 6641 | ReadStage-2
> > Key cache hit for sstable 346 | 10.54.145.32 | 6955 | ReadStage-2
> > Bloom filter allows skipping sstable 347 | 10.54.145.32 | 7202 | ReadStage-2
> > Merged data from memtables and 2 sstables | 10.54.145.32 | 7386 | ReadStage-2
> > Read 1 live and 0 tombstone cells | 10.54.145.32 | 7519 | ReadStage-2
> > Executing single-partition query on roles | 10.54.145.32 | 7826 | ReadStage-4
> > Acquiring sstable references | 10.54.145.32 | 7924 | ReadStage-4
> > Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.54.145.32 | 8060 | ReadStage-4
> > ...
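Sean's point about cross-partition cost can be sketched with a toy token ring. This is not from the thread: the node list, replication factor, and hash function are illustrative assumptions (real Cassandra uses Murmur3Partitioner and vnodes), but the shape of the result is the same: a query restricted to one partition key touches one replica set, while an unrestricted `LIMIT 5000` scan fans out to every node.

```python
import hashlib

# Toy 6-node ring; node IPs, RF, and the hash are illustrative assumptions,
# not the poster's topology.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5", "10.0.0.6"]

def replicas(partition_key: str, rf: int = 3) -> list[str]:
    # Hash the partition key onto the ring, then take rf consecutive nodes.
    start = int(hashlib.md5(partition_key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(rf)]

# Single-partition read (WHERE user_id = 390797583): one replica set.
single = set(replicas("390797583"))

# Cross-partition read (no partition key restriction): the coordinator must
# contact every replica set that owns any of the requested partitions.
cross = set()
for user_id in range(5000):
    cross |= set(replicas(str(user_id)))

print(f"single-partition touches {len(single)} nodes; "
      f"cross-partition touches {len(cross)} of {len(NODES)}")
```

The fan-out (and the wait for the slowest node) is why the cross-partition form times out as the cluster grows, while the keyed form stays cheap.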
Re: Tracing in cassandra
Could you elaborate on "cross-partition query"?

On Friday, October 12, 2018, Durity, Sean R wrote:
> I suspect you are doing a cross-partition query, which will not scale well
> (as you can see). What is the schema for the table involved?
>
> Sean Durity
>
> *From:* Abdul Patel
> *Sent:* Thursday, October 11, 2018 5:54 PM
> *To:* a...@instaclustr.com
> *Cc:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: Tracing in cassandra
>
> Query:
>
> SELECT * FROM keysoace.tablename WHERE user_id = 390797583 LIMIT 5000;
>
> Error: ReadTimeout: Error from server: code=1200 [Coordinator node timed
> out waiting for replica nodes' responses] message="Operation timed out -
> received only 0 responses." info={'received_responses': 0,
> 'required_responses': 1, 'consistency': 'ONE'}
>
> Trace for session e70ac650-cd9e-11e8-8e99-15807bff4dfd
> (activity | source | source_elapsed | thread):
>
> Parsing SELECT * FROM keysoace.tablename WHERE user_id = 390797583 LIMIT 5000; | 10.54.145.32 | 4020 | Native-Transport-Requests-3
> Preparing statement | 10.54.145.32 | 5065 | Native-Transport-Requests-3
> Executing single-partition query on roles | 10.54.145.32 | 6171 | ReadStage-2
> Acquiring sstable references | 10.54.145.32 | 6362 | ReadStage-2
> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.54.145.32 | 6641 | ReadStage-2
> Key cache hit for sstable 346 | 10.54.145.32 | 6955 | ReadStage-2
> Bloom filter allows skipping sstable 347 | 10.54.145.32 | 7202 | ReadStage-2
> Merged data from memtables and 2 sstables | 10.54.145.32 | 7386 | ReadStage-2
> Read 1 live and 0 tombstone cells | 10.54.145.32 | 7519 | ReadStage-2
> Executing single-partition query on roles | 10.54.145.32 | 7826 | ReadStage-4
> Acquiring sstable references | 10.54.145.32 | 7924 | ReadStage-4
> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.54.145.32 | 8060 | ReadStage-4
> Key cache hit for sstable 346 | 10.54.145.32 | 8137 | ReadStage-4
> Bloom filter allows skipping sstable 347 | 10.54.145.32 | 8187 | ReadStage-4
> Merged data from memtables and 2 sstables | 10.54.145.32 | 8318 | ReadStage-4
> Read 1 live and 0 tombstone cells | 10.54.145.32 | 8941 | ReadStage-4
> Read-repair DC_LOCAL | 10.54.145.32 | 9468 | Native-Transport-Requests-3
> reading data from /10.54.145.31 | ...
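As an aside, `source_elapsed` in a trace is cumulative microseconds since the request started on that node, so diffing consecutive values within a thread gives the cost of each step. A hypothetical helper (not from the thread; a few ReadStage-2 rows are copied from the trace above as sample input):

```python
# Turn cumulative source_elapsed values (µs) into per-step durations by
# diffing consecutive rows within the same thread.
rows = [
    ("Executing single-partition query on roles", 6171, "ReadStage-2"),
    ("Acquiring sstable references",              6362, "ReadStage-2"),
    ("Key cache hit for sstable 346",             6955, "ReadStage-2"),
    ("Merged data from memtables and 2 sstables", 7386, "ReadStage-2"),
    ("Read 1 live and 0 tombstone cells",         7519, "ReadStage-2"),
]

def step_durations(rows):
    last_seen = {}  # thread -> previous cumulative elapsed (µs)
    out = []
    for activity, elapsed, thread in rows:
        if thread in last_seen:
            out.append((activity, elapsed - last_seen[thread]))
        last_seen[thread] = elapsed
    return out

for activity, us in step_durations(rows):
    print(f"{us:4d} µs  {activity}")
```

In the trace above, no single step is slow; the timeout comes from the coordinator waiting on replicas, not from local reads.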