Lots of write timeouts and missing data during decommission/bootstrap
We get lots of write timeouts when we decommission a node. About 80% of them are write timeouts and about 20% are read timeouts. We've tried adjusting stream throughput (and compaction throughput, for that matter) and that doesn't resolve the issue. We've increased write_request_timeout_in_ms ... and the read timeout as well. Is there anything else I should be looking at? I can't seem to find the documentation that explains what the heck is happening.

--
Kevin Burton
Founder/CEO Spinn3r.com (San Francisco, CA)
blog: http://burtonator.wordpress.com | Google+: https://plus.google.com/102718274791889610666/posts
Re: Lots of write timeouts and missing data during decommission/bootstrap
Looks like all of this is happening because we're using CAS operations and the driver is going to SERIAL consistency level.

"SERIAL and LOCAL_SERIAL write failure scenarios":
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html?scroll=concept_ds_umf_5xx_zj__failure-scenarios

"If one of three nodes is down, the Paxos commit fails under the following conditions:
- CQL query-configured consistency level of ALL
- Driver-configured serial consistency level of SERIAL
- Replication factor of 3"

I don't understand why this would fail... it seems completely broken in this situation.

We were having write timeouts at a replication factor of 2, and a lot of people on the list said "of course": two nodes with one node down means there's no quorum, and Paxos needs a quorum. Not sure why I missed that :-P

So we went with 3 replicas and a quorum, but this behavior is new and I didn't see it documented. We set the driver to QUORUM, but then I guess the driver sees that this is a CAS operation and forces it back to SERIAL? Doesn't this mean that all decommissions result in CAS failures?

This is Cassandra 2.0.9 btw.

On Wed, Jul 1, 2015 at 2:22 PM, Kevin Burton <bur...@spinn3r.com> wrote:
[snip]

--
Kevin Burton, Founder/CEO Spinn3r.com
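For anyone tracing the same behavior: in the DataStax Java driver, a statement's regular consistency level and its serial consistency level are set independently. The serial level (SERIAL or LOCAL_SERIAL) governs the Paxos rounds of a CAS operation, while the regular level governs the commit phase. A minimal sketch, assuming the driver 2.x API and a hypothetical keyspace/table:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.Statement;

    public class CasConsistencyExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

            // A CAS (lightweight transaction) statement; table t is hypothetical.
            Statement stmt = new SimpleStatement(
                    "UPDATE t SET v = 1 WHERE k = 0 IF v = 0");

            // The regular consistency level only governs the non-Paxos (commit) phase...
            stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
            // ...the Paxos phase is governed by the *serial* consistency level,
            // which defaults to SERIAL; LOCAL_SERIAL confines Paxos to the local DC.
            stmt.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);

            session.execute(stmt);
            cluster.close();
        }
    }

Setting QUORUM on the statement therefore does not stop the driver from running the Paxos phase at SERIAL; only setSerialConsistencyLevel changes that.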
Re: Lots of write timeouts and missing data during decommission/bootstrap
On Wed, Jul 1, 2015 at 2:58 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> Looks like all of this is happening because we're using CAS operations and the driver is going to SERIAL consistency level. ... This is Cassandra 2.0.9 btw.

https://issues.apache.org/jira/browse/CASSANDRA-8640

=Rob
(credit to iamaleksey on IRC for remembering the JIRA #)
Re: Lots of write timeouts and missing data during decommission/bootstrap
WOW.. nice. You rock!!

On Wed, Jul 1, 2015 at 3:18 PM, Robert Coli <rc...@eventbrite.com> wrote:
[snip]
Missing data
Hi,

I have reloaded the data in my cluster of 3 nodes, RF: 2. I have loaded about 2 billion rows into one table. I use LeveledCompactionStrategy on my table. I use version 2.1.6. I use the default cassandra.yaml; only the IP address for seeds and the throughput have been changed. I loaded my data with simple insert statements.

This took a bit more than one day to load the data... and one more day to compact the data on all nodes. For me this is quite acceptable, since I should not be doing this again. I have done this with previous versions like 2.1.3 and others, and I basically had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no errors. On the node side I see many WARNINGs, all related to tombstones, but there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN I could find)?

WARN [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, slices=[388:201001-388:201412:!]

4) Is it possible to have tombstones when we make no DELETE statements?

I'm lost... Thanks for your help.
Re: Missing data
Hi Jean,

That warning means you are reading too many tombstones per request. If you have tombstones without doing DELETEs, it is probably because you TTL'ed the data when inserting (by mistake? or did you set default_time_to_live on your table?). You can use nodetool cfstats to see how many tombstones per read slice you have. This is probably also the cause of your missing data: data that was tombstoned is not available.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
[snip]
Re: Missing data
You can get tombstones from inserting null values. Not sure if that's the problem, but it is another way of getting tombstones into your data.

On Jun 15, 2015, at 10:50 AM, Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
[snip; Jean's full message appears later in this thread]
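A concrete way to get tombstones from an INSERT, as Robert describes: binding null to a prepared-statement variable writes a tombstone for that cell, even though no DELETE is ever issued. A minimal sketch with the DataStax Java driver (keyspace and table names are hypothetical):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class NullTombstoneExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

            // Hypothetical table: CREATE TABLE t (k int PRIMARY KEY, a text, b text)
            PreparedStatement ps =
                    session.prepare("INSERT INTO t (k, a, b) VALUES (?, ?, ?)");

            // Binding null for b writes a tombstone cell for b -- it does not
            // simply "leave b alone". Repeated over millions of inserts, this
            // builds up tombstones with no DELETE anywhere in the code.
            session.execute(ps.bind(1, "some value", null));

            // To avoid the tombstone, omit the column from the statement instead:
            session.execute("INSERT INTO t (k, a) VALUES (?, ?)", 2, "some value");

            cluster.close();
        }
    }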
Re: Missing data
Thanks Robert, but I don't insert NULL values. Thanks anyway.

On 15 Jun 2015, at 19:16, Robert Wille <rwi...@fold3.com> wrote:
[snip]
Re: Missing data
Dear all,

I have identified the root cause of my missing data a bit more closely. The problem occurs when I use

    <dependency>
      <groupId>com.datastax.cassandra</groupId>
      <artifactId>cassandra-driver-core</artifactId>
      <version>2.1.6</version>
    </dependency>

on my client against Cassandra 2.1.6. I did not have the problem when I was using driver 2.1.4 with C* 2.1.4. Interestingly enough, I don't have the problem with driver 2.1.4 against C* 2.1.6!! So as far as I can locate the problem, I would say that version 2.1.6 of the driver is not working properly and is losing some of my records!!!

As far as my tombstones are concerned, I don't understand their origin. I removed all locations in my code where I delete items, and I do not use TTL anywhere (I don't need this feature in my project). And yet I have many tombstones building up. Is there another origin for tombstones besides TTL and deleting items? Could the compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean

On 15 Jun 2015, at 11:17, Carlos Rolo <r...@pythian.com> wrote:
[snip]
Re: Missing data
There's your problem: you're using the DataStax Java driver :) I just ran into this issue in the last week and it was incredibly frustrating. If you are doing a simple loop over a "select *" query, the DataStax Java driver will only process 2^31 rows (i.e. Java's Integer.MAX_VALUE, 2,147,483,647) before it stops without any error or output in the logs. The fact that you said you have about 2 billion rows but are seeing missing data is a red flag.

I found the only way around this is to do your "select *" in chunks based on the token range (see this gist for an example: https://gist.github.com/baholladay/21eb4c61ea8905302195). Just loop for every 100 million rows and make a new query: select * from TABLE where token(key) > lastToken

Thanks,
Bryan

On Mon, Jun 15, 2015 at 12:50 PM, Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
[snip]
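A sketch of the token-range chunking Bryan describes (the linked gist is the authoritative version; this only shows the shape of the loop, assuming the Murmur3Partitioner and hypothetical table/column names):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class TokenChunkedScan {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace"); // hypothetical keyspace

            final int CHUNK = 100_000_000;       // rows per chunk, per Bryan's suggestion
            long lastToken = Long.MIN_VALUE;     // Murmur3 tokens start near -2^63
            boolean firstChunk = true;

            while (true) {
                // ">=" on the first pass so the very first token is not skipped;
                // ">" afterwards so we resume past the rows already seen. (In the
                // rare case of a token collision at a chunk boundary, rows sharing
                // the boundary token could be skipped.)
                String cql = "SELECT token(key), key, value FROM mytable"
                        + " WHERE token(key) " + (firstChunk ? ">=" : ">") + " ?"
                        + " LIMIT " + CHUNK;
                ResultSet rs = session.execute(cql, lastToken);

                long rows = 0;
                for (Row row : rs) {
                    lastToken = row.getLong(0);  // token(key) is a bigint for Murmur3
                    rows++;
                    // ... process row ...
                }
                firstChunk = false;
                if (rows < CHUNK) break;         // short chunk => reached the end
            }
            cluster.close();
        }
    }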
Re: Missing data
Thanks Bryan. I believe I have a different problem with the DataStax 2.1.6 driver. My problem is not that I make huge selects; my problem seems to occur on some inserts. I insert MANY rows, and with version 2.1.6 of the driver I seem to be losing some records. But thanks anyway, I will remember your mail when I bump into the select problem.

Cheers

Jean

On 15 Jun 2015, at 19:13, Bryan Holladay <holla...@longsight.com> wrote:
[snip]
Re: Missing data in range query
On Tue, Oct 7, 2014 at 3:11 PM, Owen Kim <ohech...@gmail.com> wrote:
> Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement of that. Though, I didn't intend for the question to be about supercolumns.

(Yep, understood, tho: if you hadn't been told that advice before, it would grate a lot less. I will try to remember that Owen Kim has received this piece of info, and will do my best not to repeat it to you... :D)

> It is possible I'm hitting an odd edge case, though I'm having trouble reproducing the issue in a controlled environment since there seems to be a timing element to it, or at least it's not consistently happening. I haven't been able to reproduce it on a single-node test cluster. I'm moving on to test a larger one now.

Right, my hypothesis is that there is something within the supercolumn write path which differs from the non-supercolumn write path. In theory this should be less possible since the 1.2-era supercolumn rewrite.

To be clear, are you reading back via PK? No secondary indexes involved, right? The only bells your symptoms are ringing are secondary-index bugs...

=Rob
Re: Missing data in range query
Nope. No secondary index. Just a slice query on the PK.

On Tuesday, October 7, 2014, Robert Coli <rc...@eventbrite.com> wrote:
[snip]
Missing data in range query
Hello,

I'm running Cassandra 1.2.16 with supercolumns and Hector.

create column family CFName
  with column_type = 'Super'
  and comparator = 'UTF8Type'
  and subcomparator = 'UTF8Type'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.2
  and dclocal_read_repair_chance = 0.0
  and populate_io_cache_on_flush = false
  and gc_grace = 43200
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'KEYS_ONLY';

I'm adding a time-series supercolumn, then doing a slice query over that supercolumn. I'm really just trying to see if any data is in the time slice, so I'm doing a slice query with limit 1. The insert isn't at the data bounds. However, sometimes nothing shows up in the time slice, even 8 seconds after the insert. I'm doing quorum reads and writes, so I'd expect consistent results, but the slice query comes up empty even if there have been multiple inserts.

I'm not sure what's happening here and am trying to narrow down suspects. Can key caching produce stale results? Do slice queries have different consistency guarantees?
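For readers unfamiliar with the Hector API being described, a "slice with limit 1" existence check over a supercolumn row might look roughly like this (a sketch; the column family matches the schema above, but the key and range bounds are hypothetical):

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.SuperSlice;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.QueryResult;
    import me.prettyprint.hector.api.query.SuperSliceQuery;

    public class TimeSliceCheck {
        // Returns true if any supercolumn exists in [start, end) for the row.
        static boolean hasDataInSlice(Keyspace keyspace, String rowKey,
                                      String start, String end) {
            StringSerializer s = StringSerializer.get();
            SuperSliceQuery<String, String, String, String> query =
                    HFactory.createSuperSliceQuery(keyspace, s, s, s, s);
            query.setColumnFamily("CFName")
                 .setKey(rowKey)
                 .setRange(start, end, false, 1); // limit 1: existence check only
            QueryResult<SuperSlice<String, String, String>> result = query.execute();
            return !result.get().getSuperColumns().isEmpty();
        }
    }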
Re: Missing data in range query
On Tue, Oct 7, 2014 at 1:38 PM, Owen Kim <ohech...@gmail.com> wrote:
> I'm running Cassandra 1.2.16 with supercolumns and Hector.

Slightly non-responsive response: in general, supercolumn use is not recommended. It makes it more difficult to get support when one uses a feature no one else uses.

=Rob
Re: Missing data in range query
I'm aware. I've had the system up since pre-composite columns and haven't had the cycles to do a major data and schema migration. And that's not slightly non-responsive.

On Tue, Oct 7, 2014 at 1:49 PM, Robert Coli <rc...@eventbrite.com> wrote:
[snip]
Re: Missing data in range query
On Tue, Oct 7, 2014 at 2:03 PM, Owen Kim <ohech...@gmail.com> wrote:
> I'm aware. I've had the system up since pre-composite columns and haven't had the cycles to do a major data and schema migration. And that's not slightly non-responsive.

"There may be unknown bugs in the code you're using, especially because no one else uses it" is in fact slightly responsive. While I'm sure it does grate to be told that one should not be using a feature one cannot choose to not-use, I consider "don't use them" responsive to every question about supercolumns since 2010, unless the asker pre-emptively states they know this fact. I assure you that my meta-response is infinitely more responsive than the total non-response you were otherwise likely to receive...

... aaanyway ...

Probably you are just hitting an edge case in the 1.2-era rewrite of supercolumns which no one else has ever encountered, because no one uses them. For the record, I do not believe either of your hypotheses (key cache, or slice queries having different guarantees) is likely to be implicated. One of them is trivial to test: create a test CF with the key cache disabled and try to repro there.

Instead of attempting to debug by yourself, or on the user list (which will be full of people not-using supercolumns), I suggest filing a JIRA with reproduction steps and then mentioning the URL on this thread for future googlers.

=Rob
Re: Missing data in range query
Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement of that. Though, I didn't intend for the question to be about supercolumns.

It is possible I'm hitting an odd edge case, though I'm having trouble reproducing the issue in a controlled environment since there seems to be a timing element to it, or at least it's not consistently happening. I haven't been able to reproduce it on a single-node test cluster. I'm moving on to test a larger one now.

On Tue, Oct 7, 2014 at 2:39 PM, Robert Coli <rc...@eventbrite.com> wrote:
[snip]
Re: restoring from snapshot - missing data
On Mon, May 21, 2012 at 12:01 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
> If I am putting the snapshots on a clean ring, I need to first create the data model?

Yes.

--
Tyler Hobbs, DataStax http://datastax.com/
Re: restoring from snapshot - missing data
Thanks. After creating the data model and matching the correct snapshot with the correct new node (same token), all worked fine!

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com

On Mon, May 21, 2012 at 9:06 PM, Tyler Hobbs <ty...@datastax.com> wrote:
[snip]
restoring from snapshot - missing data
Hi!

I am testing backup and restore. I created the snapshots using parallel ssh on all 3 nodes. I then created a new 3-node ring and used the snapshots to test recovery; the snapshot from every original node went to one of the new nodes. When I compare the contents of the data dir, it seems that all files from the original cluster exist on the backup cluster. *But* when I run some cqlsh queries, it seems as though about 1/3 of my data is missing.

Any idea what could be the issue? I thought that a snapshot flushes all in-memory writes to disk, so it can't be that some data was missing from the original snapshot.

Help is much appreciated. Thanks

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736 | Mob: +972 54 8356490
Re: restoring from snapshot - missing data
Did you use the same tokens for the nodes in both clusters?

On Sun, May 20, 2012 at 1:25 PM, Tamar Fraenkel <ta...@tok-media.com> wrote:
[snip]

--
Tyler Hobbs, DataStax http://datastax.com/
Re: restoring from snapshot - missing data
Thanks. Just figured out yesterday that I had switched the snapshots, mixing up the tokens. Will try again today.

And another question: if I am putting the snapshots on a clean ring, do I need to first create the data model?

Thanks

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com

On Mon, May 21, 2012 at 1:44 AM, Tyler Hobbs <ty...@datastax.com> wrote:
[snip]
Re: commitlog replay missing data
Have you verified that the data you expect to see is not in the server after shutdown?

WRT the difference between the Memtable data size and the SSTable live size: don't believe everything you read :) Memtable live size is increased by the serialised byte size of every column inserted, and is never decremented, so deletes and overwrites will inflate this value. What was your workload like? As of 0.8 we now have global memory management for CFs that tracks the actual JVM bytes used by a CF.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12/07/2011, at 3:28 PM, Jeffrey Wang <jw...@palantir.com> wrote:
[snip; see the original post below for the full cfstats diff]
Re: commitlog replay missing data
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a commitlog replay on a single-node cluster. I start the node, insert a bunch of stuff (~600MB), stop it, and restart it. There are log messages

If you stop by a kill, make sure you use batched commitlog sync mode instead of periodic if you want guarantees on individual writes. (I don't believe you'd expect a significant disk space discrepancy, though, since in practice the delay until write() should be small. But don't quote me on this, because I'd have to check the code to make sure that commit log replay isn't dependent on some marker that isn't written until commit log sync.)

--
/ Peter Schuller (@scode on twitter)
Re: commitlog replay missing data
Peter Schuller wrote:
> If you stop by a kill, make sure you use batched commitlog sync mode instead of periodic if you want guarantees on individual writes.

What are the other ways to stop Cassandra? What's the difference between batch vs periodic?
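For background, the two modes are cassandra.yaml settings (the values shown below are typical defaults of that era; check your own file). With periodic sync, writes are acknowledged before the commit log is fsynced, so an unclean kill can lose the last few seconds of acknowledged writes; with batch sync, the log is fsynced before writes are acknowledged, with writes grouped into small windows:

    # periodic (the default): the commit log is fsynced on a timer; writes are
    # acknowledged without waiting for the sync, so a hard kill can lose up to
    # commitlog_sync_period_in_ms of acknowledged writes.
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000

    # batch: the commit log is fsynced before writes are acknowledged; writes
    # arriving within the window are grouped into a single sync.
    # commitlog_sync: batch
    # commitlog_sync_batch_window_in_ms: 50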
Re: commitlog replay missing data
> # wait for a bit until no one is sending it writes anymore

More accurately, until all other nodes have realized it's down (check nodetool ring on each respective host).

--
/ Peter Schuller (@scode on twitter)
commitlog replay missing data
Hey all,

Recently upgraded to 0.8.1 and noticed what seems to be missing data after a commitlog replay on a single-node cluster. I start the node, insert a bunch of stuff (~600MB), stop it, and restart it. There are log messages pertaining to the commitlog replay and no errors, but some of the data is missing. If I flush before stopping the node, everything is fine, and running cfstats in the two cases shows different amounts of data in the SSTables. Moreover, the amount of data that is missing is nondeterministic. Has anyone run into this? Thanks.

Here is a side-by-side diff between cfstats outputs for a single CF before restarting (left) and after (right). Somehow a 37MB memtable became a 2.9MB SSTable (note the difference in write count as well)?

Column Family: Blocks                 before restart | after restart
  SSTable count:                                   0 | 1
  Space used (live):                               0 | 2907637
  Space used (total):                              0 | 2907637
  Memtable Columns Count:                       8198 | 0
  Memtable Data Size:                       37550510 | 0
  Memtable Switch Count:                           0 | 1
  Read Count:                                      0 | 0
  Read Latency:                               NaN ms | NaN ms
  Write Count:                                  8198 | 1526
  Write Latency:                            0.018 ms | 0.011 ms
  Pending Tasks:                                   0 | 0
  Key cache capacity:                             20 | 20
  Key cache size:                                  0 | 0
  Key cache hit rate:                            NaN | NaN
  Row cache:                                disabled | disabled
  Compacted row minimum size:                      0 | 1110
  Compacted row maximum size:                      0 | 2299
  Compacted row mean size:                         0 | 1960

Note that I patched https://issues.apache.org/jira/browse/CASSANDRA-2317 in my version, but there are no deletions involved, so I don't think it's relevant unless I messed something up while patching.

-Jeffrey