Re: High disk I/O during reads
Having battled similar issues with read latency recently, here are some general things to look out for.

- At 118ms, something is definitely broken. You should be looking at 10ms or lower, depending on hardware.
- Do nodetool info on all 5 nodes. Is the load distributed evenly? Is it reasonable (under 500GB)?
- Make sure you aren't running low on heap space. You can see that from nodetool info as well. If you are running low, very bad things begin to happen (lots of GC, constant flushing of Memtables, reduction of Key Cache, etc). Generally, once there, the node doesn't recover, and read latency goes to sh*t.
- Which compaction strategy are you using? Leveled or size-tiered? There are different issues with both.
- Is your Key Cache turned on? What's the Key Cache hit rate?
- Is the Read Latency the same on all nodes? Or just one in particular?
- Are pending compactions building up?
- What's %util on disk? Same on all nodes?

I would go through nodetool cfstats, info, compactionstats, tpstats, and see if things are roughly the same across all the nodes. You could also just be under capacity, but more likely, there's an actual problem looming somewhere.

Cheers!
-Matt

On Sat, Mar 23, 2013 at 3:18 AM, i...@4friends.od.ua wrote:

You can try to disable readahead on the Cassandra data disk.

Jon Scarborough j...@fifth-aeon.net wrote:

Checked tpstats, there are very few dropped messages.

Checked histograms. Mostly nothing surprising. The vast majority of rows are small, and most reads only access one or two SSTables.

What I did discover is that of our 5 nodes, one is performing well, with disk I/O in a ballpark that seems reasonable. The other 4 nodes are doing roughly 4x the disk I/O per second. Interestingly, the node that is performing well also seems to be servicing about twice the number of reads that the other nodes are.

I compared configuration between the node performing well and those that aren't, and so far haven't found any discrepancies.
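To put rough numbers on Jon's observation, here is a small Python sketch. The absolute rates are made up (the thread only gives ratios: about 4x the disk I/O and about half the reads on the slow nodes), and `io_per_read` is just an illustrative helper, not anything from nodetool.

```python
# Toy numbers only: the thread gives ratios, not absolute rates.
def io_per_read(disk_ops_per_sec, reads_per_sec):
    """Disk operations spent per client read served."""
    return disk_ops_per_sec / reads_per_sec

# Hypothetical baseline for the one healthy node.
good_io, good_reads = 100.0, 200.0
# The other four nodes: roughly 4x the disk I/O while serving half the reads.
bad_io, bad_reads = 4 * good_io, good_reads / 2

good_cost = io_per_read(good_io, good_reads)
bad_cost = io_per_read(bad_io, bad_reads)
ratio = bad_cost / good_cost

print(f"healthy node: {good_cost:.2f} disk ops per read")
print(f"slow nodes:   {bad_cost:.2f} disk ops per read")
print(f"slow nodes pay about {ratio:.0f}x more disk I/O per read")
```

If those ratios hold, the slow nodes are spending something like 8x the disk work per read, which points at a per-node difference (cache, readahead, SSTable count) rather than at the overall read load.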
On Fri, Mar 22, 2013 at 10:43 AM, Wei Zhu wz1...@yahoo.com wrote:

According to your cfstats, read latency is over 100 ms, which is really, really slow. I am seeing less than 3ms reads for my cluster, which is on SSD. Can you also check nodetool cfhistograms? It tells you more about the number of SSTables involved and the read/write latency. Sometimes the average doesn't tell you the whole story.

Also check your nodetool tpstats. Are there a lot of dropped reads?

-Wei

- Original Message -
From: Jon Scarborough j...@fifth-aeon.net
To: user@cassandra.apache.org
Sent: Friday, March 22, 2013 9:42:34 AM
Subject: Re: High disk I/O during reads

Key distribution across SSTables probably varies a lot from row to row in our case. Most reads would probably only need to look at a few SSTables, a few might need to look at more.

I don't yet have a deep understanding of C* internals, but I would imagine even the more expensive use cases would involve something like this:

1) Check the index for each SSTable to determine if part of the row is there.
2) Look at the endpoints of the slice to determine if the data in a particular SSTable is relevant to the query.
3) Read the chunks of those SSTables, working backwards from the end of the slice until enough columns have been read to satisfy the limit clause in the query.

So I would have guessed that even the more expensive queries on wide rows typically wouldn't need to read more than a few hundred KB from disk to do all that. Seems like I'm missing something major.
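Going back to Wei's remark that the average doesn't tell the whole story, here is a minimal Python sketch of the idea. The histogram is invented for illustration and the `(latency_ms, count)` bucket layout is an assumption, not the actual cfhistograms output format.

```python
def mean_and_percentile(buckets, pct):
    """buckets: list of (latency_ms, count) pairs sorted by latency.
    Returns (mean latency, latency of the bucket holding the given percentile)."""
    total = sum(count for _, count in buckets)
    mean = sum(lat * count for lat, count in buckets) / total
    threshold = total * pct / 100.0
    seen = 0
    for lat, count in buckets:
        seen += count
        if seen >= threshold:
            return mean, lat
    return mean, buckets[-1][0]

# Made-up histogram: most reads are fast, a small tail is very slow.
histogram = [(2, 900), (10, 50), (500, 40), (2000, 10)]

mean, p50 = mean_and_percentile(histogram, 50)
_, p99 = mean_and_percentile(histogram, 99)

print(f"mean={mean:.1f}ms  p50={p50}ms  p99={p99}ms")
```

With these invented numbers the mean lands in the tens of milliseconds even though the typical read takes 2ms, so a high average can come from a small slow tail just as easily as from uniformly slow reads. The histogram is what tells the two cases apart.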
Here's the complete CF definition, including compression settings:

CREATE COLUMNFAMILY conversation_text_message (
  conversation_key bigint PRIMARY KEY
) WITH comment=''
  AND comparator='CompositeType(org.apache.cassandra.db.marshal.DateType,org.apache.cassandra.db.marshal.LongType,org.apache.cassandra.db.marshal.AsciiType,org.apache.cassandra.db.marshal.AsciiType)'
  AND read_repair_chance=0.10
  AND gc_grace_seconds=864000
  AND default_validation=text
  AND min_compaction_threshold=4
  AND max_compaction_threshold=32
  AND replicate_on_write=True
  AND compaction_strategy_class='SizeTieredCompactionStrategy'
  AND compression_parameters:sstable_compression='org.apache.cassandra.io.compress.SnappyCompressor';

Much thanks for any additional ideas.

-Jon

On Fri, Mar 22, 2013 at 8:15 AM, Hiller, Dean dean.hil...@nrel.gov wrote:

Did you mean to ask are 'all' your keys spread across all SSTables? I am guessing at your intention.

I mean I would very well hope my keys are spread across all sstables, or otherwise that sstable should not be there as it has no keys in it ;).

And I know we had HUGE disk size from the duplication in our sstables on size-tiered compaction. We never ran a major compaction, but after we switched to LCS, we went from 300G to some 120G or something like that, which was nice. We only have 300 data point posts / second, so not an extreme write load on 6 nodes as well though these
Re: hinted handoff disabling trade-offs
Thanks Aaron, appreciate the advice.

On Tue, Mar 19, 2013 at 3:14 AM, aaron morton aa...@thelastpickle.com wrote:

> I think I understand what it means for application-level data, but the part I'm not entirely sure about is what it could mean for Cassandra internals.

Internally it means the write will not be retried to nodes that were either down or did not ack before rpc_timeout. That's all.

If you are doing things with read_repair_chance == 0 and CL ONE you are in a very eventually consistent world. The only thing that will guarantee consistency for you now is running nodetool repair.

> My cluster is under heavy write load. I'm considering disabling Hinted Handoffs so the nodes recover quicker in case compactions begin to back up.

If the cluster is approaching capacity, then ultimately the thing to do is add more nodes. Other than that, the only things to do are disable the commit log and use a lower CL.

If it's approaching capacity you will start to see pending mutations back up, maybe some dropped mutations, and then maybe an increase in the difference between the latency reported in the proxyhistograms and the cfhistograms or cfstats.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 16/03/2013, at 4:50 PM, Matt Kap matvey1...@gmail.com wrote:

Thanks Aaron. I am using CL=ONE. read_repair_chance=0.

The part which I'm wondering about is what happens to the internal Cassandra writes if Hinted Handoffs are disabled. I think I understand what it means for application-level data, but the part I'm not entirely sure about is what it could mean for Cassandra internals.

My cluster is under heavy write load. I'm considering disabling Hinted Handoffs so the nodes recover quicker in case compactions begin to back up.

On Wed, Mar 6, 2013 at 2:06 AM, aaron morton aa...@thelastpickle.com wrote:

The advantage of HH is that it reduces the probability of a DigestMismatch when using a CL ONE.
A DigestMismatch means the read has to run a second time before returning to the client.

> - No risk of hinted-handoffs building up
> - No risk of hinted-handoffs flooding a node that just came up

See the yaml config settings for the max hint window and the throttling.

> Can anyone suggest any other factors that I'm missing here. Specifically reasons not to do this.

If you are doing this for performance, first make sure your data model is efficient, that you are doing the most efficient reads (see my presentation here http://www.datastax.com/events/cassandrasummit2012/presentations), and your caching is bang on. Then consider if you can tune the CL, and if your client is token aware so it directs traffic to a node that has the data.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/03/2013, at 9:19 PM, Michael Kjellman mkjell...@barracuda.com wrote:

Also, if you have enough hints being created that it's significantly impacting your heap, I have a feeling things are going to get out of sync very quickly.

On Mar 4, 2013, at 9:17 PM, Wz1975 wz1...@yahoo.com wrote:

Why do you think disabling hinted handoff will improve memory usage?

Thanks.
-Wei

Sent from my Samsung smartphone on ATT

Original message
Subject: Re: hinted handoff disabling trade-offs
From: Michael Kjellman mkjell...@barracuda.com
To: user@cassandra.apache.org user@cassandra.apache.org
CC:

Repair is slow.

On Mar 4, 2013, at 8:07 PM, Matt Kap matvey1...@gmail.com wrote:

I am looking to get a second opinion about disabling hinted-handoffs. I have an application that can tolerate a fair amount of inconsistency (advertising domain), and so I'm weighing the pros and cons of hinted handoffs. I'm running Cassandra 1.0, looking to upgrade to 1.1 soon.
Pros of disabling hinted handoffs:
- Reduces heap
- Improves GC performance
- No risk of hinted-handoffs building up
- No risk of hinted-handoffs flooding a node that just came up

Cons:
- Some writes can be lost, at least until repair runs

Can anyone suggest any other factors that I'm missing here? Specifically, reasons not to do this.

Cheers!
-Matt

Copy, by Barracuda, helps you store, protect, and share all your amazing things. Start today: www.copy.com.

--
www.calcmachine.com - easy online calculator.
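Aaron's point about DigestMismatch can be sketched as a toy model in Python. This is not Cassandra's read path: the replica layout, the `read` helper and the use of MD5 over a value/timestamp string are all assumptions made for illustration, and the sketch ignores when extra replicas are actually consulted (that depends on the consistency level and read repair settings).

```python
import hashlib

MISSING = (None, 0)  # what a replica "returns" if it never saw the write

def digest(value, timestamp):
    """Hash of a replica's answer; stands in for a digest-only response."""
    return hashlib.md5(f"{value}:{timestamp}".encode()).hexdigest()

def read(replicas, key):
    """replicas: list of dicts mapping key -> (value, timestamp).
    Returns (value, number of read rounds needed)."""
    data = replicas[0].get(key, MISSING)  # full data read from one replica
    others = [digest(*r.get(key, MISSING)) for r in replicas[1:]]
    if all(d == digest(*data) for d in others):
        return data[0], 1  # digests agree: one round is enough
    # Digest mismatch: read full data everywhere, newest timestamp wins.
    newest = max((r.get(key, MISSING) for r in replicas), key=lambda vt: vt[1])
    return newest[0], 2

in_sync = [{"k": ("v2", 20)}, {"k": ("v2", 20)}]
stale = [{"k": ("v1", 10)}, {"k": ("v2", 20)}]  # first replica missed a write

print(read(in_sync, "k"))
print(read(stale, "k"))
```

The fewer replicas that are left behind after an outage (which is what hints are for), the less often a read ends up in the second, more expensive round.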
Re: hinted handoff disabling trade-offs
Thanks Aaron. I am using CL=ONE. read_repair_chance=0.

The part which I'm wondering about is what happens to the internal Cassandra writes if Hinted Handoffs are disabled. I think I understand what it means for application-level data, but the part I'm not entirely sure about is what it could mean for Cassandra internals.

My cluster is under heavy write load. I'm considering disabling Hinted Handoffs so the nodes recover quicker in case compactions begin to back up.

On Wed, Mar 6, 2013 at 2:06 AM, aaron morton aa...@thelastpickle.com wrote:

The advantage of HH is that it reduces the probability of a DigestMismatch when using a CL ONE. A DigestMismatch means the read has to run a second time before returning to the client.

> - No risk of hinted-handoffs building up
> - No risk of hinted-handoffs flooding a node that just came up

See the yaml config settings for the max hint window and the throttling.

> Can anyone suggest any other factors that I'm missing here. Specifically reasons not to do this.

If you are doing this for performance, first make sure your data model is efficient, that you are doing the most efficient reads (see my presentation here http://www.datastax.com/events/cassandrasummit2012/presentations), and your caching is bang on. Then consider if you can tune the CL, and if your client is token aware so it directs traffic to a node that has the data.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/03/2013, at 9:19 PM, Michael Kjellman mkjell...@barracuda.com wrote:

Also, if you have enough hints being created that it's significantly impacting your heap, I have a feeling things are going to get out of sync very quickly.

On Mar 4, 2013, at 9:17 PM, Wz1975 wz1...@yahoo.com wrote:

Why do you think disabling hinted handoff will improve memory usage?

Thanks.
-Wei

Sent from my Samsung smartphone on ATT

Original message
Subject: Re: hinted handoff disabling trade-offs
From: Michael Kjellman mkjell...@barracuda.com
To: user@cassandra.apache.org user@cassandra.apache.org
CC:

Repair is slow.

On Mar 4, 2013, at 8:07 PM, Matt Kap matvey1...@gmail.com wrote:

I am looking to get a second opinion about disabling hinted-handoffs. I have an application that can tolerate a fair amount of inconsistency (advertising domain), and so I'm weighing the pros and cons of hinted handoffs. I'm running Cassandra 1.0, looking to upgrade to 1.1 soon.

Pros of disabling hinted handoffs:
- Reduces heap
- Improves GC performance
- No risk of hinted-handoffs building up
- No risk of hinted-handoffs flooding a node that just came up

Cons:
- Some writes can be lost, at least until repair runs

Can anyone suggest any other factors that I'm missing here? Specifically, reasons not to do this.

Cheers!
-Matt
hinted handoff disabling trade-offs
I am looking to get a second opinion about disabling hinted-handoffs. I have an application that can tolerate a fair amount of inconsistency (advertising domain), and so I'm weighing the pros and cons of hinted handoffs. I'm running Cassandra 1.0, looking to upgrade to 1.1 soon.

Pros of disabling hinted handoffs:
- Reduces heap
- Improves GC performance
- No risk of hinted-handoffs building up
- No risk of hinted-handoffs flooding a node that just came up

Cons:
- Some writes can be lost, at least until repair runs

Can anyone suggest any other factors that I'm missing here? Specifically, reasons not to do this.

Cheers!
-Matt
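The trade-off in the list above can be illustrated with a toy simulation in Python. Everything here is hypothetical: the `Coordinator` class, the abstract time ticks and the `max_hint_window` value only mimic the idea of a hint window, and none of it reflects how Cassandra actually stores or replays hints.

```python
class Coordinator:
    """Toy coordinator that may hold hints for a single down replica."""

    def __init__(self, hints_enabled, max_hint_window):
        self.hints_enabled = hints_enabled
        self.max_hint_window = max_hint_window  # abstract time ticks
        self.hints = []  # writes held while the replica is down

    def write(self, replica, up, down_for, key, value):
        if up:
            replica[key] = value
        elif self.hints_enabled and down_for <= self.max_hint_window:
            self.hints.append((key, value))
        # Otherwise the replica simply misses the write until repair runs.

    def replay(self, replica):
        for key, value in self.hints:
            replica[key] = value
        self.hints.clear()

def simulate(hints_enabled):
    replica = {}
    coord = Coordinator(hints_enabled, max_hint_window=3)
    coord.write(replica, up=True, down_for=0, key="a", value=1)
    for t in range(1, 6):  # replica is down for 5 ticks
        coord.write(replica, up=False, down_for=t, key=f"k{t}", value=t)
    held = len(coord.hints)  # what the coordinator had to hold on to
    coord.replay(replica)    # replica comes back up
    return held, sorted(replica)

print(simulate(True))
print(simulate(False))
```

With hints enabled the coordinator carries the backlog and the returning replica gets a burst of replayed writes, but only for writes inside the window. With hints disabled nothing is held and nothing is replayed, and every write made during the outage stays missing on that replica until repair.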