I've actually allocated plenty of RAM, but it isn't fully used. The read performance dwindles well before reaching a RAM limit.
thx

On Fri, 2009-12-04 at 13:18 -0600, Jonathan Ellis wrote:
> Fundamentally there's only so much I/O you can do at a time.  If you
> don't have enough, you need to upgrade to servers with better i/o
> (i.e. not EC2: http://pl.atyp.us/wordpress/?p=2240&cpage=1) and/or
> more ram to cache the reads against.
>
> On Fri, Dec 4, 2009 at 1:07 PM, B. Todd Burruss <[email protected]> wrote:
> > this is very concerning to me.  it doesn't seem to take much to bring
> > the read performance to an unacceptable level.  are there any
> > suggestions about how to improve performance.
> >
> > here are the params from my config file that are not defaults.  i
> > adjusted these to get real good performance, but not over the long haul.
> > has anyone had any luck adjusting these to help the problem tim and I
> > are having?
> >
> > <CommitLogRotationThresholdInMB>256</CommitLogRotationThresholdInMB>
> > <MemtableSizeInMB>1024</MemtableSizeInMB>
> > <MemtableObjectCountInMillions>0.6</MemtableObjectCountInMillions>
> > <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
> > <MemtableFlushAfterMinutes>1440</MemtableFlushAfterMinutes>
> >
> > thx!
> >
> > On Fri, 2009-12-04 at 18:49 +0000, Freeman, Tim wrote:
> >> The speed of compaction isn't the problem.  The problem is that lots of
> >> reads and writes cause compaction to fall behind.
> >>
> >> You could solve the problem by throttling reads and writes so compaction
> >> isn't starved.  (Maybe just the writes.  I'm not sure.)
> >>
> >> Different nodes will have different compaction backlogs, so you'd want to
> >> do this on a per node basis after Cassandra has made decisions about
> >> whatever replication it's going to do.  For example, Cassandra could
> >> observe the number of pending compaction tasks and sleep that many
> >> milliseconds before every read and write.
> >>
> >> The status quo is that I have to count a load test as passing only if the
> >> amount of backlogged compaction work stays less than some bound.  I'd
> >> rather not have to peer into Cassandra internals to determine whether it's
> >> really working or not.  It's a problem if 16 hour load tests get different
> >> results than 1 hour load tests because in my tests I'm renting a cluster
> >> by the hour.
> >>
> >> Tim Freeman
> >> Email: [email protected]
> >> Desk in Palo Alto: (650) 857-2581
> >> Home: (408) 774-1298
> >> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
> >> Thursday; call my desk instead.)
> >>
> >> -----Original Message-----
> >> From: Jonathan Ellis [mailto:[email protected]]
> >> Sent: Thursday, December 03, 2009 3:06 PM
> >> To: [email protected]
> >> Subject: Re: Persistently increasing read latency
> >>
> >> Thanks for looking into this.  Doesn't seem like there's much
> >> low-hanging fruit to make compaction faster but I'll keep that in the
> >> back of my mind.
> >>
> >> -Jonathan
> >>
> >> On Thu, Dec 3, 2009 at 4:58 PM, Freeman, Tim <[email protected]> wrote:
> >> >> So this is working as designed, but the design is poor because it
> >> >> causes confusion.  If you can open a ticket for this that would be
> >> >> great.
> >> >
> >> > Done, see:
> >> >
> >> > https://issues.apache.org/jira/browse/CASSANDRA-599
> >> >
> >> >> What does iostat -x 10 (for instance) say about the disk activity?
> >> >
> >> > rkB/s is consistently high, and wkB/s varies.
> >> > This is a typical entry with wkB/s at the high end of its range:
> >> >
> >> >> avg-cpu:  %user   %nice    %sys %iowait   %idle
> >> >>            1.52    0.00    1.70   27.49   69.28
> >> >>
> >> >> Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> >> >> sda        3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75  92.25   5.98  91.92
> >> >> sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >> sda2       3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75  92.25   5.98  91.92
> >> >> sda3       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >
> >> > and at the low end:
> >> >
> >> >> avg-cpu:  %user   %nice    %sys %iowait   %idle
> >> >>            1.50    0.00    1.77   25.80   70.93
> >> >>
> >> >> Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> >> >> sda        3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13  56.63   6.21  90.81
> >> >> sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >> sda2       3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13  56.63   6.21  90.81
> >> >> sda3       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >
> >> > Tim Freeman
> >> > Email: [email protected]
> >> > Desk in Palo Alto: (650) 857-2581
> >> > Home: (408) 774-1298
> >> > Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
> >> > Thursday; call my desk instead.)
> >> >
> >> > -----Original Message-----
> >> > From: Jonathan Ellis [mailto:[email protected]]
> >> > Sent: Thursday, December 03, 2009 2:45 PM
> >> > To: [email protected]
> >> > Subject: Re: Persistently increasing read latency
> >> >
> >> > On Thu, Dec 3, 2009 at 4:34 PM, Freeman, Tim <[email protected]> wrote:
> >> >>> Can you tell if the system is i/o or cpu bound during compaction?
> >> >>
> >> >> It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all
> >> >> it's doing right now is compactions.
> >> >
> >> > What does iostat -x 10 (for instance) say about the disk activity?
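
FWIW, here is a rough sketch of the per-node throttle Tim describes above
(sleep one millisecond per pending compaction task before each read and
write).  It is only an illustration of the idea, not a patch: the
BacklogSource hook is hypothetical, and something inside Cassandra would
have to supply the real pending-compaction count.

// Sketch of compaction-aware backpressure: callers invoke maybeThrottle()
// at the top of every read and write path.
public final class CompactionBackpressure
{
    /** Hypothetical hook for however the node exposes its compaction backlog. */
    public interface BacklogSource
    {
        int pendingCompactionTasks();
    }

    private final BacklogSource backlog;

    public CompactionBackpressure(BacklogSource backlog)
    {
        this.backlog = backlog;
    }

    public void maybeThrottle()
    {
        int pending = backlog.pendingCompactionTasks();
        if (pending <= 0)
            return;
        try
        {
            // One millisecond of delay per queued compaction task, as proposed above.
            Thread.sleep(pending);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }
}

A linear sleep is just the simplest proportional scheme; in practice you
would probably cap the delay so a deep backlog slows clients down without
stalling them outright, and perhaps apply it to writes only, as Tim
suggests might be enough.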

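On the "peering into Cassandra internals" point: the backlog can at least be
watched from outside the JVM over JMX during a load test.  A minimal probe
is sketched below; the JMX plumbing is standard, but the MBean object name,
attribute name, and port are assumptions -- browse the
org.apache.cassandra.* beans in jconsole and substitute whatever your build
actually registers.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionBacklogProbe
{
    public static void main(String[] args) throws Exception
    {
        String host = args.length > 0 ? args[0] : "localhost";
        String port = args.length > 1 ? args[1] : "8080";   // assumed JMX port; yours may differ

        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Assumed names -- verify them in jconsole before relying on this.
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            Object pending = mbs.getAttribute(compactionManager, "PendingTasks");
            System.out.println("pending compaction tasks: " + pending);
        }
        finally
        {
            connector.close();
        }
    }
}

Polling that number periodically and failing the run if it keeps growing is
a reasonable stand-in for Tim's "backlog stays under some bound" criterion.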