I've actually allocated plenty of RAM, but it isn't fully used. The read performance dwindles well before reaching a RAM limit.
thx

On Fri, 2009-12-04 at 13:18 -0600, Jonathan Ellis wrote:
> Fundamentally there's only so much I/O you can do at a time.  If you
> don't have enough, you need to upgrade to servers with better i/o
> (i.e. not EC2: http://pl.atyp.us/wordpress/?p=2240&cpage=1) and/or
> more ram to cache the reads against.
>
> On Fri, Dec 4, 2009 at 1:07 PM, B. Todd Burruss <[email protected]> wrote:
> > this is very concerning to me.  it doesn't seem to take much to bring
> > the read performance to an unacceptable level.  are there any
> > suggestions about how to improve performance.
> >
> > here are the params from my config file that are not defaults.  i
> > adjusted these to get real good performance, but not over the long haul.
> > has anyone had any luck adjusting these to help the problem tim and I
> > are having?
> >
> > <CommitLogRotationThresholdInMB>256</CommitLogRotationThresholdInMB>
> > <MemtableSizeInMB>1024</MemtableSizeInMB>
> > <MemtableObjectCountInMillions>0.6</MemtableObjectCountInMillions>
> > <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
> > <MemtableFlushAfterMinutes>1440</MemtableFlushAfterMinutes>
> >
> > thx!
> >
> > On Fri, 2009-12-04 at 18:49 +0000, Freeman, Tim wrote:
> >> The speed of compaction isn't the problem.  The problem is that lots of
> >> reads and writes cause compaction to fall behind.
> >>
> >> You could solve the problem by throttling reads and writes so compaction
> >> isn't starved.  (Maybe just the writes.  I'm not sure.)
> >>
> >> Different nodes will have different compaction backlogs, so you'd want to
> >> do this on a per node basis after Cassandra has made decisions about
> >> whatever replication it's going to do.  For example, Cassandra could
> >> observe the number of pending compaction tasks and sleep that many
> >> milliseconds before every read and write.
> >>
> >> The status quo is that I have to count a load test as passing only if the
> >> amount of backlogged compaction work stays less than some bound.  I'd
> >> rather not have to peer into Cassandra internals to determine whether it's
> >> really working or not.  It's a problem if 16 hour load tests get different
> >> results than 1 hour load tests because in my tests I'm renting a cluster
> >> by the hour.
> >>
> >> Tim Freeman
> >> Email: [email protected]
> >> Desk in Palo Alto: (650) 857-2581
> >> Home: (408) 774-1298
> >> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
> >> Thursday; call my desk instead.)
> >>
> >> -----Original Message-----
> >> From: Jonathan Ellis [mailto:[email protected]]
> >> Sent: Thursday, December 03, 2009 3:06 PM
> >> To: [email protected]
> >> Subject: Re: Persistently increasing read latency
> >>
> >> Thanks for looking into this.  Doesn't seem like there's much
> >> low-hanging fruit to make compaction faster but I'll keep that in the
> >> back of my mind.
> >>
> >> -Jonathan
> >>
> >> On Thu, Dec 3, 2009 at 4:58 PM, Freeman, Tim <[email protected]> wrote:
> >> >> So this is working as designed, but the design is poor because it
> >> >> causes confusion.  If you can open a ticket for this that would be
> >> >> great.
> >> >
> >> > Done, see:
> >> >
> >> > https://issues.apache.org/jira/browse/CASSANDRA-599
> >> >
> >> >> What does iostat -x 10 (for instance) say about the disk activity?
> >> >
> >> > rkB/s is consistently high, and wkB/s varies.
> >> > This is a typical entry with wkB/s at the high end of its range:
> >> >
> >> >> avg-cpu:  %user   %nice    %sys %iowait   %idle
> >> >>            1.52    0.00    1.70   27.49   69.28
> >> >>
> >> >> Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> >> >> sda        3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75  92.25   5.98  91.92
> >> >> sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >> sda2       3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75  92.25   5.98  91.92
> >> >> sda3       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >
> >> > and at the low end:
> >> >
> >> >> avg-cpu:  %user   %nice    %sys %iowait   %idle
> >> >>            1.50    0.00    1.77   25.80   70.93
> >> >>
> >> >> Device:  rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> >> >> sda        3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13  56.63   6.21  90.81
> >> >> sda1       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >> sda2       3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13  56.63   6.21  90.81
> >> >> sda3       0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> >> >
> >> > Tim Freeman
> >> > Email: [email protected]
> >> > Desk in Palo Alto: (650) 857-2581
> >> > Home: (408) 774-1298
> >> > Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
> >> > Thursday; call my desk instead.)
> >> >
> >> > -----Original Message-----
> >> > From: Jonathan Ellis [mailto:[email protected]]
> >> > Sent: Thursday, December 03, 2009 2:45 PM
> >> > To: [email protected]
> >> > Subject: Re: Persistently increasing read latency
> >> >
> >> > On Thu, Dec 3, 2009 at 4:34 PM, Freeman, Tim <[email protected]> wrote:
> >> >>> Can you tell if the system is i/o or cpu bound during compaction?
> >> >>
> >> >> It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all
> >> >> it's doing right now is compactions.
> >> >
> >> > What does iostat -x 10 (for instance) say about the disk activity?
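
FWIW, here is a rough sketch of the per-node throttle Tim describes above
(sleep one millisecond per pending compaction task before each read and
write).  It is only an illustration of the idea, not a patch: the
BacklogSource hook is hypothetical, and something inside Cassandra would
have to supply the real pending-compaction count.

// Sketch of compaction-aware backpressure: callers invoke maybeThrottle()
// at the top of every read and write path.
public final class CompactionBackpressure
{
    /** Hypothetical hook for however the node exposes its compaction backlog. */
    public interface BacklogSource
    {
        int pendingCompactionTasks();
    }

    private final BacklogSource backlog;

    public CompactionBackpressure(BacklogSource backlog)
    {
        this.backlog = backlog;
    }

    public void maybeThrottle()
    {
        int pending = backlog.pendingCompactionTasks();
        if (pending <= 0)
            return;
        try
        {
            // One millisecond of delay per queued compaction task, as proposed above.
            Thread.sleep(pending);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
    }
}

A linear sleep is just the simplest proportional scheme; in practice you
would probably cap the delay so a deep backlog slows clients down without
stalling them outright, and perhaps apply it to writes only, as Tim
suggests might be enough.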

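On the "peering into Cassandra internals" point: the backlog can at least be
watched from outside the JVM over JMX during a load test.  A minimal probe
is sketched below; the JMX plumbing is standard, but the MBean object name,
attribute name, and port are assumptions -- browse the
org.apache.cassandra.* beans in jconsole and substitute whatever your build
actually registers.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionBacklogProbe
{
    public static void main(String[] args) throws Exception
    {
        String host = args.length > 0 ? args[0] : "localhost";
        String port = args.length > 1 ? args[1] : "8080";   // assumed JMX port; yours may differ

        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Assumed names -- verify them in jconsole before relying on this.
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            Object pending = mbs.getAttribute(compactionManager, "PendingTasks");
            System.out.println("pending compaction tasks: " + pending);
        }
        finally
        {
            connector.close();
        }
    }
}

Polling that number periodically and failing the run if it keeps growing is
a reasonable stand-in for Tim's "backlog stays under some bound" criterion.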