Read perf investigation

2011-11-03 Thread Ian Danforth
All,

I've done a bit more homework, and I continue to see long read times of
200 ms to 300 ms for some keys.

Test Setup

An EC2 m1.large client sending requests to a 5-node C* cluster, also in EC2
and also all m1.large. RF=3. ReadConsistency = ONE. I'm using pycassa from
Python for all communication.

Data Model

One column family with tens of millions of rows. The number of columns per
row varies between 0 and 1440 (per-minute records). The values are all
ints. All data is stored on EBS volumes. Total load per node is ~110GB.

According to vmstat I'm not swapping at all.

Highest %util I see:

Device:  rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf       0.00  2788.00  17.00  267.50  1168.00  23020.00     85.02     32.37  107.73   1.22  34.60

A more average profile I see is:

Device:  rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf       0.00     0.00  21.00    0.00  1288.00      0.00     61.33      0.37   18.38   9.43  19.80

QUESTION

Where should I look next? I'd love to get a profile of exactly where
Cassandra is spending its time on a per-call basis.

Thanks in advance,

Ian


RE: Read perf investigation

2011-11-03 Thread Dan Hendry
Uh, so look at your await time of *107.73*. From the iostat man page: await:
The average time (in milliseconds) for I/O requests issued to the device to
be served. This includes the time spent by the requests in queue and the
time spent servicing them.
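
To pull that number out of the output rather than eyeballing it, here is a minimal sketch in Python. It assumes the eleven-column `iostat -x` layout quoted in this thread (other sysstat versions print different columns), and the names IOSTAT_COLUMNS and parse_iostat_row are made up for illustration.

```python
# Column order as printed in the iostat output quoted in this thread.
# Assumption: other sysstat versions may use a different set of columns.
IOSTAT_COLUMNS = [
    "rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
    "avgrq-sz", "avgqu-sz", "await", "svctm", "%util",
]


def parse_iostat_row(line):
    """Split one device row into (device, {column name: float value})."""
    fields = line.split()
    device, values = fields[0], fields[1:]
    if len(values) != len(IOSTAT_COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(IOSTAT_COLUMNS), len(values))
        )
    return device, dict(zip(IOSTAT_COLUMNS, (float(v) for v in values)))


# The "highest %util" sample from the original message.
busy_row = (
    "xvdf 0.00 2788.00 17.00 267.50 1168.00 23020.00 "
    "85.02 32.37 107.73 1.22 34.60"
)
device, stats = parse_iostat_row(busy_row)
print(device, "await(ms):", stats["await"], "queue depth:", stats["avgqu-sz"])
```

The await and avgqu-sz columns are the ones to watch here: a deep queue with a high await means requests are spending most of their time waiting, not being serviced.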

 

If the key you are reading is not in Cassandra's key cache or row cache,
Cassandra needs to do two disk seeks
(http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra).
At an await of 107.73 ms per request, this means that some of your reads
*must* take on average 215 ms, not even including network latency. Looks
like EBS, or more generally disk saturation, is your problem. Perhaps
consider RAID0 with ephemeral drives.
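
The arithmetic behind that estimate, as a rough sketch in Python. It assumes the simple model used above (an uncached read costs two disk requests, each taking the average await) and ignores network and CPU time; the function name cold_read_floor_ms is just for illustration.

```python
def cold_read_floor_ms(await_ms, seeks=2):
    """Rough disk-only latency for a read that misses the key and row caches.

    Assumes each seek costs the average iostat await (queue time plus
    service time). Network and CPU time are not included.
    """
    return await_ms * seeks


# await values taken from the two iostat samples in the original message.
busy_ms = cold_read_floor_ms(107.73)
typical_ms = cold_read_floor_ms(18.38)
print("busy sample:    %.2f ms" % busy_ms)
print("average sample: %.2f ms" % typical_ms)
```

With the busy sample this lands right in the 200 ms to 300 ms range being reported, while the average sample gives only a few tens of milliseconds.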

 

Dan

 

