Yeah, I think you're right:

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda1      34.00   70.00  409.60  11.20  22596.80  649.60    55.24    32.61  77.67   2.38 100.00
sda2       0.00    0.00    0.00   0.00      0.00    0.00     0.00     0.00   0.00   0.00   0.00
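(An aside on reading these numbers: `await` around 78 ms per request while `%util` sits at 100% is the classic signature of a seek-bound disk. A minimal sketch, not part of the thread, with the column positions assumed from the `iostat -x` output shown here:)

```python
# Hypothetical helper: decide whether an `iostat -x` device line looks
# seek-bound. Column layout assumed from the output in this thread:
# Device rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util

def is_seek_bound(line, util_threshold=90.0, await_threshold=20.0):
    fields = line.split()
    await_ms = float(fields[9])   # await: avg time (ms) a request waits + is served
    util = float(fields[11])      # %util: fraction of time the device was busy
    # High utilization plus high per-request latency => the disk, not the
    # CPU, is the bottleneck.
    return util >= util_threshold and await_ms >= await_threshold

sample = ("sda1 34.00 70.00 409.60 11.20 22596.80 649.60 "
          "55.24 32.61 77.67 2.38 100.00")
print(is_seek_bound(sample))  # the sda1 sample above is saturated
```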
avg-cpu:  %user  %nice %system %iowait %steal  %idle
           2.25   0.00    0.55   39.72   0.90  56.58

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda1      39.60   79.20  357.40  10.00  20022.40  713.60    56.44    30.04  81.59   2.70  99.12
sda2       0.00    0.00    0.00   0.00      0.00    0.00     0.00     0.00   0.00   0.00   0.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           1.70   0.00    0.40   38.80   0.90  58.20

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda1      43.40   72.00  421.00   9.00  23248.00  648.00    55.57    32.64  75.74   2.33 100.00
sda2       0.00    0.00    0.00   0.00      0.00    0.00     0.00     0.00   0.00   0.00   0.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           2.00   0.00    0.40   37.41   1.00  59.19

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda1      19.60   58.40  348.20   6.60  19691.20  520.00    56.97    30.88  86.55   2.78  98.80
sda2       1.40    0.00    0.20   0.00     12.80    0.00    64.00     0.00  20.00  20.00   0.40

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           2.65   0.00    0.65   38.32   0.90  57.48

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda1      34.00   61.80  424.40  10.60  24348.80  579.20    57.31    33.25  76.89   2.30 100.00
sda2       0.00    0.00    0.00   0.00      0.00    0.00     0.00     0.00   0.00   0.00   0.00

On Thu, Jan 14, 2010 at 10:22 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> it sounds like you just don't have enough ram for the OS to cache your
> "hot" data set, so you are getting killed on disk seeks.  iostat -x 5
> (for example) during load should verify this.
>
> On Thu, Jan 14, 2010 at 11:19 AM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
> > Looking at my data directory: 14 G. Just Index.db based files: 4.5 G.
> >
> > Yes, only one node so far.
> >
> > vmstat -n 1 -S m
> > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> >  r  b  swpd  free  buff  cache    si   so    bi    bo    in    cs  us  sy  id  wa
> >  0  0    22   585    32   2557     0    0    71    31     5     5   1   0  94   3
> >
> > On Thu, Jan 14, 2010 at 10:11 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >
> >> how much data do you have on disk?  (only one node?)  how large are
> >> the columns you are reading?  how much ram does vmstat say is being
> >> used for cache?
> >>
> >> On Thu, Jan 14, 2010 at 11:06 AM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
> >> > Right now it's ~5-10 keys, with 5 columns per key.
> >> >
> >> > Later it will be 64 keys (max) with 200 columns per key worst case.
> >> >
> >> > Suhail
> >> >
> >> > On Thu, Jan 14, 2010 at 9:45 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >> >
> >> >> how many keys are you fetching?  how many columns for each key?
> >> >>
> >> >> On Thu, Jan 14, 2010 at 1:49 AM, Suhail Doshi <suh...@mixpanel.com> wrote:
> >> >> > I've been seeing multiget_slice take an extremely long time:
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO ------------------ Cassandra, delay: 3.64020800591 -----------------------
> >> >> > 2010-01-14 07:44:00,513 INFO method: multiget_slice
> >> >> > 2010-01-14 07:44:00,513 INFO
> >> >> > {'keys': [u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2',
> >> >> >    u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a',
> >> >> >    u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9',
> >> >> >    u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac',
> >> >> >    u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11'],
> >> >> >  'column_parent': ColumnParent(column_family='DistinctIndex', super_column=None),
> >> >> >  'predicate': SlicePredicate(column_names=None, slice_range=SliceRange(count=14000, start='date_2009-07-01', reversed=False, finish='date_2010-01-14'))}
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO result:
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO
> >> >> > {'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a':
> >> >> >   [ColumnOrSuperColumn(column=Column(timestamp=1263231323, name='date_2010-01-11', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263333256, name='date_2010-01-12', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263418556, name='date_2010-01-13', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263451804, name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >  'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9':
> >> >> >   [ColumnOrSuperColumn(column=Column(timestamp=1263231323, name='date_2010-01-11', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263333256, name='date_2010-01-12', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263418556, name='date_2010-01-13', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263451804, name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >  'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11':
> >> >> >   [ColumnOrSuperColumn(column=Column(timestamp=1263333256, name='date_2010-01-12', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263418556, name='date_2010-01-13', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263451804, name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >  'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2':
> >> >> >   [ColumnOrSuperColumn(column=Column(timestamp=1263231323, name='date_2010-01-11', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263333256, name='date_2010-01-12', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263418556, name='date_2010-01-13', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263451804, name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >  'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac':
> >> >> >   [ColumnOrSuperColumn(column=Column(timestamp=1263231323, name='date_2010-01-11', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263333256, name='date_2010-01-12', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263418556, name='date_2010-01-13', value='1'), super_column=None),
> >> >> >    ColumnOrSuperColumn(column=Column(timestamp=1263451804, name='date_2010-01-14', value='1'), super_column=None)]}
> >> >> >
> >> >> > The delay is the time it took to run the query and return a result. The box has 4GB of RAM, and the JVM max heap (-Xmx) is set at *3G*. If you're curious how I am running it:
> >> >> >
> >> >> > /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile /var/run/cassandra.pid -errfile &1 -outfile /var/log/cassandra/output.log \
> >> >> >   -cp /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-incubating-0.5.0-rc1.jar:/usr/share/cassandra/apache-cassandra-incubating.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collect-1.0-rc1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json_simple-1.1.jar:/usr/share/cassandra/junit-4.6.jar:/usr/share/cassandra/libthrift-r894924.jar:/usr/share/cassandra/log4j-1.2.15.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar \
> >> >> >   -Xmx3G -Xms128M -Dcassandra -Dstorage-config=/etc/cassandra \
> >> >> >   -Dcom.sun.management.jmxremote.port=8080 \
> >> >> >   -Dcom.sun.management.jmxremote.ssl=false \
> >> >> >   -Dcom.sun.management.jmxremote.authenticate=false \
> >> >> >   org.apache.cassandra.service.CassandraDaemon
> >> >> >
> >> >> > I am running version *0.5.0rc2*.
> >> >> >
> >> >> > Does anyone know what the bottleneck might be, and how reads using multiget_slice can be sped up? When I look at the memory used, it's only about 1700 MB. The box is not excessively swapping, and running iostat, everything seemed pretty okay.
> >> >> >
> >> >> > Suhail

--
http://mixpanel.com
Blog: http://blog.mixpanel.com
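(Editorial aside: the diagnosis in this thread can be checked with simple arithmetic. A 3 GB heap on a 4 GB box leaves well under 1 GB of page cache for 14 GB of on-disk data, so almost every read misses the cache and pays a seek. A back-of-the-envelope sketch, with the numbers from the thread and an assumed ~0.5 GB of OS/process overhead:)

```python
# Back-of-the-envelope check of the "not enough RAM for the OS cache"
# diagnosis. The figures come from the thread; the helper itself is
# illustrative, not a Cassandra tool.

def cache_hit_ceiling(total_ram_gb, jvm_heap_gb, os_overhead_gb, data_gb):
    """Upper bound on the fraction of the on-disk data the page cache
    can hold at any one time."""
    cache_gb = max(total_ram_gb - jvm_heap_gb - os_overhead_gb, 0.0)
    return min(cache_gb / data_gb, 1.0)

# 4 GB box, -Xmx3G, ~0.5 GB assumed for the OS and other processes,
# 14 GB data directory:
ceiling = cache_hit_ceiling(4.0, 3.0, 0.5, 14.0)
print(f"{ceiling:.0%}")  # only a few percent of the data set fits in cache
```

Shrinking -Xmx (the heap was only ~1700 MB used anyway) or adding RAM raises this ceiling, which is exactly the remedy the thread converges on.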