It sounds like you just don't have enough RAM for the OS to cache your "hot" data set, so you are getting killed on disk seeks. Running iostat -x 5 (for example) during load should verify this: look for high await and %util on the data disk.
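For a rough sense of scale, here is a back-of-the-envelope sketch using only the numbers quoted below (2557 MB of page cache from vmstat, 14 G on disk, 4.5 G of Index.db files). The function name cache_coverage is made up for illustration, and units are treated loosely (1 G = 1024 MB), so read the result as an order-of-magnitude estimate, not a measurement.

```python
def cache_coverage(cache_mb, data_gb, index_gb):
    """Estimate what fraction of the data and of the index files fits in page cache.

    Hypothetical helper for illustration; all inputs are rough figures.
    """
    cache_gb = cache_mb / 1024.0  # loose unit conversion, assumed for simplicity
    return {
        "data": min(1.0, cache_gb / data_gb),
        "index": min(1.0, cache_gb / index_gb),
    }


# Figures quoted in this thread: vmstat cache column, data directory size,
# and the size of the Index.db files.
coverage = cache_coverage(cache_mb=2557, data_gb=14, index_gb=4.5)
print("data cached: {:.0%}, index cached: {:.0%}".format(
    coverage["data"], coverage["index"]))
```

If the page cache cannot hold even the index files, then reads that miss the cache will probably have to seek for both the index entry and the row, which would be consistent with multi-second multiget_slice calls on a small number of keys.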
On Thu, Jan 14, 2010 at 11:19 AM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
> Looking at my data directory: 14 G. Just Index.db based files: 4.5 G.
>
> Yes only one node so far.
>
> vmstat -n 1 -S m
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  0  0     22    585     32   2557    0    0    71    31    5    5  1  0 94  3
>
> On Thu, Jan 14, 2010 at 10:11 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> how much data do you have on disk? (only one node?) how large are
>> the columns you are reading? how much ram does vmstat say is being
>> used for cache?
>>
>> On Thu, Jan 14, 2010 at 11:06 AM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
>> > Right now it's ~5-10 keys, with 5 columns per key.
>> >
>> > Later it will be 64 keys (max) with 200 columns per key worst case.
>> >
>> > Suhail
>> >
>> > On Thu, Jan 14, 2010 at 9:45 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> >
>> >> how many keys are you fetching? how many columns for each key?
>> >> >> >> On Thu, Jan 14, 2010 at 1:49 AM, Suhail Doshi <suh...@mixpanel.com> >> wrote: >> >> > I've been seeing multiget_slice take an extremely long time: >> >> > >> >> > 2010-01-14 07:44:00,513 INFO ------------------ Cassandra, delay: >> >> > 3.64020800591 ----------------------- >> >> > 2010-01-14 07:44:00,513 INFO method: multiget_slice >> >> > 2010-01-14 07:44:00,513 INFO {'keys': >> >> > >> >> >> [u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2', >> >> > >> >> >> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a', >> >> > >> >> >> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9', >> >> > >> >> >> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac', >> >> > >> >> >> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11'], >> >> > 'column_parent': ColumnParent(column_family='DistinctIndex', >> >> > super_column=None), 'predicate': SlicePredicate(column_names=None, >> >> > slice_range=SliceRange(count=14000, start='date_2009-07-01', >> >> reversed=False, >> >> > finish='date_2010-01-14'))} >> >> > >> >> > 2010-01-14 07:44:00,513 INFO result: >> >> > >> >> > 2010-01-14 07:44:00,513 INFO >> >> > >> >> >> {'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a': >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323, >> >> > name='date_2010-01-11', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256, >> >> > name='date_2010-01-12', 
value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556, >> >> > name='date_2010-01-13', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804, >> >> > name='date_2010-01-14', value='1'), super_column=None)], >> >> > >> >> >> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9': >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323, >> >> > name='date_2010-01-11', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256, >> >> > name='date_2010-01-12', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556, >> >> > name='date_2010-01-13', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804, >> >> > name='date_2010-01-14', value='1'), super_column=None)], >> >> > >> >> >> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11': >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263333256, >> >> > name='date_2010-01-12', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556, >> >> > name='date_2010-01-13', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804, >> >> > name='date_2010-01-14', value='1'), super_column=None)], >> >> > >> >> >> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2': >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323, >> >> > name='date_2010-01-11', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256, >> >> > name='date_2010-01-12', value='1'), super_column=None), >> >> > 
ColumnOrSuperColumn(column=Column(timestamp=1263418556, >> >> > name='date_2010-01-13', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804, >> >> > name='date_2010-01-14', value='1'), super_column=None)], >> >> > >> >> >> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac': >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323, >> >> > name='date_2010-01-11', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256, >> >> > name='date_2010-01-12', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556, >> >> > name='date_2010-01-13', value='1'), super_column=None), >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804, >> >> > name='date_2010-01-14', value='1'), super_column=None)]} >> >> > >> >> > >> >> > The delay is the time at which it took to run the query and then >> return a >> >> > result. The box has 4GB of RAM and the *JVM_MAX_MEM (-Xmx) is set at >> 3G*. 
>> >> > If you're curious how I am running it:
>> >> >
>> >> > /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile
>> >> > /var/run/cassandra.pid -errfile &1 -outfile /var/log/cassandra/output.log
>> >> > -cp
>> >> > /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-incubating-0.5.0-rc1.jar:/usr/share/cassandra/apache-cassandra-incubating.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collect-1.0-rc1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json_simple-1.1.jar:/usr/share/cassandra/junit-4.6.jar:/usr/share/cassandra/libthrift-r894924.jar:/usr/share/cassandra/log4j-1.2.15.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
>> >> > -Xmx3G -Xms128M -Dcassandra -Dstorage-config=/etc/cassandra
>> >> > -Dcom.sun.management.jmxremote.port=8080
>> >> > -Dcom.sun.management.jmxremote.ssl=false
>> >> > -Dcom.sun.management.jmxremote.authenticate=false
>> >> > org.apache.cassandra.service.CassandraDaemon
>> >> >
>> >> > I am running version *0.5.0rc2*.
>> >> >
>> >> > Does anyone know what the bottleneck might be and how reads using
>> >> > multiget_slice can be sped up? When I look at the memory used, it's only
>> >> > about 1700 MB used. The box is not excessively swapping, running
>> >> > iostat--everything seemed pretty okay.
>> >> >
>> >> > Suhail
>> >> >
>> >>
>> >
>> > --
>> > http://mixpanel.com
>> > Blog: http://blog.mixpanel.com
>> >
>
> --
> http://mixpanel.com
> Blog: http://blog.mixpanel.com
>