No. But I think all of the data fits in memory, and gstat (a FreeBSD utility) shows no disk activity at all.

2009/11/14 TuxRacer69 <[email protected]>

> Hi Ruslan,
>
> did you store the logs and the data on 2 different disks as described at:
> http://wiki.apache.org/cassandra/StorageConfiguration
> and
> http://wiki.apache.org/cassandra/FAQ#what_kind_of_hardware_should_i_use
> ?
>
> Cheers
> TuxRacer
>
>
> ruslan usifov wrote:
>
>> Hello!
>>
>> I'm new to Cassandra, so I may misunderstand some things.
>>
>> In the following "benchmark" I inserted 4000000 records like this:
>>
>> {"value": str(i), "text": "some small text"}
>>
>> I use the lazyboy lib (http://github.com/digg/lazyboy) to simplify work with
>> the Cassandra Thrift interface. So my Python insert program looks like this:
>>
>> from lazyboy import *
>> from lazyboy.key import Key;
>>
>> import time;
>> import random;
>>
>> # Define your cluster(s)
>> connection.add_pool('test', ['localhost:9160'])
>>
>> for j in xrange(0, 41):
>>     bt = time.time();
>>     begin = 100000 * j;
>>
>>     for i in xrange(begin, begin + 100000):
>>         if (i != begin) and ((i % 10000) == 0):
>>             print time.time() - bt;
>>             bt = time.time()
>>
>>         rec = record.Record();
>>         rec.key = Key("test", "Aquarium", str(i));
>>
>>         rec.update({"value": str(i), "text": "ruslan text"})
>>         rec.save();
>>
>>     print time.time() - bt;
>>     print "%s'th 100000 inserts done" % (j);
>>
>>     time.sleep(10);
>>
>>
>> Then I try to fetch random records from my storage:
>>
>> begin = time.time();
>>
>> for i in xrange(0, 100000):
>>     if i and (i % 10000) == 0:
>>         print time.time() - begin;
>>         begin = time.time()
>>
>>     rec = record.Record();
>>     rec.load(Key("test", "Aquarium", str(random.randint(0, 3000000))));
>>
>> print time.time() - begin;
>>
>>
>> And for every 10000 requests I get about 8 seconds:
>>
>> 8.04699993134
>> 8.07800006866
>> 8.18799996376
>> 8.17199993134
>> 8.15600013733
>> 8.09399986267
>> 8.07800006866
>> 8.04699993134
>> 8.06200003624
>> 8.06299996376
>>
>>
>> Then I did a similar test with MySQL on the InnoDB storage engine, with the
>> following program:
>>
>> import MySQLdb as dbi;
>> from MySQLdb.cursors import *;
>>
>> import time;
>> import random;
>> import sys;
>>
>> g_dbh = dbi.connect(db="test", user="root", passwd="root");
>> cursor = g_dbh.cursor();
>>
>> begin = time.time();
>>
>> for i in xrange(0, 100000):
>>     if i and (i % 10000) == 0:
>>         print time.time() - begin;
>>         begin = time.time()
>>
>>     cursor.execute("select * from test where value=%s", random.randint(0, 3000000));
>>     cursor.fetchone();
>>
>> print time.time() - begin;
>>
>>
>> And I get about 1.5 seconds per 10000 requests:
>>
>> 1.54699993134
>> 1.57800006852
>> 1.18799996376
>> 1.46671993134
>> 1.76670013733
>> 1.50399986267
>> 1.57800003872
>> 1.50699993134
>> 1.50200003624
>> 1.50099996313
>>
>> Is this normal, or am I doing something wrong? I find that Cassandra is
>> 8/1.5 = 5.3 times slower than MySQL InnoDB.
>>
>>
>> In Cassandra I turned off all debugging, and my keyspace looks like this:
>>
>> <Keyspaces>
>>   <Keyspace Name="test">
>>     <ColumnFamily CompareWith="BytesType" Name="Aquarium" />
>>   </Keyspace>
>> </Keyspaces>
>>
>>
>> My InnoDB table looks like this:
>>
>> CREATE TABLE `test` (
>>   `value` int(11) NOT NULL,
>>   `text` char(255) NOT NULL,
>>   PRIMARY KEY (`value`)
>> ) ENGINE=InnoDB DEFAULT CHARSET=utf8
>>
>>
>> In MySQL I use a TCP/IP connection to the server, not UNIX domain sockets.
>> All tests were done on an Intel Core 2 Duo 8600 3 GHz, on FreeBSD 7.2.
>>
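
As a rough cross-check of the 8/1.5 figure, here is a minimal sketch that turns the batch timings quoted above into an average per-request latency. The two lists are copied from the quoted message. The helper name mean_latency_ms and the constant REQUESTS_PER_BATCH are only illustrative and are not part of lazyboy or MySQLdb. Since each loop keeps a single request in flight, these numbers describe round-trip latency from one client and say nothing about throughput under concurrent load.

```python
# Seconds per 10,000 random reads, copied from the quoted benchmark output.
cassandra_batches = [
    8.04699993134, 8.07800006866, 8.18799996376, 8.17199993134,
    8.15600013733, 8.09399986267, 8.07800006866, 8.04699993134,
    8.06200003624, 8.06299996376,
]
mysql_batches = [
    1.54699993134, 1.57800006852, 1.18799996376, 1.46671993134,
    1.76670013733, 1.50399986267, 1.57800003872, 1.50699993134,
    1.50200003624, 1.50099996313,
]

# Each printed timing covers this many sequential requests (from the loops above).
REQUESTS_PER_BATCH = 10000


def mean_latency_ms(batches, requests_per_batch=REQUESTS_PER_BATCH):
    """Average time per request, in milliseconds, across all batches."""
    total_seconds = sum(batches)
    total_requests = len(batches) * requests_per_batch
    return 1000.0 * total_seconds / total_requests


cassandra_ms = mean_latency_ms(cassandra_batches)
mysql_ms = mean_latency_ms(mysql_batches)
ratio = cassandra_ms / mysql_ms

print("Cassandra: %.3f ms per request" % cassandra_ms)
print("MySQL:     %.3f ms per request" % mysql_ms)
print("Ratio:     %.2f" % ratio)
```

That works out to roughly 0.8 ms per Cassandra read against roughly 0.15 ms per MySQL read, which is consistent with the 5.3x estimate in the quoted message.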
