Hi Ruslan,
Did you store the commit log and the data on two different disks, as described at:
http://wiki.apache.org/cassandra/StorageConfiguration
and
http://wiki.apache.org/cassandra/FAQ#what_kind_of_hardware_should_i_use
?
Cheers
TuxRacer
ruslan usifov wrote:
Hello!
I'm new to Cassandra, so I may be misunderstanding some things.
In the following "benchmark" I inserted 4000000 records like this:
{"value": str(i), "text": "some small text"}
I use the lazyboy library (http://github.com/digg/lazyboy) to simplify working
with the Cassandra Thrift interface. My Python insert program looks like
this:
from lazyboy import *
from lazyboy.key import Key
import time
import random

# Define your cluster(s)
connection.add_pool('test', ['localhost:9160'])

for j in xrange(0, 41):
    bt = time.time()
    begin = 100000 * j
    for i in xrange(begin, begin + 100000):
        # Print the elapsed time after every 10000 inserts
        if (i != begin) and ((i % 10000) == 0):
            print time.time() - bt
            bt = time.time()
        rec = record.Record()
        rec.key = Key("test", "Aquarium", str(i))
        rec.update({"value": str(i), "text": "ruslan text"})
        rec.save()
    print time.time() - bt
    print "%s'th 100000 inserts done" % (j)
    time.sleep(10)
Then I tried to fetch random records from my storage:
begin = time.time()
for i in xrange(0, 100000):
    # Print the elapsed time after every 10000 reads
    if i and (i % 10000) == 0:
        print time.time() - begin
        begin = time.time()
    rec = record.Record()
    rec.load(Key("test", "Aquarium", str(random.randint(0, 3000000))))
print time.time() - begin
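The batch-timing loop above can be pulled out into a small helper so that the same measurement code is used for every backend. This is only a minimal sketch written for Python 3: `time_batches` and `fake_read` are hypothetical names, and `fake_read` is a stand-in that does no I/O, not a real lazyboy or MySQLdb call.

```python
import random
import time


def time_batches(op, total, batch_size, clock=time.time):
    """Call op(i) for i in range(total); return seconds spent per batch."""
    timings = []
    start = clock()
    for i in range(total):
        # Close the previous batch at every multiple of batch_size.
        if i and i % batch_size == 0:
            now = clock()
            timings.append(now - start)
            start = now
        op(i)
    # Close the final batch.
    timings.append(clock() - start)
    return timings


def fake_read(i):
    # Hypothetical stand-in for rec.load(...): picks a random key, no I/O.
    return str(random.randint(0, 3000000))


if __name__ == "__main__":
    for seconds in time_batches(fake_read, 100000, 10000):
        print(seconds)
```

Passing the real read call as `op` in place of `fake_read` would give one timing per 10000 requests, in the same shape as the numbers reported here.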
Every 10000 requests take about 8 seconds:
8.04699993134
8.07800006866
8.18799996376
8.17199993134
8.15600013733
8.09399986267
8.07800006866
8.04699993134
8.06200003624
8.06299996376
Then I ran a similar test against MySQL with the InnoDB storage engine, using
the following program:
import MySQLdb as dbi
from MySQLdb.cursors import *
import time
import random
import sys

g_dbh = dbi.connect(db="test", user="root", passwd="root")
cursor = g_dbh.cursor()

begin = time.time()
for i in xrange(0, 100000):
    # Print the elapsed time after every 10000 reads
    if i and (i % 10000) == 0:
        print time.time() - begin
        begin = time.time()
    # Query parameters are passed as a tuple
    cursor.execute("select * from test where value=%s",
                   (random.randint(0, 3000000),))
    cursor.fetchone()
print time.time() - begin
And I get about 1.5 seconds per 10000 requests:
1.54699993134
1.57800006852
1.18799996376
1.46671993134
1.76670013733
1.50399986267
1.57800003872
1.50699993134
1.50200003624
1.50099996313
Is this normal, or am I doing something wrong? By these numbers Cassandra is
about 8/1.5 = 5.3 times slower than MySQL InnoDB.
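As a sanity check on that ratio, the per-batch timings listed above can be averaged and converted to a per-request latency. This is a rough sketch written for Python 3; the variable names are only illustrative, and the numbers are copied from the two result lists in this message.

```python
# Seconds per 10000 reads, copied from the two result lists in this message.
cassandra = [8.04699993134, 8.07800006866, 8.18799996376, 8.17199993134,
             8.15600013733, 8.09399986267, 8.07800006866, 8.04699993134,
             8.06200003624, 8.06299996376]
mysql = [1.54699993134, 1.57800006852, 1.18799996376, 1.46671993134,
         1.76670013733, 1.50399986267, 1.57800003872, 1.50699993134,
         1.50200003624, 1.50099996313]

BATCH = 10000  # requests per timing sample


def mean(values):
    return sum(values) / len(values)


# Average latency of a single request, in milliseconds.
cassandra_ms = mean(cassandra) / BATCH * 1000.0
mysql_ms = mean(mysql) / BATCH * 1000.0
ratio = mean(cassandra) / mean(mysql)

print("Cassandra: %.3f ms per read" % cassandra_ms)
print("MySQL:     %.3f ms per read" % mysql_ms)
print("ratio:     %.2f" % ratio)
```

Both tests issue one request at a time from a single client, so these figures describe per-request latency rather than the throughput either server could reach with concurrent clients.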
In Cassandra I turned off all debugging, and my keyspace looks like this:
<Keyspaces>
  <Keyspace Name="test">
    <ColumnFamily CompareWith="BytesType" Name="Aquarium" />
  </Keyspace>
</Keyspaces>
My InnoDB table looks like this:
CREATE TABLE `test` (
  `value` int(11) NOT NULL,
  `text` char(255) NOT NULL,
  PRIMARY KEY (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
In MySQL I use a TCP/IP connection to the server, not UNIX domain sockets.
All tests were done on an Intel Core 2 Duo 8600 3 GHz, on FreeBSD 7.2.