Hi,
I'll explain a bit: I'm working with Abhinav.
We have an application, earlier based on Lucene, which indexes a huge
volume of data and later uses the indices to fetch records and perform
fuzzy matching. We wanted to move to Cassandra primarily for its
sharding, availability, and no-single-point-of-failure properties, and
for its write speed. The application runs on an 8-core machine with 8
threads, each reading different files and writing to 3 different CFs:
- one to store the raw data, keyed by an ID; the ID is of the form
ThreadName-<counter> and is unique
- one to store a subset of the raw data (a small set of fields), keyed
by the same ID as before
- one to store the inverted index, keyed by a field value in the data,
with the IDs of all the records in which that field matched (see the
sketch after this list)
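To make the layout concrete, here is a minimal sketch of what each
writer thread effectively does, with plain in-memory maps standing in
for the three CFs. The field name "fieldA" and the class/method names
are placeholders for illustration; this is not our actual client code:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class WriterSketch {
        // Stand-ins for the three column families (row key -> columns).
        static final Map<String, Map<String, String>> rawCf = new ConcurrentHashMap<>();
        static final Map<String, Map<String, String>> subsetCf = new ConcurrentHashMap<>();
        // Inverted index: field value -> IDs of the records containing it.
        static final Map<String, Set<String>> invertedCf = new ConcurrentHashMap<>();

        static void write(Map<String, String> record, long counter) {
            // Unique row key of the form ThreadName-<counter>.
            String id = Thread.currentThread().getName() + "-" + counter;
            rawCf.put(id, record); // full raw row, keyed by the ID

            // "fieldA" is a placeholder for the indexed field.
            Map<String, String> subset = new HashMap<>();
            subset.put("fieldA", record.get("fieldA"));
            subsetCf.put(id, subset); // small set of fields, same key

            // Point the field value back at every record ID that carried it.
            invertedCf.computeIfAbsent(record.get("fieldA"),
                    k -> ConcurrentHashMap.newKeySet()).add(id);
        }
    }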
On the 8-core machine, with 8 threads, it took us approx. 20 minutes
to create the index store with a data set of 24M rows - about 20k
rows/sec. And this was for a single instance of Cassandra. The 480
seconds Abhinav mentioned earlier was for a smaller data set.
When we created a ring by adding another similar machine and
re-executed the application from scratch (consistency level = ONE),
the total time increased considerably - it actually doubled. And the
nodes were unbalanced, showing a 70-30 distribution of load (sometimes
even more skewed). Effectively, in the ring, it's taking much longer
and the data distribution is skewed. The same thing happened when we
tried the application on a collection of desktops (4 or 5 of them).
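One thing we suspect - an assumption on our part, not something we
have verified - is that we let both nodes pick their own tokens. Our
understanding is that with RandomPartitioner an even split needs the
initial tokens spaced at i * 2^127 / N; a small sketch of how we'd
compute them:

    import java.math.BigInteger;

    public class TokenCalc {
        public static void main(String[] args) {
            int nodeCount = 2; // nodes in the ring
            BigInteger ringSize = BigInteger.valueOf(2).pow(127);
            for (int i = 0; i < nodeCount; i++) {
                // token_i = i * 2^127 / nodeCount, evenly spaced on the ring
                BigInteger token = ringSize
                        .multiply(BigInteger.valueOf(i))
                        .divide(BigInteger.valueOf(nodeCount));
                System.out.println("node " + i + ": initial token = " + token);
            }
        }
    }

If that is indeed the cause, we believe nodetool move with the
computed token should rebalance an existing ring, but we'd appreciate
confirmation.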
We have faced another issue while doing this. We ran jstack on the
application and found an output similar to JIRA issue 1594 (which I
mentioned in an earlier mail) - and this is true for both the 0.6.8
and 0.7 versions. The CPU usage on the nodes never goes above 50-60%
(user+sys), while the disk busy time is quite high. When we were using
Lucene, the CPU usage was pretty high on all the cores (90% or more).
It may be that the usage has gone down because of the disk IO, but we
aren't completely sure about this.
We have a feeling that we aren't creating the cluster properly or have
missed certain important configuration aspects. We are using the
default configuration; changing the memtable throughput (in MB) didn't
have much effect.
Following is a snapshot of the cfstats output (for a data set of 2M rows):
Keyspace: fct_cdr
  Read Count: 277537
  Read Latency: 0.43607250564789557 ms.
  Write Count: 3781264
  Write Latency: 0.01323008708199163 ms.
  Pending Tasks: 0
    Column Family: RawCDR
    SSTable count: 1
    Space used (live): 719796067
    Space used (total): 1439605485
    Memtable Columns Count: 218459
    Memtable Data Size: 120398507
    Memtable Switch Count: 4
    Read Count: 0
    Read Latency: NaN ms.
    Write Count: 1203177
    Write Latency: 0.016 ms.
    Pending Tasks: 0
    Key cache capacity: 10000
    Key cache size: 0
    Key cache hit rate: NaN
    Row cache capacity: 1000
    Row cache size: 0
    Row cache hit rate: NaN
    Compacted row minimum size: 535
    Compacted row maximum size: 924
    Compacted row mean size: 642

    Column Family: Index
    SSTable count: 5
    Space used (live): 326960041
    Space used (total): 564423442
    Memtable Columns Count: 264507
    Memtable Data Size: 9443853
    Memtable Switch Count: 15
    Read Count: 178785
    Read Latency: 0.425 ms.
    Write Count: 1203177
    Write Latency: 0.012 ms.
    Pending Tasks: 0
    Key cache capacity: 10000
    Key cache size: 10000
    Key cache hit rate: 0.0
    Row cache capacity: 1000
    Row cache size: 1000
    Row cache hit rate: 0.0
    Compacted row minimum size: 215
    Compacted row maximum size: 310
    Compacted row mean size: 215

    Column Family: IndexInverse
    SSTable count: 3
    Space used (live): 164782651
    Space used (total): 164782651
    Memtable Columns Count: 289647
    Memtable Data Size: 12757041
    Memtable Switch Count: 3
    Read Count: 98950
    Read Latency: 0.457 ms.
    Write Count: 1201911
    Write Latency: 0.017 ms.
    Pending Tasks: 0
    Key cache capacity: 10000
    Key cache size: 10000
    Key cache hit rate: 0.0
    Row cache capacity: 1000
    Row cache size: 1000
    Row cache hit rate: 0.0
    Compacted row minimum size: 149
    Compacted row maximum size: 14237
    Compacted row mean size: 179
The write latency shown here doesn't look bad - if we read it right,
3.78M writes at ~0.013 ms each is only about 50 seconds of cumulative
server-side write time - but we need to confirm this. It may be that
the problem is something to do with the application and/or our
configuration.
Regards
Arijit
--
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."