For the record, I note that "no row cache" is the default on
user-defined CFs; we include it in the sample configuration file as an
example only.
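For illustration, the relevant 0.6-era storage-conf.xml entry looks roughly
like this (a sketch from memory, not the exact sample file; the CF name and
values here are illustrative). Omitting RowsCached, or setting it to 0,
disables the row cache for that CF:

    <!-- RowsCached="10%" caches 10% of the CF's rows; leaving the
         attribute out (or setting it to 0) means no row cache,
         which is the default -->
    <ColumnFamily Name="Standard1"
                  CompareWith="BytesType"
                  RowsCached="10%"/>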
On Wed, Mar 10, 2010 at 9:58 AM, Sylvain Lebresne wrote:
>> So did you disable the row cache entirely?
>
> Yes (getting back reasonable performance).
> So did you disable the row cache entirely?
Yes (getting back reasonable performance).
>> From: Sylvain Lebresne
>>
>> Well, I've found the reason.
>> The default cassandra configuration uses a 10% row cache.
>> And the row cache reads the whole row each time. So it was indeed reading
>> the full row each time even though the request was asking for only one
>> column.
So did you disable the row cache entirely?
> From: Sylvain Lebresne
>
> Well, I've found the reason.
> The default cassandra configuration uses a 10% row cache.
> And the row cache reads the whole row each time. So it was indeed reading
> the full row each time even though the request was asking for only one
> column.
Well, I've found the reason.
The default cassandra configuration uses a 10% row cache.
And the row cache reads the whole row each time. So it was indeed reading the
full row each time even though the request was asking for only one column.
My bad (at least I learned something).
--
Sylvain
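To picture the behaviour Sylvain describes, here is a toy model of a
whole-row cache (illustrative only, not the actual Cassandra read path):

    # Toy model of a whole-row cache: a miss materializes the entire
    # row, even though the caller asked for a single column of it.
    row_cache = {}

    def read_column(key, column, read_row_from_disk):
        row = row_cache.get(key)
        if row is None:
            # Cache miss: read and cache the WHOLE row.
            row = read_row_from_disk(key)
            row_cache[key] = row
        return row.get(column)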
On Tue, Mar 9, 2010 at 2:28 PM, Sylvain Lebresne wrote:
> > A row causes a disk seek while columns are contiguous. So if the row isn't
> > in the cache, you're being impaired by the seeks. In general, fatter rows
> > should be more performant than skinny ones.
>
> Sure, I understand that. Still, I get 400 columns per second (i.e., 400
> seeks per second) when the
> A row causes a disk seek while columns are contiguous. So if the row isn't
> in the cache, you're being impaired by the seeks. In general, fatter rows
> should be more performant than skinny ones.
Sure, I understand that. Still, I get 400 columns per second (i.e., 400 seeks
per second) when the
On Tue, Mar 9, 2010 at 1:14 PM, Sylvain Lebresne wrote:
> I've inserted 1000 rows of 100 columns each (python stress.py -t 2 -n
> 1000 -c 100 -i 5)
> If I read, I get roughly the same number of rows whether I read the
> whole row (python stress.py -t 10 -n 1000 -o read -r -c 100) or only the
> first column
Alright,
What I'm observing shows up better with bigger columns, so I've slightly
modified the stress.py test so that it inserts columns of 50K bytes (I attach
the modified stress.py for info, but it really just reads 5 bytes from
/dev/null and uses that as data).
I also added a sleep to the insert
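As a rough idea of the kind of tweak described (the actual attachment isn't
reproduced here; the client call and parameter names are hypothetical):

    import time

    COLUMN_SIZE = 50 * 1024      # 50K bytes per column value
    DATA = 'x' * COLUMN_SIZE     # the payload content doesn't matter

    def insert_row(client, key, num_columns, pause=0.01):
        # Insert num_columns fixed-size columns under one key, with the
        # added sleep between inserts to throttle the write load.
        for i in range(num_columns):
            client.insert(key, 'col%d' % i, DATA)  # hypothetical API
            time.sleep(pause)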
On Tue, Mar 9, 2010 at 8:31 AM, Sylvain Lebresne wrote:
> Well, unless I'm mistaken, that's the same in my example, since in both
> cases I give stress.py the option '-c 1', which tells it to retrieve only
> one column each time, even in the case where I have 100 columns per row.
Oh.
Why would you
On Tue, Mar 9, 2010 at 7:15 AM, Sylvain Lebresne wrote:
> 1) stress.py -t 10 -o read -n 5000 -c 1 -r
> 2) stress.py -t 10 -o read -n 50 -c 1 -r
>
> In case 1) I get around 200 reads/second and that's pretty stable. The
> disk is spinning like crazy (~25% io_wait), very little cpu or memory
On Tue, Mar 9, 2010 at 2:52 PM, Jonathan Ellis wrote:
> By "reads" do you mean what stress.py counts (rows) or rows * columns?
> If it is rows, then you are still actually reading more columns/s in
> case 2.
Well, unless I'm mistaken, that's the same in my example, since in both
cases I give stress.py the option '-c 1', which tells it to retrieve only
one column each time, even in the case where I have 100 columns per row.
In my experience #2 will work well up to a point where it will trigger
a limitation of cassandra (slated to be resolved in 0.7 \o/) where all
of the columns under a given key must be able to fit into memory. For
things like indexes of data I have opted to shard the keys for really
large data sets
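One way to picture that sharding idea (a hypothetical sketch, not
necessarily the poster's actual scheme): spread the columns of one logical
row over a fixed number of physical rows, so no single row's columns have
to fit in memory:

    import zlib

    NUM_SHARDS = 16

    def shard_key(logical_key, column_name):
        # crc32 is stable across processes, so a given column always
        # lands on the same physical shard row.
        shard = zlib.crc32(column_name.encode('utf-8')) % NUM_SHARDS
        return '%s:%d' % (logical_key, shard)

Reading one column means one lookup at shard_key(key, col); scanning the
whole logical row means reading all NUM_SHARDS physical rows.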
Hello,
I've done some tests and it seems that somehow having more rows with fewer
columns is better than having fewer rows with more columns, at least as far
as read performance is concerned.
Using stress.py, on a quad core 2.27GHz with 4GB RAM and the out-of-the-box
cassandra configuration, I