So basically, by moving towards putting 1000s of CFs into one physical
CF, our map/reduce performance is going to tank, correct?  From what I
remember, we could run map/reduce against a single CF, but by stuffing
1000s of virtual CFs into one CF, a map/reduce job will have to read the
rows of all 999 virtual CFs we don't want just to process the ONE CF we do.
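
To make the concern concrete, here is a minimal sketch of what a job
restricted to one virtual CF would have to do, assuming the standard
ColumnFamilyInputFormat mapper signature and a hypothetical "tenant1:"
key prefix.  Every row in the shared CF still flows through the mapper;
the ones we don't want are simply dropped client-side:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;

import org.apache.cassandra.db.IColumn;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch only: a mapper over the shared physical CF that filters rows by
// key prefix.  "tenant1:" is a made-up virtual-CF prefix.  The filter runs
// client-side, so rows belonging to the other ~999 virtual CFs are still
// read off disk and handed to the mapper before being thrown away.
public class OneVirtualCfMapper
        extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, LongWritable> {

    private static final byte[] PREFIX = "tenant1:".getBytes(StandardCharsets.UTF_8);

    private static boolean hasPrefix(ByteBuffer key) {
        if (key.remaining() < PREFIX.length) return false;
        for (int i = 0; i < PREFIX.length; i++) {
            if (key.get(key.position() + i) != PREFIX[i]) return false;
        }
        return true;
    }

    @Override
    protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns,
                       Context context) throws IOException, InterruptedException {
        if (!hasPrefix(key)) {
            return; // we paid the read cost and got nothing for it
        }
        // ... real per-row work for the one virtual CF goes here ...
    }
}

In other words, the selectivity of the job doesn't reduce the I/O at all;
it only reduces what the mapper emits.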

Map/reduce gets VERY, VERY slow when it has to read in 1000 times more rows :( :(.

Is this correct?  This really sounds like highly undesirable behavior.
There needs to be a way for people with 1000s of CFs to also run
map/reduce on any one of them.  Running map/reduce over 1000 times the
number of rows will be roughly 1000 times slower... and of course, my most
recent projections say we will most likely end up with 20,000 tables.  On
our last test load we ended up with 8k+ CFs, and since I kept two other
keyspaces around, Cassandra started getting really, REALLY slow once we
got to 15k+ CFs in the system, though I didn't look into why.

I don't mind having 1000s of virtual CFs in ONE CF, BUT I need to be able
to map/reduce "just" one virtual CF!!!!!  Ugh.

Thanks,
Dean

On 10/1/12 3:38 PM, "Ben Hood" <0x6e6...@gmail.com> wrote:

>On Mon, Oct 1, 2012 at 9:38 PM, Brian O'Neill <b...@alumni.brown.edu>
>wrote:
>> It's just a convenient way of prefixing:
>> 
>>http://hector-client.github.com/hector/build/html/content/virtual_keyspaces.html
>
>So given that it is possible to use a CF per tenant, should we assume
>that, at sufficient scale, there is less overhead in prefixing keys than
>in managing multiple CFs?
>
>Ben
