Good Day Everyone,

I am very happy with the (almost) linear scalability offered by C*. We had
a lot of problems with RDBMS.

However, I have heard that C* has a practical limit on the number of column
families (CFs) that can be created in a single cluster, the reason being
that each CF takes up 1-2 MB on the JVM heap.

In our use case, we have 10,000+ CFs and we want to support multi-tenancy
(i.e. 10,000 * number of tenants CFs in total).
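
As a rough sanity check on the numbers above, here is a back-of-the-envelope
sketch in Python. It assumes the 1-2 MB per CF figure I heard is accurate and
that the overhead simply adds up per CF; the function name and the tenant
count are only illustrative.

```python
def cf_heap_overhead_gb(cfs_per_tenant, tenants=1, mb_per_cf=(1, 2)):
    """Return a (low, high) estimate of the JVM heap used by CF overhead, in GB.

    Assumes a fixed per-CF cost (1-2 MB, as quoted above) that adds up
    linearly. This is a rough estimate, not a measured figure.
    """
    total_cfs = cfs_per_tenant * tenants
    low_mb, high_mb = mb_per_cf
    return (total_cfs * low_mb / 1024, total_cfs * high_mb / 1024)


# 10,000 CFs for a single tenant
low, high = cf_heap_overhead_gb(10_000)
print(f"{low:.1f}-{high:.1f} GB")  # prints "9.8-19.5 GB"
```

If these assumptions hold, even a single tenant would need more heap for CF
overhead alone than the 8 GB JVM we are planning.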

We are new to C* and come from an RDBMS background, so I would like your
advice on how to tackle this scenario.

Our plan is to use the off-heap memtable approach:
http://www.datastax.com/dev/blog/off-heap-memtables-in-Cassandra-2-1

Each node in the cluster has the following configuration:
16 GB machine (8 GB Cassandra JVM + 2 GB system + 6 GB off-heap)
IMO, this should be able to support 1000 CFs with little or no impact on
performance and startup time.

We plan to handle multi-tenancy using a separate keyspace per tenant (a
solution I found on the web).

Using this approach, we would need 10 clusters to do the job (10,000 CFs at
about 1000 CFs per cluster). We are actually worried about the cost.
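
To show how I arrive at the cluster count, and how it grows with tenants,
here is a small Python sketch. It assumes my estimate of 1000 CFs per cluster
is right (which is one of the things I am asking about); the function name
and the tenant counts are only illustrative.

```python
def clusters_needed(cfs_per_tenant, tenants, cfs_per_cluster=1000):
    """Number of clusters needed if each cluster holds at most cfs_per_cluster CFs.

    cfs_per_cluster=1000 is my own unverified estimate for a 16 GB node.
    """
    total_cfs = cfs_per_tenant * tenants
    # Integer ceiling division: round up to a whole number of clusters.
    return -(-total_cfs // cfs_per_cluster)


for tenants in (1, 5, 20):  # hypothetical tenant counts
    print(tenants, "tenant(s):", clusters_needed(10_000, tenants), "clusters")
```

With one keyspace per tenant, the figure of 10 clusters only covers a single
tenant, which is why the cost worries us.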

Can you please help us evaluate this strategy? I would like to hear the
community's opinion on this.

My major concerns are:

1. Is the off-heap strategy safe, and is my assumption that a 16 GB node can
support 1000 CFs right?

2. Can we use multiple keyspaces to solve multi-tenancy? IMO, the total
number of column families still increases when we use multiple keyspaces.

3. I understand the complexity of using multiple clusters for a single
application: the code base will become tightly coupled with the
infrastructure. Is this the right approach?

Any suggestions are appreciated.

Thanks,
Arun
