Re: 10000+ CF support from Cassandra

2015-06-24 Thread Arun Chaitanya
Any ideas or advice? On Mon, Jun 22, 2015 at 10:55 AM, Arun Chaitanya chaitan64a...@gmail.com wrote: Hello All, Now we settled on the following approach. I want to know if there are any problems that you foresee in the production environment. Our Approach: Use Off Heap Memory

Re: 10000+ CF support from Cassandra

2015-06-24 Thread Jack Krupansky
By entries, do you mean rows or columns? Please clarify how many columns each of your tables has, and how many rows you are populating for each table. In case I didn't make it clear earlier, limit yourself to low hundreds (like 250) of tables and you should be fine. Thousands of tables is a clear

Re: 10000+ CF support from Cassandra

2015-06-24 Thread Arun Chaitanya
Hi Jack, By entries, I meant rows. Each column family has about 200 columns. Disabling of slab allocation is an expert-only feature - its use is generally an anti-pattern, not recommended. I understand this and have seen this recommendation in several places. I want to understand the

Re: 10000+ CF support from Cassandra

2015-06-24 Thread Jack Krupansky
I would say that it's mostly a performance issue, tied to memory management, but the main problem is that a large number of tables invites a whole host of cluster management difficulties that require... expert attention, which then means you need an expert to maintain and enhance it. Cassandra

Re: 10000+ CF support from Cassandra

2015-06-21 Thread Arun Chaitanya
Hello All, Now we settled on the following approach. I want to know if there are any problems that you foresee in the production environment. Our Approach: Use Off Heap Memory Modifications to default cassandra.yaml and cassandra-env.sh

Re: 10000+ CF support from Cassandra

2015-06-01 Thread Jonathan Haddad
Sorry for this naive question but how important is this tuning? Can this have a huge impact in production? Massive. Here's a graph of when we did some JVM tuning at my previous company: http://33.media.tumblr.com/5d0efca7288dc969c1ac4fc3d36e0151/tumblr_inline_mzvj254quj1rd24f4.png About an

Re: 10000+ CF support from Cassandra

2015-06-01 Thread Arun Chaitanya
Thanks Jon and Jack, I strongly advise against this approach. Jon, I think so too. But do you actually foresee any problems with this approach? I can think of a few. [I want to evaluate if we can live with this problem] - No more CQL. - No data types, everything needs to be a blob. -

Re: 10000+ CF support from Cassandra

2015-06-01 Thread graham sanderson
I strongly advise against this approach. Jon, I think so too. But do you actually foresee any problems with this approach? I can think of a few. [I want to evaluate if we can live with this problem] Just to be clear, I’m not saying this is a great approach, I AM saying that it may be better

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Jack Krupansky
How big is each of the tables - are they all fairly small or fairly large? Small as in no more than thousands of rows or large as in tens of millions or hundreds of millions of rows? Small tables are not ideal for a Cassandra cluster since the rows would be spread out across the nodes, even

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Arun Chaitanya
Hello Jack, Column families? As opposed to tables? Are you using Thrift instead of CQL3? You should be focusing on the latter, not the former. We have an ORM developed in our company, which maps each DTO to a column family. So, we have many column families. We are using CQL3. But either way,

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Graham Sanderson
Depending on your use case and data types (for example if you can have a minimally nested JSON representation of the objects; then you could go with a common map<string,string> representation where keys are top-level object fields and values are valid JSON literals as strings; e.g. unquoted
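
To make the suggestion above concrete, here is a minimal sketch of that encoding, assuming each object can be flattened to a dict of top-level fields. The function names and the sample object are hypothetical and not taken from the thread; the resulting string map is what would be stored in a single map<string,string> column.

```python
import json
from typing import Any, Dict


def to_string_map(obj: Dict[str, Any]) -> Dict[str, str]:
    """Encode each top-level field as a JSON literal held in a string."""
    return {key: json.dumps(value) for key, value in obj.items()}


def from_string_map(row: Dict[str, str]) -> Dict[str, Any]:
    """Decode a string map back into the original top-level fields."""
    return {key: json.loads(text) for key, text in row.items()}


# Hypothetical object; any JSON-serializable dict would work the same way.
dto = {"id": 42, "name": "widget", "tags": ["a", "b"], "active": True}
row = to_string_map(dto)
restored = from_string_map(row)
```

With this encoding every value is a string on the storage side, so type information lives only in the JSON literal itself and has to be decoded by the application.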

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Jonathan Haddad
While Graham's suggestion will let you collapse a bunch of tables into a single one, it'll likely result in so many other problems it won't be worth the effort. I strongly advise against this approach. First off, different workloads need different tuning. Compaction strategies,

Re: 10000+ CF support from Cassandra

2015-05-27 Thread Jack Krupansky
Scalability of Cassandra refers primarily to number of rows and number of nodes - to add more data, add more nodes. Column families? As opposed to tables? Are you using Thrift instead of CQL3? You should be focusing on the latter, not the former. But either way, the general guidance is that

Re: 10000+ CF support from Cassandra

2015-05-26 Thread graham sanderson
Are the CFs different, or all the same schema? Are you contractually obligated to actually separate data into separate CFs? It seems like you’d have a much simpler time if you could use part of the partition key to separate data. Note also, I don’t know what disks you are using, but disk

Re: 10000+ CF support from Cassandra

2015-05-26 Thread Arun Chaitanya
Hello Graham, Are the CFs different, or all the same schema? The column families are different. Maybe with better data modelling, we can combine a few of them. Are you contractually obligated to actually separate data into separate CFs? No. It's just that we have several subsystems (around