Re: Cassandra table limitation

2016-04-06 Thread jason zhao yang
Hi, thanks. The schema is different. Putting a tenant ID as the first partition key will make the Spark logic more complex (filtering is needed in search-all). > There's also the issue of lots of memtables flushing to disk during commit log rotations. Can be problematic. If this is the case, I think

Re: Is there a component of Cassandra that queries 'system.local' continuously over the network?

2016-04-06 Thread Sotirios Delimanolis
It was the driver after all. The C# driver (and I'm guessing others) queries this table as part of its heartbeat for idle connections. We have a lot of clients. This adds up. I don't believe this is the cause of the increasing network traffic. On Wednesday, April 6, 2016 2:22 PM, Sotirios

Is there a component of Cassandra that queries 'system.local' continuously over the network?

2016-04-06 Thread Sotirios Delimanolis
Hey, I'm investigating an issue where the network traffic on a Cassandra 2.1 node increases over time, regardless of the load our clients are under. I tried enabling TRACE logging for org.apache.cassandra.transport.Message and got bombarded with logs like these: DEBUG [SharedPool-Worker-2]

Re: Cassandra table limitation

2016-04-06 Thread Jonathan Haddad
There's also the issue of lots of memtables flushing to disk during commit log rotations. Can be problematic. On Wed, Apr 6, 2016 at 2:08 PM Michael Penick wrote: > Are the tenants using the same schema? If so, you might consider using the > tenant's ID as part of

Re: Cassandra table limitation

2016-04-06 Thread Michael Penick
Are the tenants using the same schema? If so, you might consider using the tenant's ID as part of the primary key for the tables they have in common. If they're all using different, largish schemas I'm not sure that Cassandra is well suited to that type of multi-tenancy. There's the per overhead

Re: nodetool drain running for days

2016-04-06 Thread Jeff Jirsa
Drain should not run for days. If it were me, I’d be: (1) checking for ‘DRAINED’ in the server logs; (2) running ‘nodetool flush’ just to explicitly flush the commitlog/memtables (generally useful before doing drain, too, as it can be somewhat race-y); (3) explicitly killing Cassandra following the flush – drain

RE: Cassandra Single Node Setup Questions

2016-04-06 Thread Paco Trujillo
The fact that there is one single DC does not mean that you do not need multiple nodes. Without multiple nodes you do not have redundancy (if the node fails, you lose the database) and you cannot scale the cluster by adding more nodes if the number of users and/or rows grows, unless you have

Re: Cassandra Single Node Setup Questions

2016-04-06 Thread Jack Krupansky
There are two distinct meanings for the term DC - physical data center and workload isolation. I presume the former was intended here. In any case, replication is valid for both physically separate data centers and within a single data center. The latter being useful when a single server machine

RE: Cassandra Single Node Setup Questions

2016-04-06 Thread Paco Trujillo
When we started using Cassandra in our company, we decided to use a single-node Cassandra cluster as a PoC. Everything was fine until we really needed the power of a Cassandra cluster, and then our data models were not appropriate for a cluster with multiple nodes because of redundancy, data access

RE: Cassandra Single Node Setup Questions

2016-04-06 Thread Bhupendra Baraiya
We have around 20 million rows and around 200 concurrent users. The reason we want a single node is that we have only a single DC; I believe if there is only one DC there is no question of keeping multiple nodes. The main reason we want to migrate to Cassandra is that we have a denormalized data structure in

Cassandra Single Node Setup Questions

2016-04-06 Thread Bhupendra Baraiya
Hi, I had a few questions related to single-node setup in Cassandra. 1) We want to install Cassandra, but multiple nodes are not what we need. Can we proceed with a single node and store millions of rows on that single node only? 2) How many partitions are allowed per node, that is

nodetool drain running for days

2016-04-06 Thread Paco Trujillo
We are having performance problems with our cluster, namely timeouts when repairs or massive deletes are running. One piece of advice I received was to update our Cassandra version from 2.0.17 to 2.2. I am draining one of the nodes to start the upgrade and the drain has now been running for two