Re: too many full gc in one node of the cluster

2015-11-13 Thread Jason Wee
Used to manage/develop for Cassandra 1.0.8 for quite some time. Although 1.0 was rock solid, we encountered various problems as load per node grew beyond 500 GB. Upgrading is one solution, though it may not be the solution for you, but I strongly recommend you upgrade to 1.1 or 1.2. we

Re: Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Carlos Alonso
Maybe schema disagreement? Run nodetool describecluster to find out. Carlos Alonso | Software Engineer | @calonso On 13 November 2015 at 11:14, Rajesh Radhakrishnan < rajesh.radhakrish...@phe.gov.uk> wrote: > > Hi, > > I am using Cassandra 2.1.5 in a cluster of two
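A minimal sketch of the check suggested above, assuming the output of nodetool describecluster has been captured as text. The sample output, the host addresses, and the helper names schema_versions and schema_agrees are illustrative assumptions; the exact layout may differ between Cassandra versions.

```python
import re

# Assumed sample of `nodetool describecluster` output (illustrative only;
# the exact layout can differ between Cassandra versions).
SAMPLE = """\
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        86afa796-d883-3932-aa73-6b017cef0d19: [10.0.0.1, 10.0.0.2]
"""

# A schema-version line: a UUID (or the literal UNREACHABLE) and a host list.
VERSION_LINE = re.compile(
    r"^(?P<version>[0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(?P<hosts>.*)\]$"
)


def schema_versions(text):
    """Map each version listed under 'Schema versions:' to its hosts."""
    versions = {}
    in_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Schema versions:"):
            in_section = True
            continue
        if not in_section:
            continue
        match = VERSION_LINE.match(stripped)
        if match:
            hosts = [h.strip() for h in match.group("hosts").split(",") if h.strip()]
            versions[match.group("version")] = hosts
    return versions


def schema_agrees(versions):
    """True when the reachable nodes report exactly one schema version."""
    reachable = [v for v in versions if v != "UNREACHABLE"]
    return len(reachable) == 1
```

More than one reachable version in the result would point to a schema disagreement; a single version, as in the sample, would suggest looking elsewhere.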

RE: Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Rajesh Radhakrishnan
Thank you, Carlos, for looking. But when I ran nodetool describecluster, it showed the same schema version for both nodes, so it must be something else. Please help me get past this bottleneck. Thank you. From: Carlos Alonso [i...@mrcalonso.com] Sent: 13 November

Re: too many full gc in one node of the cluster

2015-11-13 Thread Jeff Jirsa
What Kai Wang is hinting at: one of the common tuning problems people face is that they follow the advice in cassandra-env.sh, which says 100M of young gen space (Xmn) per core. Many people find that insufficient; raising that to 30, 40, or 50% of heap size (Xmx) MAY help keep short-lived objects
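To make the arithmetic concrete, here is a small sketch comparing the two sizing rules mentioned in this message. The function name young_gen_mb and the example core and heap figures are hypothetical; this is not how cassandra-env.sh itself computes the value.

```python
def young_gen_mb(cores, heap_mb, heap_fraction=None):
    """Return a candidate young generation size (-Xmn) in MB.

    heap_fraction=None applies the 100 MB-per-core rule of thumb quoted
    above; otherwise the result is the given share of the heap (-Xmx).
    """
    if heap_fraction is None:
        return 100 * cores
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    return int(heap_mb * heap_fraction)


# Hypothetical node: 4 cores, 8 GB heap.
per_core = young_gen_mb(cores=4, heap_mb=8192)                       # 400
half_heap = young_gen_mb(cores=4, heap_mb=8192, heap_fraction=0.5)   # 4096
```

On this hypothetical node the per-core rule gives 400 MB while 50% of the heap gives 4096 MB, which shows how far apart the two starting points can be. Whether the larger value helps depends on the workload, as the message stresses.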

RE: Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Rajesh Radhakrishnan
We have a workaround now! Thank you Laing for the reply. Yes, I agree with your point, but we have a scenario where the columns need to be added at a later stage of the process. We are doing the following: 1. CREATE THE TABLE IF NOT EXISTS 2. INSERT IDS INTO THE TABLE 3. CHECK THE

Re: Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Alex Popescu
I'm glad to hear that you got it working, but I'd suggest trying to answer these questions before moving forward with this solution: 1. Is the set of columns really that dynamic? If not, then define them upfront. There's no weight to empty columns. 2. If the set of columns is really dynamic,

Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-13 Thread Maciek Sakrejda
Any advice on how to proceed here? Sebastian seems to have guessed correctly at the underlying issue, but I'm still not sure how to resolve this given what I see in the data directory and the catalogs. On Wed, Nov 11, 2015 at 12:15 PM, Maciek Sakrejda wrote: > On Wed, Nov 11,

Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-13 Thread Sebastian Estevez
I think you're just missing the steps in *Bold*: If THERE ARE TWO OR MORE DIRECTORIES: 4)Identify from schema_column_families which cf ID is the "new" one (currently in use). cqlsh -e "select * from system.schema_column_families"|grep *5) Move the data from the "old" one to the "new" one and

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Peddi, Praveen
Lol... We are running on AWS servers and no, clocks are not 20 minutes off. From: Jon Haddad Reply-To: "user@cassandra.apache.org" Date:

Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-13 Thread Robert Coli
On Fri, Nov 13, 2015 at 12:31 PM, Maciek Sakrejda wrote: > On Fri, Nov 13, 2015 at 9:56 AM, Sebastian Estevez < > sebastian.este...@datastax.com> wrote: > >> I think you're just missing the steps in *Bold*: >> >> Thanks, but I wasn't clear on what to do if the "new" directory

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Jon Haddad
Well, yeah, it's still possible. Are your clocks 20 minutes off? > On Nov 13, 2015, at 1:20 PM, Peddi, Praveen wrote: > > Hi Jon, > Thanks for your response. > Clock skew is not a possibility because the row that reappeared is 20 > mins older than the one that got deleted

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Robert Coli
On Fri, Nov 13, 2015 at 1:09 PM, Peddi, Praveen wrote: > We are seeing a scenario where some of the rows in the table reappear > even after they are deleted. We have seen this in Prod 3 times in the last > week and *coincidentally all 3 times on the same partition*. We have >

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Peddi, Praveen
Hi Rob, We do not currently run repairs because we know our deployment time for each Cassandra node is very short. I do understand we have to run repairs, but would repair be in the picture here when no nodes in the cluster were down for the last 2 weeks? Thanks Praveen From: Robert Coli

Deletes Reappeared even when nodes are not down

2015-11-13 Thread Peddi, Praveen
Hi, We are using Cassandra 2.0.8, with a replication factor of 3. We are seeing a scenario where some of the rows in the table reappear even after they are deleted. We have seen this in Prod 3 times in the last week and coincidentally all 3 times on the same partition. We have confirmed that nodes

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Jon Haddad
It was in AWS that I had clocks off by 30 seconds or so. Virtualization is a nightmare for clocks. As long as you've checked, we can move on to other possibilities :) > On Nov 13, 2015, at 1:28 PM, Peddi, Praveen wrote: > > Lol… > We are running on AWS servers and no
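For readers wondering why clock skew keeps coming up in this thread: Cassandra reconciles conflicting versions of a cell by write timestamp, so a delete stamped earlier than a write loses to it even if the delete was issued later in real time. The sketch below is a toy model of that rule, not Cassandra's implementation; the Cell type, the function names, and the timestamps are hypothetical, and equal-timestamp ties are not modeled.

```python
from collections import namedtuple

# Toy cell version: write timestamp in microseconds, a value, and a flag
# marking whether this version is a delete (tombstone).
Cell = namedtuple("Cell", ["timestamp_us", "value", "is_delete"])


def reconcile(versions):
    """Return the surviving version: the one with the highest timestamp."""
    return max(versions, key=lambda c: c.timestamp_us)


def visible_value(versions):
    """Value a reader would see, or None if the winning version is a delete."""
    winner = reconcile(versions)
    return None if winner.is_delete else winner.value


# A write stamped by a machine whose clock runs 30 seconds fast...
write = Cell(timestamp_us=1_030_000_000, value="row", is_delete=False)
# ...followed in real time by a delete stamped by a machine with a correct clock.
delete = Cell(timestamp_us=1_010_000_000, value=None, is_delete=True)

result = visible_value([write, delete])  # the write wins, so the row "reappears"
```

In this toy scenario the row stays visible after the delete because the skewed write carries the larger timestamp, which is the failure mode being ruled out in the thread.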

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Robert Coli
On Fri, Nov 13, 2015 at 1:47 PM, Peddi, Praveen wrote: > We do not currently run repairs because we know our deployment time for > each cassandra node is very short. I do understand we have to run repairs > but would repair be in the picture here when no nodes in the cluster

Re: UnknownColumnFamily exception / schema inconsistencies

2015-11-13 Thread Maciek Sakrejda
On Fri, Nov 13, 2015 at 9:56 AM, Sebastian Estevez < sebastian.este...@datastax.com> wrote: > I think you're just missing the steps in *Bold*: > > Thanks, but I wasn't clear on what to do if the "new" directory does not exist at all on some of the nodes (only the old). Can I just rename the "old"

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Jonathan Haddad
You could have dropped mutations without downtime. Check nodetool tpstats. On Fri, Nov 13, 2015 at 2:48 PM Peddi, Praveen wrote: > Hi Rob, > We do not currently run repairs because we know our deployment time for > each cassandra node is very short. I do understand we have to
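A minimal sketch of how the dropped-message counts could be pulled out of captured nodetool tpstats output. The sample text, its numbers, and the helper name dropped_counts are assumptions for illustration; the real layout may vary between versions.

```python
# Assumed sample of `nodetool tpstats` output (illustrative only; numbers
# are made up and the layout may vary between Cassandra versions).
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         123456         0                 0

Message type           Dropped
READ                         0
MUTATION                    42
REQUEST_RESPONSE             0
"""


def dropped_counts(tpstats_output):
    """Return {message type: dropped count} from the 'Message type' table."""
    counts = {}
    in_dropped = False
    for line in tpstats_output.splitlines():
        parts = line.split()
        if parts[:2] == ["Message", "type"]:
            # Everything after this header row belongs to the dropped table.
            in_dropped = True
            continue
        if in_dropped and len(parts) >= 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


mutations_dropped = dropped_counts(SAMPLE).get("MUTATION", 0)
```

A non-zero MUTATION count on any node would mean some writes or deletes never reached that replica, which is the situation repair is meant to correct.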

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Jon Haddad
Any chance your clocks are off? > On Nov 13, 2015, at 1:09 PM, Peddi, Praveen wrote: > > Hi, > We are using Cassandra 2.0.8, with a replication factor of 3. > > We are seeing a scenario where some of the rows in the table reappear even > after they are deleted. We have seen

Re: Deletes Reappeared even when nodes are not down

2015-11-13 Thread Peddi, Praveen
Hi Jon, Thanks for your response. Clock skew is not a possibility because the row that reappeared is 20 minutes older than the one that got deleted (based on the last modified date field). We are definitely not talking about a few millis here. Praveen From: Jon Haddad

Re: too many full gc in one node of the cluster

2015-11-13 Thread Robert Coli
On Thu, Nov 12, 2015 at 10:35 PM, Shuo Chen wrote: > We have a small cassandra cluster with 4 nodes for production. All the > nodes have similar hardware configuration and similar data load. The C* > version is 1.0.7 (pretty old) > > One of the nodes has much higher cpu

Re: too many full gc in one node of the cluster

2015-11-13 Thread Shuo Chen
The 4th node has even less data than the others. I also checked the connection counts on port 9160 across the nodes, and this node has fewer connections too. So it is really strange... On Sat, Nov 14, 2015 at 5:50 AM, Robert Coli wrote: > On Thu, Nov 12, 2015 at 10:35

Re: Spark on cassandra

2015-11-13 Thread Ravi
I did a join on a single big table and it's working fine using the code you showed below. Can we do a table join on a non-partition-key column, or on a column that is not part of the primary key? Thanks, Ravi On Thu, Nov 12, 2015 at 5:41 AM DuyHai Doan wrote: > Hello Prem > > I believe it's better to ask your

Re: Timeout with static column

2015-11-13 Thread Brice Figureau
On Thu, 2015-11-12 at 11:13 -0600, Tyler Hobbs wrote: > Can you try to isolate this to a reproducible test case or script and > open a jira ticket at https://issues.apache.org/jira/browse/CASSANDRA? I just created: https://issues.apache.org/jira/browse/CASSANDRA-10698 It's unfortunately not a

Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Rajesh Radhakrishnan
Hi, I am using Cassandra 2.1.5 in a cluster of two nodes (running CentOS) and using the Python driver to connect to Cassandra. My Python code snippet is shown here: #--- import time, os,

Re: too many full gc in one node of the cluster

2015-11-13 Thread Kai Wang
What's the size of the young generation (-Xmn)? On Fri, Nov 13, 2015 at 6:38 AM, Jason Wee wrote: > Used to manage/develop for cassandra 1.0.8 for quite some time. Although > 1.0 was rock solid, we encountered various problems as load per node > grew beyond 500gb.

Re: Getting code=2200 [Invalid query] message=Invalid column name ... while executing ALTER statement

2015-11-13 Thread Laing, Michael
Dynamic schema changes are generally a bad idea, especially if they are rapid. You should rethink your approach. On Fri, Nov 13, 2015 at 7:20 AM, Rajesh Radhakrishnan < rajesh.radhakrish...@phe.gov.uk> wrote: > > Thank you Carlos for looking. > But when I ran the nodetool describecluster. > It