Re: Records in table after deleting sstable manually
Thanks Jeff, appreciate your reply. As you said, it looks like there were entries in the commitlogs, and when Cassandra was brought up after deleting the sstables, the data from the commitlog was replayed. Maybe next time I will let the replay happen after deleting the sstables and then truncate the table using CQL; this will ensure the table is empty. I could not truncate from CQL in the first place because one of the nodes was not up.

Regards,
Kunal

On Tue, Aug 11, 2020 at 8:45 AM Jeff Jirsa wrote:

> The data probably came from either hints or commitlog replay.
>
> If you use `truncate` from CQL, it solves both of those concerns.
>
> On Tue, Aug 11, 2020 at 8:42 AM Kunal wrote:
>
>> Hi,
>>
>> We have a 3-node Cassandra cluster, and one of the tables grew big: around
>> 2 GB when it was supposed to be a few MB. During nodetool repair, one of
>> the Cassandra nodes went down. Even after multiple restarts, it kept going
>> down a few minutes after coming up. We decided to truncate the table by
>> removing the corresponding sstables from disk, since truncating a table
>> from cqlsh needs all the nodes to be up, which was not the case in our
>> environment. After deleting the sstables from disk on all 3 nodes, we
>> brought up Cassandra, and all the nodes came up fine with no visible
>> issues. However, we observed that the sstable size is ~100 MB, which was
>> a bit strange, and the table has old rows (around 20K) from a previous
>> date; before the removal it had 500K rows. Not sure how the table still
>> has old records and the sstable is ~100 MB even after removing the
>> sstables.
>> Any ideas? Any help understanding this would be appreciated.
>>
>> Regards,
>> Kunal

--
Regards,
Kunal Vaid
Records in table after deleting sstable manually
Hi,

We have a 3-node Cassandra cluster, and one of the tables grew big: around 2 GB when it was supposed to be a few MB. During nodetool repair, one of the Cassandra nodes went down. Even after multiple restarts, it kept going down a few minutes after coming up. We decided to truncate the table by removing the corresponding sstables from disk, since truncating a table from cqlsh needs all the nodes to be up, which was not the case in our environment. After deleting the sstables from disk on all 3 nodes, we brought up Cassandra, and all the nodes came up fine with no visible issues. However, we observed that the sstable size is ~100 MB, which was a bit strange, and the table has old rows (around 20K) from a previous date; before the removal it had 500K rows. Not sure how the table still has old records and the sstable is ~100 MB even after removing the sstables.

Any ideas? Any help understanding this would be appreciated.

Regards,
Kunal
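For future reference, a minimal sketch of a sequence that avoids commitlog replay resurrecting deleted data. The paths and the keyspace/table names here are assumptions, not from the thread; the key point is that `nodetool drain` flushes memtables and finalizes the commitlog, so a subsequent restart has nothing to replay:

```shell
# On each node, before touching sstables on disk (names are hypothetical):
nodetool drain                  # flush memtables, finalize the commitlog
sudo service cassandra stop
rm -f /var/lib/cassandra/data/my_keyspace/my_table-*/*.db   # assumed default data dir
sudo service cassandra start

# Once all nodes are up again, truncate via CQL, which also invalidates
# any pending hints for the table:
cqlsh -e "TRUNCATE my_keyspace.my_table;"
```

As Jeff notes, the CQL `TRUNCATE` is the operation that addresses both hints and commitlog replay; the manual sstable deletion alone covers neither.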
Re: Disabling Swap for Cassandra
Thanks for the responses. Appreciate it.

@Dor, so you are saying that if we add "memlock unlimited" in limits.conf, the entire heap (Xms = Xmx) can be locked at startup? Will this be applied to all Java processes? We have a couple of Java programs running under the same owner.

Thanks
Kunal

On Thu, Apr 16, 2020 at 4:31 PM Dor Laor wrote:

> It is good to configure swap for the OS but exempt Cassandra
> from swapping. Why is it good? Because you never know the
> memory utilization of additional agents and processes that you or
> other admins will run on your server.
>
> So do configure a swap partition.
> You can control the eagerness of the kernel with the swappiness
> sysctl parameter. You can even control it per cgroup:
>
> https://askubuntu.com/questions/967588/how-can-i-prevent-certain-process-from-being-swapped
>
> You should make sure Cassandra locks its memory so the kernel
> won't choose its memory to be swapped out (since that would kill
> your latency). You do it with mlock. Read more at:
>
> https://stackoverflow.com/questions/578137/can-i-tell-linux-not-to-swap-out-a-particular-processes-memory
>
> The scylla /dist/common/limits.d/scylladb.com looks like this:
> scylla - core unlimited
> scylla - memlock unlimited
> scylla - nofile 20
> scylla - as unlimited
> scylla - nproc 8096
>
> On Thu, Apr 16, 2020 at 3:57 PM Nitan Kainth wrote:
> >
> > Swap is controlled by the OS, which will use it when running short of
> > memory. I don't think you can disable it at the Cassandra level.
> >
> > Regards,
> > Nitan
> > Cell: 510 449 9629
> >
> > On Apr 16, 2020, at 5:50 PM, Kunal wrote:
> > >
> > > Hello,
> > >
> > > I need some suggestions from you all. I am new to Cassandra and was
> > > reading up on Cassandra best practices. One document mentioned that
> > > Cassandra should not use swap, as it degrades performance.
> > > My question is: instead of disabling swap system-wide, can we force
> > > Cassandra not to use swap? Some documentation suggests using
> > > memory_locking_policy in cassandra.yaml.
> > >
> > > How do I check whether our Cassandra already has this parameter set
> > > and still uses swap? Is there any way I can check this? I already
> > > checked cassandra.yaml and don't see this parameter. Is there any
> > > other place I can check and confirm?
> > >
> > > Also, can I set the memlock parameter to unlimited (64 kB default),
> > > so the entire heap (Xms = Xmx) can be locked at node startup? Will
> > > that help?
> > >
> > > Or if you have any other suggestions, please let me know.
> > >
> > > Regards,
> > > Kunal

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org

--
Regards,
Kunal Vaid
Disabling Swap for Cassandra
Hello,

I need some suggestions from you all. I am new to Cassandra and was reading up on Cassandra best practices. One document mentioned that Cassandra should not use swap, as it degrades performance. My question is: instead of disabling swap system-wide, can we force Cassandra not to use swap? Some documentation suggests using memory_locking_policy in cassandra.yaml.

How do I check whether our Cassandra already has this parameter set and still uses swap? Is there any way I can check this? I already checked cassandra.yaml and don't see this parameter. Is there any other place I can check and confirm?

Also, can I set the memlock parameter to unlimited (64 kB by default), so the entire heap (Xms = Xmx) can be locked at node startup? Will that help?

Or if you have any other suggestions, please let me know.

Regards,
Kunal
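The memlock approach discussed in this thread can be sketched as OS configuration. The file paths, the "cassandra" user name, and the exact values are assumptions; limits.conf entries apply per user, so Java programs running under a different user are unaffected, while processes under the same user share the limit:

```
# /etc/security/limits.d/cassandra.conf (assumed path and user)
# With memlock unlimited, Cassandra can mlock its memory via JNA at
# startup so the kernel will not swap it out.
cassandra - memlock unlimited

# /etc/sysctl.d/99-cassandra.conf (assumed path)
# Make the kernel reluctant to swap anything, system-wide:
vm.swappiness = 1
```

These are config fragments, not a complete tuning guide; whether mlock actually succeeds can be confirmed in the Cassandra startup log.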
One of the cassandra nodes is going down.
Hi All,

I am facing a situation in my 3-node Cassandra cluster wherein one of the nodes goes down after around 5-10 minutes. The messages below are seen in the debug.log of the node that goes down:

===
INFO [ScheduledTasks:1] 2019-05-30 14:39:25,179 StatusLogger.java:101 - system_schema.views 2,16
INFO [Service Thread] 2019-05-30 14:39:25,182 StatusLogger.java:101 - system.schema_keyspaces 0,0
INFO [ScheduledTasks:1] 2019-05-30 14:39:25,182 StatusLogger.java:101 - system_schema.functions 2,16
INFO [Service Thread] 2019-05-30 14:39:32,569 StatusLogger.java:101 - system.sstable_activity 280,10053
WARN [GossipTasks:1] 2019-05-30 14:39:32,572 FailureDetector.java:288 - Not marking nodes down due to local pause of 7413014745 > 50
DEBUG [GossipTasks:1] 2019-05-30 14:39:32,578 FailureDetector.java:294 - Still not marking nodes down due to local pause
INFO [ScheduledTasks:1] 2019-05-30 14:39:32,577 StatusLogger.java:101 - virtuoranc.pmcollectionstatus 0,0
INFO [Service Thread] 2019-05-30 14:39:32,578 StatusLogger.java:101 - system.batchlog 0,0
INFO [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 - virtuoranc.snmp_trapdestination 0,0
INFO [Service Thread] 2019-05-30 14:39:32,579 StatusLogger.java:101 - system.schema_columns 0,0
INFO [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 - virtuoranc.auditlog 0,0
INFO [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 - system.hints 0,0
INFO [ScheduledTasks:1] 2019-05-30 14:39:32,580 StatusLogger.java:101 - virtuoranc.jobproperties 0,0
INFO [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 - system.IndexInfo 0,0
=

We tried cleaning this node and starting it, and ran "nodetool repair -full" as well, but the node went down partway through. Also, nodetool commands start taking too long to produce output once Cassandra has been up for 3-4 minutes, and at one point nodetool gives the error below:

$ nodetool tpstats
nodetool: Failed to connect to '127.0.0.1:7199' - SocketTimeoutException: 'Read timed out'.

Can you please let me know what is happening with this node? Any help is appreciated.

Regards,
Kunal Vaid
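The "local pause" warnings in the log above usually indicate long JVM garbage-collection or OS-level stalls. A few commands that can help narrow this down (the log path shown is the package-install default and may differ on your system):

```shell
# GC pauses reported around the failure window:
grep -E "GCInspector|local pause" /var/log/cassandra/debug.log | tail -20

# JVM GC statistics since the last invocation (needs JMX reachable on 7199):
nodetool gcstats

# Thread pools with blocked or pending tasks show where the node is backed up:
nodetool tpstats
```

If the JMX connection itself times out, as in the error above, the node is likely already stalled; the log greps are then the more reliable source.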
Re: Unpair cassandra datacenters
Thanks Sandeep for your reply. Let me try out the steps you suggested; I will let you know. Appreciate your help.

Regards,
Kunal Vaid

On Mon, Apr 22, 2019 at 4:18 PM Sandeep Nethi wrote:

> Hi Kunal,
>
> The simple solution for this case would be as follows:
>
> 1. Run *full repair*.
> 2. Add a firewall rule to block traffic on port 7000 (and 7001 if SSL is
>    enabled) between the two datacenters' nodes.
> 3. Check the status of the Cassandra cluster from both datacenters; each
>    DC must show the other DC's nodes as down after the firewall change.
> 4. Change the replication factor for all keyspaces on each datacenter.
> 5. Start decommissioning nodes from each datacenter (should be removenode
>    in this case).
> 6. Update the seeds list on each datacenter to local datacenter nodes and
>    perform a rolling restart.
>
> Hope this helps. Try this scenario on a non-prod system first.
>
> Thanks,
> Sandeep
>
> On Tue, Apr 23, 2019 at 11:00 AM Kunal wrote:
>
>> Hi Marc,
>>
>> Appreciate your prompt response.
>>
>> Yes, we are starting datacenter B from scratch. We tried a cluster name
>> change on side B and it works, but our requirement says we cannot change
>> the cluster name because during our product's major or patch releases,
>> the scripts expect the cluster name to be the same.
>>
>> On datacenter B, we are changing the seed nodes. On datacenter A, we are
>> changing the seed nodes in cassandra.yaml, but that will be picked up
>> only during a Cassandra restart, and we cannot have downtime for
>> datacenter A; it has to be up all the time.
>>
>> Regards,
>> Kunal Vaid
>>
>> On Mon, Apr 22, 2019 at 3:49 PM Marc Selwan wrote:
>>
>>> Hi Kunal,
>>>
>>> Did you edit the cassandra.yaml file in each data center to remove the
>>> seed nodes? On whichever data center is starting from scratch (I think
>>> it's B in your case), you may want to also change the cluster name.
>>>
>>> Best,
>>> *Marc Selwan | *DataStax *| *PM, Server Team *|* *(925) 413-7079* *|*
>>> Twitter <https://twitter.com/MarcSelwan>
>>>
>>> * Quick links | *DataStax <http://www.datastax.com> *| *Training
>>> <http://www.academy.datastax.com> *| *Documentation
>>> <http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedIntro_r.html>
>>> *| *Downloads <http://www.datastax.com/download>
>>>
>>> On Mon, Apr 22, 2019 at 3:38 PM Kunal wrote:
>>>
>>>> Hi Friends,
>>>>
>>>> I need a little help with unpairing two datacenters.
>>>> We have 2 datacenters (say A and B) with 3 nodes in each datacenter.
>>>> We want to remove the whole of datacenter B (3 nodes) from the other
>>>> one (A); basically, we want to unpair both datacenters and use them
>>>> both individually.
>>>> We are trying this using nodetool decommission, and it removes the 3
>>>> nodes from datacenter B. But when we bring datacenter B back up to use
>>>> it separately from datacenter A, it joins back to datacenter A. We
>>>> noticed in debug.log that nodes from datacenter A keep looking for
>>>> nodes in datacenter B, getting connection-refused errors while the
>>>> nodes of datacenter B are down; but as soon as the nodes come back,
>>>> they join the cluster.
>>>> We don't want nodes from datacenter B to join datacenter A once they
>>>> are decommissioned.
>>>>
>>>> Can you please let me know if I am missing anything?
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>> Kunal Vaid

--
Regards,
Kunal Vaid
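Step 4 of Sandeep's list (changing the replication factor) can be sketched in CQL. The keyspace and datacenter names here are hypothetical placeholders and must match your own schema and snitch configuration:

```
-- On the surviving side of the split, drop the other DC from replication
-- so no writes or hints target it any more ('my_keyspace' and 'DC_A' are
-- placeholders):
ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC_A': 3};

-- Repeat for system_auth and any other keyspaces that use
-- NetworkTopologyStrategy and mention the removed DC.
```

Running this before decommissioning means the removed DC's nodes no longer own any replicas, which is what makes the later removenode/decommission clean.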
Re: Unpair cassandra datacenters
Hi Marc,

Appreciate your prompt response.

Yes, we are starting datacenter B from scratch. We tried a cluster name change on side B and it works, but our requirement says we cannot change the cluster name because during our product's major or patch releases, the scripts expect the cluster name to be the same.

On datacenter B, we are changing the seed nodes. On datacenter A, we are changing the seed nodes in cassandra.yaml, but that will be picked up only during a Cassandra restart, and we cannot have downtime for datacenter A; it has to be up all the time.

Regards,
Kunal Vaid

On Mon, Apr 22, 2019 at 3:49 PM Marc Selwan wrote:

> Hi Kunal,
>
> Did you edit the cassandra.yaml file in each data center to remove the
> seed nodes? On whichever data center is starting from scratch (I think
> it's B in your case), you may want to also change the cluster name.
>
> Best,
> *Marc Selwan | *DataStax *| *PM, Server Team *|* *(925) 413-7079* *|*
> Twitter <https://twitter.com/MarcSelwan>
>
> * Quick links | *DataStax <http://www.datastax.com> *| *Training
> <http://www.academy.datastax.com> *| *Documentation
> <http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedIntro_r.html>
> *| *Downloads <http://www.datastax.com/download>
>
> On Mon, Apr 22, 2019 at 3:38 PM Kunal wrote:
>
>> Hi Friends,
>>
>> I need a little help with unpairing two datacenters.
>> We have 2 datacenters (say A and B) with 3 nodes in each datacenter. We
>> want to remove the whole of datacenter B (3 nodes) from the other one
>> (A); basically, we want to unpair both datacenters and use them both
>> individually.
>> We are trying this using nodetool decommission, and it removes the 3
>> nodes from datacenter B. But when we bring datacenter B back up to use
>> it separately from datacenter A, it joins back to datacenter A. We
>> noticed in debug.log that nodes from datacenter A keep looking for
>> nodes in datacenter B, getting connection-refused errors while the
>> nodes of datacenter B are down; but as soon as the nodes come back,
>> they join the cluster.
>> We don't want nodes from datacenter B to join datacenter A once they
>> are decommissioned.
>>
>> Can you please let me know if I am missing anything?
>>
>> Thanks in advance.
>>
>> Regards,
>> Kunal Vaid

--
Regards,
Kunal Vaid
Unpair cassandra datacenters
Hi Friends,

I need a little help with unpairing two datacenters.
We have 2 datacenters (say A and B) with 3 nodes in each datacenter. We want to remove the whole of datacenter B (3 nodes) from the other one (A); basically, we want to unpair both datacenters and use them both individually.
We are trying this using nodetool decommission, and it removes the 3 nodes from datacenter B. But when we bring datacenter B back up to use it separately from datacenter A, it joins back to datacenter A. We noticed in debug.log that nodes from datacenter A keep looking for nodes in datacenter B, getting connection-refused errors while the nodes of datacenter B are down; but as soon as the nodes come back, they join the cluster.
We don't want nodes from datacenter B to join datacenter A once they are decommissioned.

Can you please let me know if I am missing anything?

Thanks in advance.

Regards,
Kunal Vaid
Re: time tracking for down node for nodetool repair
Thanks everyone for your valuable suggestions. Really appreciate it.

Regards,
Kunal Vaid

On Mon, Apr 8, 2019 at 7:41 PM Nitan Kainth wrote:

> Valid suggestion. Stick to the plan: avoid downtime of a node longer
> than the hinted handoff window, OR increase the window to a larger
> value if you know it is going to take longer than the current setting.
>
> Regards,
> Nitan
> Cell: 510 449 9629
>
> On Apr 8, 2019, at 8:43 PM, Soumya Jena wrote:
>
> Cassandra tracks it, and no new hints will be created once the default
> 3-hour window has passed. However, Cassandra will not automatically
> trigger a repair if your node is down for more than 3 hours. The
> default setting of 3 hours for hints is defined in the cassandra.yaml
> file; look for "max_hint_window_in_ms". It's configurable. Apart from
> periodic repairs, you should start a repair when you bring up a node
> which has missed some writes.
>
> One more thing: if a node has been down for a long time and missed a
> lot of writes, it may sometimes be better to add it back as a fresh
> node rather than adding it and then repairing.
>
> On Mon, Apr 8, 2019 at 4:49 PM Stefan Miklosovic <
> stefan.mikloso...@instaclustr.com> wrote:
>
>> Ah, I see it is the default for hinted handoffs. I was somehow
>> thinking it was a bigger figure, I don't know why :)
>>
>> I would say you should run repairs continuously / periodically so you
>> would not even have to think about it; it should run in the background
>> in a scheduled manner if possible.
>>
>> Regards
>>
>> On Tue, 9 Apr 2019 at 04:19, Kunal wrote:
>> >
>> > Hello everyone,
>> >
>> > I have a 6-node Cassandra cluster, 3 nodes in each datacenter. If
>> > one of the nodes goes down and remains down for more than 3 hours,
>> > I have to run nodetool repair. Just wanted to ask whether Cassandra
>> > automatically tracks the time when one of the nodes goes down, or
>> > do I need to write code to track the time and run a repair when the
>> > node comes back online after 3 hours.
>> >
>> > Thanks in anticipation.
>> >
>> > Regards,
>> > Kunal Vaid

--
Regards,
Kunal Vaid
time tracking for down node for nodetool repair
Hello everyone,

I have a 6-node Cassandra cluster, 3 nodes in each datacenter. If one of the nodes goes down and remains down for more than 3 hours, I have to run nodetool repair. Just wanted to ask whether Cassandra automatically tracks the time when one of the nodes goes down, or do I need to write code to track the time and run a repair when the node comes back online after 3 hours.

Thanks in anticipation.

Regards,
Kunal Vaid
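The window Soumya describes in his reply is a single cassandra.yaml setting; a sketch of the relevant fragment (the value shown is the stock default):

```
# cassandra.yaml -- hints stop being written for a dead node once it has
# been down longer than this window (3 hours = 10800000 ms by default);
# after that, the node needs a repair (or a fresh bootstrap) to catch up.
max_hint_window_in_ms: 10800000
```

Cassandra itself tracks downtime only for the purpose of this hint window; it does not schedule the repair for you.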
Re: Two datacenters with one cassandra node in each datacenter
Hi Dinesh,

We have a very small setup and the size of the data is also very small; the maximum data size is around 2 GB. The latency expectation is around 10-15 ms.

Regards,
Kunal

On Wed, Feb 6, 2019 at 11:27 PM dinesh.jo...@yahoo.com.INVALID wrote:

> You also want to use Cassandra with a minimum of 3 nodes.
>
> Dinesh
>
> On Wednesday, February 6, 2019, 11:26:07 PM PST, dinesh.jo...@yahoo.com <
> dinesh.jo...@yahoo.com> wrote:
>
> Hey Kunal,
>
> Can you add more details about the size of data, read/write throughput,
> what your latency expectations are, etc.? What do you mean by a
> "performance" issue with replication? Without these details it's a bit
> tough to answer your questions.
>
> Dinesh
>
> On Wednesday, February 6, 2019, 3:47:05 PM PST, Kunal <
> kunal.v...@gmail.com> wrote:
>
> Hi All,
>
> I need some recommendations on using two datacenters with one node in
> each datacenter.
>
> In our organization, we are trying to have two Cassandra datacenters
> with only 1 node on each side. From the preliminary investigation, I see
> replication is happening, but I want to know if we can use this
> deployment in production. Will there be any performance issue with
> replication?
>
> We have already set up 2 datacenters with one node in each datacenter,
> and replication is working fine.
>
> Can you please let me know if this kind of setup is recommended for
> production deployment?
> Thanks in anticipation.
>
> Regards,
> Kunal Vaid

--
Regards,
Kunal Vaid
Two datacenters with one cassandra node in each datacenter
Hi All,

I need some recommendations on using two datacenters with one node in each datacenter.

In our organization, we are trying to have two Cassandra datacenters with only 1 node on each side. From the preliminary investigation, I see replication is happening, but I want to know if we can use this deployment in production. Will there be any performance issue with replication?

We have already set up 2 datacenters with one node in each datacenter, and replication is working fine.

Can you please let me know if this kind of setup is recommended for production deployment?
Thanks in anticipation.

Regards,
Kunal Vaid
RE: system.size_estimates - safe to remove sstables?
No, this is a different cluster.

Kunal

On 13-Mar-2018 6:27 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:

Kunal,

Is this the GCE cluster you are speaking of in the "Adding new DC?" thread?

Kenneth Brotman

*From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
*Sent:* Sunday, March 11, 2018 2:18 PM
*To:* user@cassandra.apache.org
*Subject:* Re: system.size_estimates - safe to remove sstables?

Finally got a chance to work on it over the weekend. It worked as advertised. :)
Thanks a lot, Chris.

Kunal

On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

Thanks a lot, Chris. Will try it today/tomorrow and update here.

Thanks,
Kunal

On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

While it's off, you can delete the files in the directory, yeah.

Chris

On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

Hi Chris,

I checked for snapshots and backups - none found. Also, we're not using OpsCenter, Hadoop, Spark or any such tool. So, do you think we can just remove the cf and restart the service?

Thanks,
Kunal

On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

Any chance the space is used by snapshots? What files exist there that are taking up space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>
> Hi all,
>
> I have a 2-node cluster running Cassandra 2.1.18.
> One of the nodes has run out of disk space and died; almost all of it
> shows up as occupied by the size_estimates CF.
> Out of 296 GiB, 288 GiB shows up as consumed by size_estimates in
> 'du -sh' output.
>
> This is while the other node is chugging along, showing only 25 MiB
> consumed by size_estimates (du -sh output).
>
> Any idea why this discrepancy?
> Is it safe to remove the size_estimates sstables from the affected node
> and restart the service?
>
> Thanks,
> Kunal
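The procedure Chris confirms above can be sketched as follows. The data directory is the common package default and the directory suffix is a per-table UUID, so the glob is an assumption; system.size_estimates holds node-local statistics that Cassandra regenerates periodically:

```shell
# With the node stopped, the size_estimates sstables can be deleted:
sudo service cassandra stop
rm -rf /var/lib/cassandra/data/system/size_estimates-*/*
sudo service cassandra start
```

Double-check the resolved path before running the rm; deleting the wrong system table directory is not recoverable.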
Re: [EXTERNAL] RE: Adding new DC?
Yes, that's correct. The customer wants us to migrate the Cassandra setup to their AWS account.

Thanks,
Kunal

On 13 March 2018 at 04:56, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

> I didn't understand something. Are you saying you are using one data
> center on Google and one on Amazon?
>
> Kenneth Brotman
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Monday, March 12, 2018 4:24 PM
> *To:* user@cassandra.apache.org
> *Cc:* Nikhil Soman
> *Subject:* Re: [EXTERNAL] RE: Adding new DC?
>
> On 13 March 2018 at 03:28, Kenneth Brotman <kenbrot...@yahoo.com.invalid>
> wrote:
>
> You can't migrate and upgrade at the same time perhaps, but you could do
> one and then the other so as to end up on the new version. I'm guessing
> it's an error in the yaml file or a port not open. Is there any good
> reason for a production cluster to still be on version 2.1.x?
>
> I'm not trying to migrate AND upgrade at the same time. However, the apt
> repo shows only 2.1.20 as the available version.
>
> This is the output from the new node in AWS:
>
> ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra
> cassandra:
>   Installed: 2.1.20
>   Candidate: 2.1.20
>   Version table:
>  *** 2.1.20 500
>         500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages
>         100 /var/lib/dpkg/status
>
> Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node.
>
> As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in the GCE
> firewall for the public IP of the AWS instance.
>
> I mentioned earlier that there are some differences in the column types,
> for example, date (>= 2.2) vs. timestamp (2.1.x).
> The application has not been updated yet.
> Hence sticking to 2.1.x for now.
>
> And, so far, 2.1.x has been serving the purpose.
>
> Kunal
>
> Kenneth Brotman
>
> *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com]
> *Sent:* Monday, March 12, 2018 11:36 AM
> *To:* user@cassandra.apache.org
> *Subject:* RE: [EXTERNAL] RE: Adding new DC?
>
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
> As to the migration question, I would expect that you may need to put the
> external-facing IP addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>
> How large is the cluster to migrate (# of nodes and size of data)? The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>
> Sean Durity
> lord of the (C*) rings (Staff Systems Engineer - Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
> Hi Kenneth,
>
> Replies inline below.
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
> That version of Cassandra is too far before me, so I'll let others answer.
> I was wondering why you wouldn't want to end up on 3.0.x if you're going
> through all the trouble of migrating anyway?
>
> Application-side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
> Besides, this is a production setup - so we cannot take the risk.
>
> Are both data centers in the same region on AWS? Can you provide the yaml
> file for us to see?
>
> No, they are in different regions - the GCE setup is in us-east while the
> AWS setup is in Asia-south (Mumbai).
>
> Thanks,
> Kunal
>
> Kenneth Brotman
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 2:32 PM
> *To:* user@cassandra.apache.org
> *Subject:* Adding new DC?
>
> Hi all,
>
> We currently have a cluster in GCE for one of the customers.
> They want it to be migrated to AWS.
>
> I have set up one node in AWS to join the cluster by following:
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 04:54, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

> Kunal,
>
> Please provide the following settings from the yaml files you are using:
>
> seeds:

In GCE:
seeds: "10.142.14.27"
In AWS (new node being added):
seeds: "35.196.96.247,35.227.127.245,35.196.241.232"
(these are the public IP addresses of 3 nodes from GCE)

I have verified that I am able to cqlsh from the AWS instance to all 3 IP addresses.

> listen_address:

We use the listen_interface setting instead of listen_address.
In GCE: listen_interface: eth0 (running Ubuntu 14.04 LTS)
In AWS: listen_interface: ens3 (running Ubuntu 16.04 LTS)

> broadcast_address:

I tried setting broadcast_address on one instance in GCE:
broadcast_address: 35.196.96.247
In AWS:
broadcast_address: 13.127.89.251 (this is the public/elastic IP of the node in AWS)

> rpc_address:

Like listen_address, we use rpc_interface.
In GCE: rpc_interface: eth0
In AWS: rpc_interface: ens3

> endpoint_snitch:

In both setups, we currently use GossipingPropertyFileSnitch. The cassandra-rackdc.properties files from both setups:
GCE:
dc=DC1
rack=RAC1
AWS:
dc=DC2
rack=RAC1

> auto_bootstrap:

When the Google Cloud instances started up, we hadn't set this explicitly, so they started off with the default value (auto_bootstrap: true). However, as outlined in the DataStax doc for adding a new DC, I had added 'auto_bootstrap: false' to the Google Cloud instances (without restarting the service, as per the doc).
On the AWS instance, I had added 'auto_bootstrap: false'; the doc says we need to do "nodetool rebuild", hence no automatic bootstrapping. But we haven't gotten to that step yet.

Thanks,
Kunal

> Kenneth Brotman
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Monday, March 12, 2018 4:13 PM
> *To:* user@cassandra.apache.org
> *Cc:* Nikhil Soman
> *Subject:* Re: [EXTERNAL] RE: Adding new DC?
>
> On 13 March 2018 at 00:06, Durity, Sean R <sean_r_dur...@homedepot.com>
> wrote:
>
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
> I'm not trying to upgrade as of now - the first priority is the migration.
> We can look at a version upgrade later on.
>
> As to the migration question, I would expect that you may need to put the
> external-facing IP addresses in several places in the cassandra.yaml file.
> And, yes, it would require a restart. Why is a non-restart more desirable?
> Most Cassandra changes require a restart, but you can do a rolling restart
> and not impact your application. This is fairly normal admin work and
> can/should be automated.
>
> I just tried setting the broadcast_address on one of the instances in GCE
> to its public IP and restarted the service.
> However, it now shows all other nodes (in GCE) as DN in nodetool status
> output, and the other nodes also report this node as DN with its
> internal/private IP address.
>
> I also tried setting the broadcast_rpc_address to the internal/private IP
> address - still the same.
>
> How large is the cluster to migrate (# of nodes and size of data)? The
> preferred method might depend on how much data needs to move. Is any
> application outage acceptable?
>
> No. of nodes: 5
> RF: 3
> Data size (as reported by the load factor in nodetool status output):
> ~30 GB per node
>
> Thanks,
> Kunal
>
> Sean Durity
> lord of the (C*) rings (Staff Systems Engineer - Cassandra)
>
> *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
> *Sent:* Sunday, March 11, 2018 10:20 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: Adding new DC?
>
> Hi Kenneth,
>
> Replies inline below.
>
> On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Hi Kunal,
>
> That version of Cassandra is too far before me, so I'll let others answer.
> I was wondering why you wouldn't want to end up on 3.0.x if you're going
> through all the trouble of migrating anyway?
>
> Application-side constraints - some data types are different between 2.1.x
> and 3.x (for example, date vs. timestamp).
>
> Besides, this is a production setup - so we cannot take the risk.
>
> Are both data centers in the same region on AWS? Can you provide the yaml
> file for us to see?
>
> No, they are in different regions - the GCE setup is in us-east while the
> AWS setup is in Asia-south (Mumbai).
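The settings reported above can be collected into a single cassandra.yaml sketch for the new AWS node. Values are copied from the thread; note that in a real file the seed list lives under seed_provider/parameters rather than as a top-level key:

```
# cassandra.yaml fragment for the new AWS node (DC2), per the thread:
seeds: "35.196.96.247,35.227.127.245,35.196.241.232"   # public IPs of 3 GCE nodes
listen_interface: ens3
broadcast_address: 13.127.89.251        # public/elastic IP of the AWS node
rpc_interface: ens3
endpoint_snitch: GossipingPropertyFileSnitch
auto_bootstrap: false                   # populate via "nodetool rebuild" instead

# cassandra-rackdc.properties on the AWS node:
# dc=DC2
# rack=RAC1
```

For cross-cloud gossip to work, each GCE node would likewise need its broadcast_address set to its own public IP, which is the step the thread shows causing DN status when applied to only one node.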
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 03:28, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote: > You can’t migrate and upgrade at the same time perhaps but you could do > one and then the other so as to end up on new version. I’m guessing it’s > an error in the yaml file or a port not open. Is there any good reason for > a production cluster to still be on version 2.1x? > I'm not trying to migrate AND upgrade at the same time. However, the apt repo shows only 2.1.20 as the available version. This is the output from the new node in AWS ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra cassandra: Installed: 2.1.20 Candidate: 2.1.20 Version table: *** 2.1.20 500 500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 100 /var/lib/dpkg/status Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node into GCE nodes. As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall for the public IP of the AWS instance. I mentioned earlier - there are some differences in the column types - for example, date (>= 2.2) vs. timestamp (2.1.x) The application has not been updated yet. Hence sticking to 2.1.x for now. And, so far, 2.1.x has been serving the purpose. Kunal > > Kenneth Brotman > > > > *From:* Durity, Sean R [mailto:sean_r_dur...@homedepot.com] > *Sent:* Monday, March 12, 2018 11:36 AM > *To:* user@cassandra.apache.org > *Subject:* RE: [EXTERNAL] RE: Adding new DC? > > > > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. 
> > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com > <kgangakhed...@gmail.com>] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. > I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? > > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk. > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 2:32 PM > *To:* user@cassandra.apache.org > *Subject:* Adding new DC? > > > > Hi all, > > > > We currently have a cluster in GCE for one of the customers. > > They want it to be migrated to AWS. 
> > > > I have setup one node in AWS to join into the cluster by following: > > https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html > > > > Will add more nodes once the first one joins successfully. > > > > The node in AWS has an elastic IP - which is white-listed for ports > 7000-7001, 7199, 9042 in GCE firewall. > > > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. > > > > When I start cassandra service on the AWS instance, I see the version > handshake msgs in the logs trying to connect to the public IPs o
Re: [EXTERNAL] RE: Adding new DC?
On 13 March 2018 at 00:06, Durity, Sean R <sean_r_dur...@homedepot.com> wrote: > You cannot migrate and upgrade at the same time across major versions. > Streaming is (usually) not compatible between versions. > I'm not trying to upgrade as of now - first priority is the migration. We can look at version upgrade later on. > > > As to the migration question, I would expect that you may need to put the > external-facing ip addresses in several places in the cassandra.yaml file. > And, yes, it would require a restart. Why is a non-restart more desirable? > Most Cassandra changes require a restart, but you can do a rolling restart > and not impact your application. This is fairly normal admin work and > can/should be automated. > I just tried setting the broadcast_address in one of the instances in GCE to its public IP and restarted the service. However, it now shows all other nodes (in GCE) as DN in nodetool status output and the other nodes also report this node as DN with its internal/private IP address. I also tried setting the broadcast_rpc_address to the internal/private IP address - still the same. > > > How large is the cluster to migrate (# of nodes and size of data). The > preferred method might depend on how much data needs to move. Is any > application outage acceptable? > No. of nodes: 5 RF: 3 Data size (as reported by the load factor in nodetool status output): ~30GB per node Thanks, Kunal > > > Sean Durity > > lord of the (C*) rings (Staff Systems Engineer – Cassandra) > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 10:20 PM > *To:* user@cassandra.apache.org > *Subject:* [EXTERNAL] RE: Adding new DC? > > > > Hi Kenneth, > > > > Replies inline below. > > > > On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> > wrote: > > Hi Kunal, > > > > That version of Cassandra is too far before me so I’ll let others answer. 
> I was wonder why you wouldn’t want to end up on 3.0x if you’re going > through all the trouble of migrating anyway? > > > > > > Application side constraints - some data types are different between 2.1.x > and 3.x (for example, date vs. timestamp). > > > > Besides, this is production setup - so, cannot take risk. > > Are both data centers in the same region on AWS? Can you provide yaml > file for us to see? > > > > > > No, they are in different regions - GCE setup is in us-east while AWS > setup is in Asia-south (Mumbai) > > > > Thanks, > > Kunal > > Kenneth Brotman > > > > *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] > *Sent:* Sunday, March 11, 2018 2:32 PM > *To:* user@cassandra.apache.org > *Subject:* Adding new DC? > > > > Hi all, > > > > We currently have a cluster in GCE for one of the customers. > > They want it to be migrated to AWS. > > > > I have setup one node in AWS to join into the cluster by following: > > https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html > > > > Will add more nodes once the first one joins successfully. > > > > The node in AWS has an elastic IP - which is white-listed for ports > 7000-7001, 7199, 9042 in GCE firewall. > > > > The snitch is set to GossipingPropertyFileSnitch. The GCE setup has > dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2.
> > > > When I start cassandra service on the AWS instance, I see the version > handshake msgs in the logs trying to connect to the public IPs of the GCE > nodes: > > OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx > > However, nodetool status output on both sides don't show the other side at > all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS > setup doesn't show old DC (dc=DC1). > > > > In cassandra.yaml file, I'm only using listen_interface and rpc_interface > settings - no explicit IP addresses used - so, ends up using the internal > private IP ranges. > > > > Do I need to explicitly add the broadcast_address? for both side? > > Would that require restarting of cassandra service on GCE side? Or is it > possible to change that setting on-the-fly without a restart? > > > > I would prefer a non-restart option. > > > > PS: The ca
RE: Adding new DC?
Hi Kenneth, Replies inline below. On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote: Hi Kunal, That version of Cassandra is too far before me so I’ll let others answer. I was wondering why you wouldn’t want to end up on 3.0x if you’re going through all the trouble of migrating anyway? Application side constraints - some data types are different between 2.1.x and 3.x (for example, date vs. timestamp). Besides, this is a production setup - so, cannot take risk. Are both data centers in the same region on AWS? Can you provide the yaml file for us to see? No, they are in different regions - the GCE setup is in us-east while the AWS setup is in Asia-south (Mumbai). Thanks, Kunal Kenneth Brotman *From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] *Sent:* Sunday, March 11, 2018 2:32 PM *To:* user@cassandra.apache.org *Subject:* Adding new DC? Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. I have set up one node in AWS to join the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in the GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. When I start the cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides doesn't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show the old DC (dc=DC1).
In the cassandra.yaml file, I'm only using the listen_interface and rpc_interface settings - no explicit IP addresses used - so, it ends up using the internal private IP ranges. Do I need to explicitly add the broadcast_address on both sides? Would that require restarting the cassandra service on the GCE side? Or is it possible to change that setting on-the-fly without a restart? I would prefer a non-restart option. PS: The cassandra version running in GCE is 2.1.18 while the new node setup in AWS is running 2.1.20 - just in case that's relevant. Thanks, Kunal
Adding new DC?
Hi all, We currently have a cluster in GCE for one of the customers. They want it to be migrated to AWS. I have set up one node in AWS to join the cluster by following: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html Will add more nodes once the first one joins successfully. The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 7199, 9042 in the GCE firewall. The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, rack=RAC1 while on AWS, I changed the DC to dc=DC2. When I start the cassandra service on the AWS instance, I see the version handshake msgs in the logs trying to connect to the public IPs of the GCE nodes: OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx However, nodetool status output on both sides doesn't show the other side at all. That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup doesn't show the old DC (dc=DC1). In the cassandra.yaml file, I'm only using the listen_interface and rpc_interface settings - no explicit IP addresses used - so, it ends up using the internal private IP ranges. Do I need to explicitly add the broadcast_address on both sides? Would that require restarting the cassandra service on the GCE side? Or is it possible to change that setting on-the-fly without a restart? I would prefer a non-restart option. PS: The cassandra version running in GCE is 2.1.18 while the new node setup in AWS is running 2.1.20 - just in case that's relevant. Thanks, Kunal
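A minimal sketch of the broadcast_address setup asked about above, assuming a hybrid topology where nodes in the two clouds must reach each other over public IPs. All addresses below are placeholders, not values from this thread:

```yaml
# cassandra.yaml sketch for one AWS node (hypothetical values; adjust per node)
listen_address: 10.0.43.213        # private IP, used on the local network
broadcast_address: 54.x.x.x        # public/elastic IP advertised to the other DC
rpc_address: 10.0.43.213
broadcast_rpc_address: 54.x.x.x
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # public IP of at least one node in the existing GCE DC
          - seeds: "35.x.x.x"
```

With GossipingPropertyFileSnitch, setting prefer_local=true in cassandra-rackdc.properties should let nodes in the same DC keep talking over their private IPs while cross-DC traffic uses the broadcast addresses; note that every node (GCE side included) needs a broadcast_address set, and each needs a restart for gossip to start advertising the public IP.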
Re: system.size_estimates - safe to remove sstables?
Finally, got a chance to work on it over the weekend. It worked as advertised. :) Thanks a lot, Chris. Kunal On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote: > Thanks a lot, Chris. > > Will try it today/tomorrow and update here. > > Thanks, > Kunal > > On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote: > >> While its off you can delete the files in the directory yeah >> >> Chris >> >> >> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> >> wrote: >> >> Hi Chris, >> >> I checked for snapshots and backups - none found. >> Also, we're not using opscenter, hadoop or spark or any such tool. >> >> So, do you think we can just remove the cf and restart the service? >> >> Thanks, >> Kunal >> >> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote: >> >>> Any chance space used by snapshots? What files exist there that are >>> taking up space? >>> >>> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar < >>> kgangakhed...@gmail.com> wrote: >>> > >>> > Hi all, >>> > >>> > I have a 2-node cluster running cassandra 2.1.18. >>> > One of the nodes has run out of disk space and died - almost all of it >>> shows up as occupied by size_estimates CF. >>> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du >>> -sh' output. >>> > >>> > This is while the other node is chugging along - shows only 25MiB >>> consumed by size_estimates (du -sh output). >>> > >>> > Any idea why this descripancy? >>> > Is it safe to remove the size_estimates sstables from the affected >>> node and restart the service? >>> > >>> > Thanks, >>> > Kunal >>> >>> >>> - >>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org >>> For additional commands, e-mail: user-h...@cassandra.apache.org >>> >>> >> >> >
Re: system.size_estimates - safe to remove sstables?
Thanks a lot, Chris. Will try it today/tomorrow and update here. Thanks, Kunal On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote: > While its off you can delete the files in the directory yeah > > Chris > > > On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> > wrote: > > Hi Chris, > > I checked for snapshots and backups - none found. > Also, we're not using opscenter, hadoop or spark or any such tool. > > So, do you think we can just remove the cf and restart the service? > > Thanks, > Kunal > > On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote: > >> Any chance space used by snapshots? What files exist there that are >> taking up space? >> >> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> >> wrote: >> > >> > Hi all, >> > >> > I have a 2-node cluster running cassandra 2.1.18. >> > One of the nodes has run out of disk space and died - almost all of it >> shows up as occupied by size_estimates CF. >> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du >> -sh' output. >> > >> > This is while the other node is chugging along - shows only 25MiB >> consumed by size_estimates (du -sh output). >> > >> > Any idea why this descripancy? >> > Is it safe to remove the size_estimates sstables from the affected node >> and restart the service? >> > >> > Thanks, >> > Kunal >> >> >> - >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org >> For additional commands, e-mail: user-h...@cassandra.apache.org >> >> > >
Re: system.size_estimates - safe to remove sstables?
Hi Chris, I checked for snapshots and backups - none found. Also, we're not using opscenter, hadoop or spark or any such tool. So, do you think we can just remove the cf and restart the service? Thanks, Kunal On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote: > Any chance space used by snapshots? What files exist there that are taking > up space? > > > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> > wrote: > > > > Hi all, > > > > I have a 2-node cluster running cassandra 2.1.18. > > One of the nodes has run out of disk space and died - almost all of it > shows up as occupied by size_estimates CF. > > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' > output. > > > > This is while the other node is chugging along - shows only 25MiB > consumed by size_estimates (du -sh output). > > > > Any idea why this discrepancy exists? > > Is it safe to remove the size_estimates sstables from the affected node > and restart the service? > > > > Thanks, > > Kunal > > > - > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org > For additional commands, e-mail: user-h...@cassandra.apache.org > >
system.size_estimates - safe to remove sstables?
Hi all, I have a 2-node cluster running cassandra 2.1.18. One of the nodes has run out of disk space and died - almost all of it shows up as occupied by size_estimates CF. Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' output. This is while the other node is chugging along - shows only 25MiB consumed by size_estimates (du -sh output). Any idea why this discrepancy exists? Is it safe to remove the size_estimates sstables from the affected node and restart the service? Thanks, Kunal
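As confirmed later in this thread (the files can be deleted while the node is down), the procedure can be sketched as follows. Paths are assumptions based on a default package install; verify data_file_directories in cassandra.yaml first:

```shell
# Sketch only -- check the actual data directory before running anything.

# 1. Stop the affected node. size_estimates is a local cache and is
#    repopulated by a periodic task after the node is back up.
sudo service cassandra stop

# 2. Remove only the size_estimates sstables. The glob covers layouts
#    with and without a table-ID suffix on the directory name.
rm -rf /var/lib/cassandra/data/system/size_estimates*/*

# 3. Bring the node back and watch the logs.
sudo service cassandra start
```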
Re: TRUNCATE on a disk almost full - possible?
Great, thanks a lot for the help, guys. I just did the truncation + clearsnapshot just now - worked smoothly.. :) Freed up 400GB, yay \o/ Really appreciate your help. Thanks once again. Kunal On 21 April 2017 at 15:04, Nicolas Guyomar <nicolas.guyo...@gmail.com> wrote: > Hi Kunal, > > Timeout usually occured in the client (eg cqlsh), it does not mean that > the truncate operation is interrupted. > > Have you checked that you have no old snapshot (automatic snaphost for > instance) that you could get rid off to get some space back ? > > On 21 April 2017 at 11:27, benjamin roth <brs...@gmail.com> wrote: > >> Truncate needs no space. It just creates a hard link of all affected >> SSTables under the corresponding -SNAPSHOT dir (at least with default >> settings) and then removes the SSTables. >> Also this operation should be rather fast as it is mostly a file-deletion >> process with some metadata updates. >> >> 2017-04-21 11:21 GMT+02:00 Kunal Gangakhedkar <kgangakhed...@gmail.com>: >> >>> Hi all, >>> >>> We have a CF that's grown too large - it's not getting actively used in >>> the app right now. >>> The on-disk size of the . directory is ~407GB and I have only >>> ~40GB free left on the disk. >>> >>> I understand that if I trigger a TRUNCATE on this CF, cassandra will try >>> to take snapshot. >>> My question: >>> Is the ~40GB enough to safely truncate this table? >>> >>> I will manually remove the . directory once the truncate is >>> completed. >>> >>> Also, while browsing through earlier msgs regarding truncate, I noticed >>> that it's possible to get OperationTimedOut >>> <http://www.mail-archive.com/user@cassandra.apache.org/msg48958.html> >>> exception. Does that stop the truncate operation? >>> >>> Is there any other safe way to clean up the CF? >>> >>> Thanks, >>> Kunal >>> >> >> >
TRUNCATE on a disk almost full - possible?
Hi all, We have a CF that's grown too large - it's not getting actively used in the app right now. The on-disk size of the . directory is ~407GB and I have only ~40GB free left on the disk. I understand that if I trigger a TRUNCATE on this CF, cassandra will try to take snapshot. My question: Is the ~40GB enough to safely truncate this table? I will manually remove the . directory once the truncate is completed. Also, while browsing through earlier msgs regarding truncate, I noticed that it's possible to get OperationTimedOut <http://www.mail-archive.com/user@cassandra.apache.org/msg48958.html> exception. Does that stop the truncate operation? Is there any other safe way to clean up the CF? Thanks, Kunal
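Pulling together the answers in this thread (the auto-snapshot taken on truncate is made of hard links, so it needs almost no extra space, and a client-side timeout does not abort the server-side truncation), the sequence can be sketched as follows. The keyspace and table names are placeholders:

```shell
# Sketch; "ks" and "big_cf" are hypothetical names.

# Truncate via CQL. The automatic snapshot consists of hard links to the
# existing sstables, so ~40GB free is plenty even for a ~407GB table.
cqlsh -e "TRUNCATE ks.big_cf;"

# If cqlsh reports OperationTimedOut, the truncate usually still
# completes server-side; re-check the table before retrying.

# Reclaim the space held by the snapshot hard links:
nodetool clearsnapshot
```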
Re: Backups eating up disk space
Hi all, Is it safe to delete the backup folders from various CFs from 'system' keyspace too? I seem to have missed them in the last cleanup - and now, the size_estimates and compactions_in_progress seem to have grown large ( >200G and ~6G respectively). Can I remove them too? Thanks, Kunal On 13 January 2017 at 18:30, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote: > Great, thanks a lot to all for the help :) > > I finally took the dive and went with Razi's suggestions. > In summary, this is what I did: > >- turn off incremental backups on each of the nodes in rolling fashion >- remove the 'backups' directory from each keyspace on each node. > > This ended up freeing up almost 350GB on each node - yay :) > > Again, thanks a lot for the help, guys. > > Kunal > > On 12 January 2017 at 21:15, Khaja, Raziuddin (NIH/NLM/NCBI) [C] < > raziuddin.kh...@nih.gov> wrote: > >> snapshots are slightly different than backups. >> >> >> >> In my explanation of the hardlinks created in the backups folder, notice >> that compacted sstables, never end up in the backups folder. >> >> >> >> On the other hand, a snapshot is meant to represent the data at a >> particular moment in time. Thus, the snapshots directory contains hardlinks >> to all active sstables at the time the snapshot was taken, which would >> include: compacted sstables; and any sstables from memtable flush or >> streamed from other nodes that both exist in the table directory and the >> backups directory. >> >> >> >> So, that would be the difference between snapshots and backups. 
>> >> >> >> Best regards, >> >> -Razi >> >> >> >> >> >> *From: *Alain RODRIGUEZ <arodr...@gmail.com> >> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org> >> *Date: *Thursday, January 12, 2017 at 9:16 AM >> >> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org> >> *Subject: *Re: Backups eating up disk space >> >> >> >> My 2 cents, >> >> >> >> As I mentioned earlier, we're not currently using snapshots - it's only >> the backups that are bothering me right now. >> >> >> >> I believe backups folder is just the new name for the previously called >> snapshots folder. But I can be completely wrong, I haven't played that much >> with snapshots in new versions yet. >> >> >> >> Anyway, some operations in Apache Cassandra can trigger a snapshot: >> >> >> >> - Repair (when not using parallel option but sequential repairs instead) >> >> - Truncating a table (by default) >> >> - Dropping a table (by default) >> >> - Maybe other I can't think of... ? >> >> >> >> If you want to clean space but still keep a backup you can run: >> >> >> >> "nodetool clearsnapshots" >> >> "nodetool snapshot " >> >> >> >> This way and for a while, data won't be taking space as old files will be >> cleaned and new files will be only hardlinks as detailed above. Then you >> might want to work at a proper backup policy, probably implying getting >> data out of production server (a lot of people uses S3 or similar >> services). Or just do that from time to time, meaning you only keep a >> backup and disk space behaviour will be hard to predict. >> >> >> >> C*heers, >> >> --- >> >> Alain Rodriguez - @arodream - al...@thelastpickle.com >> >> France >> >> >> >> The Last Pickle - Apache Cassandra Consulting >> >> http://www.thelastpickle.com >> >> >> >> 2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar <prasenjit.sar...@datos.io>: >> >> Hi Kunal, >> >> >> >> Razi's post does give a very lucid description of how cassandra manages >> the hard links inside the backup directory. 
>> >> >> >> Where it needs clarification is the following: >> >> --> incremental backups is a system wide setting and so its an all or >> nothing approach >> >> >> >> --> as multiple people have stated, incremental backups do not create >> hard links to compacted sstables. however, this can bloat the size of your >> backups >> >> >> >> --> again as stated, it is a general industry practice to place backups >> in a different secondary storage location than the main production site. So >> b
Re: Backups eating up disk space
Great, thanks a lot to all for the help :) I finally took the dive and went with Razi's suggestions. In summary, this is what I did: - turn off incremental backups on each of the nodes in rolling fashion - remove the 'backups' directory from each keyspace on each node. This ended up freeing up almost 350GB on each node - yay :) Again, thanks a lot for the help, guys. Kunal On 12 January 2017 at 21:15, Khaja, Raziuddin (NIH/NLM/NCBI) [C] < raziuddin.kh...@nih.gov> wrote: > snapshots are slightly different than backups. > > > > In my explanation of the hardlinks created in the backups folder, notice > that compacted sstables, never end up in the backups folder. > > > > On the other hand, a snapshot is meant to represent the data at a > particular moment in time. Thus, the snapshots directory contains hardlinks > to all active sstables at the time the snapshot was taken, which would > include: compacted sstables; and any sstables from memtable flush or > streamed from other nodes that both exist in the table directory and the > backups directory. > > > > So, that would be the difference between snapshots and backups. > > > > Best regards, > > -Razi > > > > > > *From: *Alain RODRIGUEZ <arodr...@gmail.com> > *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org> > *Date: *Thursday, January 12, 2017 at 9:16 AM > > *To: *"user@cassandra.apache.org" <user@cassandra.apache.org> > *Subject: *Re: Backups eating up disk space > > > > My 2 cents, > > > > As I mentioned earlier, we're not currently using snapshots - it's only > the backups that are bothering me right now. > > > > I believe backups folder is just the new name for the previously called > snapshots folder. But I can be completely wrong, I haven't played that much > with snapshots in new versions yet. 
> > > > Anyway, some operations in Apache Cassandra can trigger a snapshot: > > > > - Repair (when not using parallel option but sequential repairs instead) > > - Truncating a table (by default) > > - Dropping a table (by default) > > - Maybe other I can't think of... ? > > > > If you want to clean space but still keep a backup you can run: > > > > "nodetool clearsnapshots" > > "nodetool snapshot " > > > > This way and for a while, data won't be taking space as old files will be > cleaned and new files will be only hardlinks as detailed above. Then you > might want to work at a proper backup policy, probably implying getting > data out of production server (a lot of people uses S3 or similar > services). Or just do that from time to time, meaning you only keep a > backup and disk space behaviour will be hard to predict. > > > > C*heers, > > --- > > Alain Rodriguez - @arodream - al...@thelastpickle.com > > France > > > > The Last Pickle - Apache Cassandra Consulting > > http://www.thelastpickle.com > > > > 2017-01-12 6:42 GMT+01:00 Prasenjit Sarkar <prasenjit.sar...@datos.io>: > > Hi Kunal, > > > > Razi's post does give a very lucid description of how cassandra manages > the hard links inside the backup directory. > > > > Where it needs clarification is the following: > > --> incremental backups is a system wide setting and so its an all or > nothing approach > > > > --> as multiple people have stated, incremental backups do not create hard > links to compacted sstables. however, this can bloat the size of your > backups > > > > --> again as stated, it is a general industry practice to place backups in > a different secondary storage location than the main production site. So > best to move it to the secondary storage before applying rm on the backups > folder > > > > In my experience with production clusters, managing the backups folder > across multiple nodes can be painful if the objective is to ever recover > data. 
With the usual disclaimers, better to rely on third party vendors to > accomplish the needful rather than scripts/tablesnap. > > > > Regards > > Prasenjit > > > > > > On Wed, Jan 11, 2017 at 7:49 AM, Khaja, Raziuddin (NIH/NLM/NCBI) [C] < > raziuddin.kh...@nih.gov> wrote: > > Hello Kunal, > > > > Caveat: I am not a super-expert on Cassandra, but it helps to explain to > others, in order to eventually become an expert, so if my explanation is > wrong, I would hope others would correct me. J > > > > The active sstables/data files are are all the files located in the > directory for the table. > > You can safely remove all fil
Re: Backups eating up disk space
Thanks for the reply, Razi. As I mentioned earlier, we're not currently using snapshots - it's only the backups that are bothering me right now. So my next question pertains to this statement of yours: As far as I am aware, using *rm* is perfectly safe to delete the > directories for snapshots/backups as long as you are careful not to delete > your actively used sstable files and directories. How do I find out which are the actively used sstables? If by that you mean the main data files, does that mean I can safely remove all files ONLY under the "backups/" directory? Or can removing files that are currently hard-linked inside backups potentially cause issues? Thanks, Kunal On 11 January 2017 at 01:06, Khaja, Raziuddin (NIH/NLM/NCBI) [C] < raziuddin.kh...@nih.gov> wrote: > Hello Kunal, > > > > I would take a look at the following configuration options in the > cassandra.yaml > > > > *Common automatic backup settings* > > *incremental_backups:* > > http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__incremental_backups > > > > (Default: false) Backs up data updated since the last snapshot was taken. > When enabled, Cassandra creates a hard link to each SSTable flushed or > streamed locally in a backups subdirectory of the keyspace data. Removing > these links is the operator's responsibility. > > > > *snapshot_before_compaction*: > > http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__snapshot_before_compaction > > > > (Default: false) Enables or disables taking a snapshot before each > compaction. A snapshot is useful to back up data when there is a data > format change. Be careful using this option: Cassandra does not clean up > older snapshots automatically.
> > > > > *Advanced automatic backup setting* > > *auto_snapshot*: > > http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__auto_snapshot > > > > (Default: true) Enables or disables whether Cassandra takes a snapshot of > the data before truncating a keyspace or dropping a table. To prevent data > loss, Datastax strongly advises using the default setting. If you > set auto_snapshot to false, you lose data on truncation or drop. > > > > > > *nodetool* also provides methods to manage snapshots. > http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsNodetool.html > > See the specific commands: > >- nodetool clearsnapshot > > <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsClearSnapShot.html> >Removes one or more snapshots. >- nodetool listsnapshots > > <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsListSnapShots.html> >Lists snapshot names, size on disk, and true size. >- nodetool snapshot > > <http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsSnapShot.html> >Take a snapshot of one or more keyspaces, or of a table, to backup >data. > > > > As far as I am aware, using *rm* is perfectly safe to delete the > directories for snapshots/backups as long as you are careful not to delete > your actively used sstable files and directories. I think the *nodetool > clearsnapshot* command is provided so that you don’t accidentally delete > actively used files. Last I used *clearsnapshot*, (a very long time > ago), I thought it left behind the directory, but this could have been > fixed in newer versions (so you might want to check that).
> > > > HTH > > -Razi > > > > > > *From: *Jonathan Haddad <j...@jonhaddad.com> > *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org> > *Date: *Tuesday, January 10, 2017 at 12:26 PM > *To: *"user@cassandra.apache.org" <user@cassandra.apache.org> > *Subject: *Re: Backups eating up disk space > > > > If you remove the files from the backup directory, you would not have data > loss in the case of a node going down. They're hard links to the same > files that are in your data directory, and are created when an sstable is > written to disk. At the time, they take up (almost) no space, so they > aren't a big deal, but when the sstable gets compacted, they stick around, > so they end up not freeing space up. > > > > Usually you use incremental backups as a means of moving the sstables off > the node to a backup location. If you're not doing anything with them, > they're just wasting space and you should d
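The hard-link behavior described above can be demonstrated safely in a temporary directory: a hard link is just a second directory entry for the same inode, so removing the backup link neither deletes the data nor touches the active file. File names below are stand-ins, not real sstable paths:

```shell
# Safe demonstration in a temp dir (GNU coreutils `stat -c` assumed).
d=$(mktemp -d)
echo "sstable contents" > "$d/data.db"      # stands in for an active sstable
ln "$d/data.db" "$d/backup-data.db"         # what incremental backup does

# Both names point at the same inode, so the link takes ~no extra space:
stat -c '%i' "$d/data.db" "$d/backup-data.db"

rm "$d/backup-data.db"                      # "rm" on the backups copy...
cat "$d/data.db"                            # ...leaves the original intact
rm -rf "$d"
```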
Re: Backups eating up disk space
Thanks for quick reply, Jon. But, what about in case of node/cluster going down? Would there be data loss if I remove these files manually? How is it typically managed in production setups? What are the best-practices for the same? Do people take snapshots on each node before removing the backups? This is my first production deployment - so, still trying to learn. Thanks, Kunal On 10 January 2017 at 21:36, Jonathan Haddad <j...@jonhaddad.com> wrote: > You can just delete them off the filesystem (rm) > > On Tue, Jan 10, 2017 at 8:02 AM Kunal Gangakhedkar < > kgangakhed...@gmail.com> wrote: > >> Hi all, >> >> We have a 3-node cassandra cluster with incremental backup set to true. >> Each node has 1TB data volume that stores cassandra data. >> >> The load in the output of 'nodetool status' comes up at around 260GB each >> node. >> All our keyspaces use replication factor = 3. >> >> However, the df output shows the data volumes consuming around 850GB of >> space. >> I checked the keyspace directory structures - most of the space goes in >> /data///backups. >> >> We have never manually run snapshots. >> >> What is the typical procedure to clear the backups? >> Can it be done without taking the node offline? >> >> Thanks, >> Kunal >> >
Backups eating up disk space
Hi all, We have a 3-node cassandra cluster with incremental backup set to true. Each node has 1TB data volume that stores cassandra data. The load in the output of 'nodetool status' comes up at around 260GB each node. All our keyspaces use replication factor = 3. However, the df output shows the data volumes consuming around 850GB of space. I checked the keyspace directory structures - most of the space goes in /data///backups. We have never manually run snapshots. What is the typical procedure to clear the backups? Can it be done without taking the node offline? Thanks, Kunal
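Jon's hard-link explanation (backup files cost nothing until the live sstable they link to is compacted away) can be checked directly from the filesystem. A minimal Python sketch, not an official tool - it flags backup files whose link count has dropped to 1, i.e. the ones that are now the sole owner of their data and are actually consuming disk:

```python
import os

def stale_backup_bytes(backups_dir):
    """Map each backup file that no longer shares its inode with a live
    sstable (st_nlink == 1) to its size in bytes.

    Freshly created backups are hard links to live sstables, so their link
    count is >= 2 and they take (almost) no extra space; once compaction
    unlinks the original, the backup copy is what holds the space.
    """
    stale = {}
    for root, _dirs, files in os.walk(backups_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            if st.st_nlink == 1:  # no live sstable shares this inode any more
                stale[path] = st.st_size
    return stale
```

Summing the returned values gives a rough idea of how much of the 850GB-vs-260GB gap is held by orphaned backup files.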
Cassandra cluster hardware configuration
Hi, I want to set up a Cassandra cluster of about 3-5 nodes. Can anyone suggest what hardware configuration I should consider, assuming a replication factor of 3? The data size should be around 100 GB on the DT environment. Regards, Kunal Gaikwad
Re: Cassandra OOM on joining existing ring
Hi, Looks like that is my primary problem - the sstable count for the daily_challenges column family is 5k. Azure had a scheduled maintenance window on Sat. All the VMs got rebooted one by one - including the current cassandra one - and it's taking forever to bring cassandra back online. Is there any way I can re-organize my existing data so that I can bring down that count? I don't want to lose that data. If possible, can I do that while cassandra is down? As I mentioned, it's taking forever to get the service up - it's stuck reading those 5k sstable (+ another 5k corresponding secondary index) files. :( Oh, did I mention I'm new to cassandra? Thanks, Kunal On 11 July 2015 at 03:29, Sebastian Estevez sebastian.este...@datastax.com wrote: #1 There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. 460 is high; I like to keep my partitions under 100MB when possible. I've seen worse, though. The fix is to add something else (maybe month or week or something) into your partition key: PRIMARY KEY ((segment_type, something_else), date, user_id, sess_id) #2 Looks like your jamm version is 3 per your env.sh, so you're probably okay to copy the env.sh over from the C* 3.0 link I shared once you uncomment and tweak MAX_HEAP. If there's something wrong your node won't come up. Tail your logs.
All the best, [image: datastax_logo.png] http://www.datastax.com/ Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com [image: linkedin.png] https://www.linkedin.com/company/datastax [image: facebook.png] https://www.facebook.com/datastax [image: twitter.png] https://twitter.com/datastax [image: g+.png] https://plus.google.com/+Datastax/about http://feeds.feedburner.com/datastax http://cassandrasummit-datastax.com/ DataStax is the fastest, most scalable distributed database technology, delivering Apache Cassandra to the world’s most innovative enterprises. Datastax is built to be agile, always-on, and predictably scalable to any size. With more than 500 customers in 45 countries, DataStax is the database technology and transactional backbone of choice for the worlds most innovative companies such as Netflix, Adobe, Intuit, and eBay. On Fri, Jul 10, 2015 at 2:44 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: And here is my cassandra-env.sh https://gist.github.com/kunalg/2c092cb2450c62be9a20 Kunal On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class org.apache.cassandra.db.BufferCell 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName 1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName 63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo 503687 instances of class org.apache.cassandra.db.BufferDeletedCell 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier 101800 instances of class org.apache.cassandra.utils.concurrent.Ref 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State 90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey At the bottom of the page, it shows: Total 
of 8739510 instances occupying 193607512 bytes. JFYI. Kunal On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks for quick reply. 1. I don't know what are the thresholds that I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem? Here is the CQL schema for the daily_challenges column family: CREATE TABLE app_10001.daily_challenges ( segment_type text, date timestamp, user_id int, sess_id text, data text, deleted boolean, PRIMARY KEY (segment_type, date, user_id, sess_id) ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128
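The fix suggested in this thread - folding a time bucket such as month into the partition key, i.e. PRIMARY KEY ((segment_type, month), date, user_id, sess_id) - can be illustrated with a small sketch. The function name and month granularity are illustrative choices, not from the thread:

```python
from datetime import datetime

def partition_key(segment_type, when):
    """Bucketed partition key: (segment_type, 'YYYY-MM').

    Instead of one ever-growing partition per segment_type, each
    segment_type now spawns one partition per month, which caps the
    compacted partition size (the ~460MB outlier above).
    """
    return (segment_type, when.strftime("%Y-%m"))
```

Reads for a date range then fan out over the handful of monthly partitions involved, which is the usual trade-off of this pattern.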
Re: Cassandra OOM on joining existing ring
Attaching the stack dump captured from the last OOM. Kunal On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes (seed node) died with OOM - so, I am trying to add a replacement node with same configuration. The problem is this new node also keeps dying with OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along. All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21 version 2.1.7. I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB) But, that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing. Any clue as to why it's happening? Thanks, Kunal ERROR [SharedPool-Worker-6] 2015-07-10 05:12:16,862 JVMStabilityInspector.java:94 - JVM state determined to be unstable. 
Exiting forcefully due to: java.lang.OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.init(HeapByteBuffer.java:57) ~[na:1.8.0_45] at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_45] at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:137) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:97) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:61) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.Memtable.put(Memtable.java:192) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.index.AbstractSimplePerColumnSecondaryIndex.insert(AbstractSimplePerColumnSecondaryIndex.java:131) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.index.SecondaryIndexManager$StandardUpdater.insert(SecondaryIndexManager.java:791) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:444) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.AtomicBTreeColumns$ColumnUpdater.apply(AtomicBTreeColumns.java:418) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.btree.BTree.build(BTree.java:116) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.utils.btree.BTree.update(BTree.java:177) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.AtomicBTreeColumns.addAllWithSizeDelta(AtomicBTreeColumns.java:225) ~[apache-cassandra-2.1.7.jar:2.1.7] at 
org.apache.cassandra.db.Memtable.put(Memtable.java:210) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1212) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:389) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:352) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:54) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[apache-cassandra-2.1.7.jar:2.1.7] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.7.jar:2.1.7] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.7.jar:2.1.7] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] ERROR [CompactionExecutor:3] 2015-07-10 05:12:16,862 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:3,1,main] java.lang.OutOfMemoryError: Java heap space at java.util.ArrayDeque.doubleCapacity(ArrayDeque.java:157) ~[na:1.8.0_45
Re: Cassandra OOM on joining existing ring
Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes (seed node) died with OOM - so, I am trying to add a replacement node with same configuration. The problem is this new node also keeps dying with OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along. All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21 version 2.1.7. I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB) But, that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing. Any clue as to why it's happening? Thanks, Kunal
Cassandra OOM on joining existing ring
Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes (seed node) died with OOM - so, I am trying to add a replacement node with same configuration. The problem is this new node also keeps dying with OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along. All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21 version 2.1.7. I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB) But, that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing. Any clue as to why it's happening? Thanks, Kunal
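The automatic sizing mentioned above (1/4 * RAM = 2GB on an 8GB box) matches what the 2.1-era cassandra-env.sh computes. A Python paraphrase of that calculation - a sketch of the script's logic, which may differ slightly between versions:

```python
def default_max_heap_mb(system_memory_mb):
    """Approximate cassandra-env.sh auto heap sizing:
    max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)).

    So an 8GB box gets a 2GB heap, and the heap never exceeds 8GB
    no matter how much RAM the machine has.
    """
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)
```

This is why manually raising the heap (as tried later in the thread) is the only way past 2GB on these nodes.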
Re: Cassandra OOM on joining existing ring
I'm new to cassandra How do I find those out? - mainly, the partition params that you asked for. Others, I think I can figure out. We don't have any large objects/blobs in the column values - it's all textual, date-time, numeric and uuid data. We use cassandra to primarily store segmentation data - with segment type as partition key. That is again divided into two separate column families; but they have similar structure. Columns per row can be fairly large - each segment type as the row key and associated user ids and timestamp as column value. Thanks, Kunal On 10 July 2015 at 16:36, Jack Krupansky jack.krupan...@gmail.com wrote: What does your data and data model look like - partition size, rows per partition, number of columns per row, any large values/blobs in column values? You could run fine on an 8GB system, but only if your rows and partitions are reasonably small. Any large partitions could blow you away. -- Jack Krupansky On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Attaching the stack dump captured from the last OOM. Kunal On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Forgot to mention: the data size is not that big - it's barely 10GB in all. Kunal On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Hi, I have a 2 node setup on Azure (east us region) running Ubuntu server 14.04LTS. Both nodes have 8GB RAM. One of the nodes (seed node) died with OOM - so, I am trying to add a replacement node with same configuration. The problem is this new node also keeps dying with OOM - I've restarted the cassandra service like 8-10 times hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along. All nodes have similar configuration - with libjna installed. Cassandra is installed from datastax's debian repo - pkg: dsc21 version 2.1.7. I started off with the default configuration - i.e. 
the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB) But, that didn't help. So, I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing. Any clue as to why it's happening? Thanks, Kunal
Re: Cassandra OOM on joining existing ring
Thanks for the quick reply. 1. I don't know what thresholds I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem? Here is the CQL schema for the daily_challenges column family:

CREATE TABLE app_10001.daily_challenges (
    segment_type text,
    date timestamp,
    user_id int,
    sess_id text,
    data text,
    deleted boolean,
    PRIMARY KEY (segment_type, date, user_id, sess_id)
) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);

2. I don't know - how do I check? As I mentioned, I just installed the dsc21 package from datastax's debian repo (ver 2.1.7). Really appreciate your help. Thanks, Kunal On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com wrote: 1. You want to look at # of sstables in cfhistograms, or in cfstats look at: Compacted partition maximum bytes, Maximum live cells per slice. 2) No, here's the env.sh from 3.0 which should work with some tweaks: https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh You'll at least have to modify the jamm version to what's in yours.
I think it's 2.5 All the best, Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks, Sebastian. Couple of questions (I'm really new to cassandra): 1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful. 2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. So, please bear with me as I would need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server? Thanks, Kunal On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com wrote: #1 You need more information. a) Take a look at your .hprof file (memory heap from the OOM) with an introspection tool like jhat or visualvm or java flight recorder and see what is using up your RAM. b) How big are your large rows (use nodetool cfstats on each node). If your data model is bad, you are going to have to re-design it no matter what. #2 As a possible workaround try using the G1GC allocator with the settings from c* 3.0 instead of CMS.
I've seen lots of success with it lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not* set the newgen size; G1 sets it dynamically:

# min and max heap sizes should be set to the same value to avoid
# stop-the-world GC pauses during resize, and so that we can lock the
# heap in memory on startup to prevent any of it from being swapped out.
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
# Per-thread stack size.
JVM_OPTS="$JVM_OPTS -Xss256k"
# Use the Hotspot garbage-first collector.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
# Have the JVM do less remembered set work during STW, instead
# preferring concurrent GC. Reduces p99.9 latency.
JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
# The JVM maximum is 8 PGC threads and 1/4
Re: Cassandra OOM on joining existing ring
And here is my cassandra-env.sh https://gist.github.com/kunalg/2c092cb2450c62be9a20 Kunal On 11 July 2015 at 00:04, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class org.apache.cassandra.db.BufferCell 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName 1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName 63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo 503687 instances of class org.apache.cassandra.db.BufferDeletedCell 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier 101800 instances of class org.apache.cassandra.utils.concurrent.Ref 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State 90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey At the bottom of the page, it shows: Total of 8739510 instances occupying 193607512 bytes. JFYI. Kunal On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks for quick reply. 1. I don't know what are the thresholds that I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem? 
Here is the CQL schema for the daily_challenges column family: CREATE TABLE app_10001.daily_challenges ( segment_type text, date timestamp, user_id int, sess_id text, data text, deleted boolean, PRIMARY KEY (segment_type, date, user_id, sess_id) ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted); 2. I don't know - how do I check? As I mentioned, I just installed the dsc21 update from datastax's debian repo (ver 2.1.7). Really appreciate your help. Thanks, Kunal On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com wrote: 1. You want to look at # of sstables in cfhistograms or in cfstats look at: Compacted partition maximum bytes Maximum live cells per slice 2) No, here's the env.sh from 3.0 which should work with some tweaks: https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh You'll at least have to modify the jamm version to what's in yours. 
I think it's 2.5 All the best, Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks, Sebastian. Couple of questions (I'm really new to cassandra): 1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful. 2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. So, please bear with me as I would need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server? Thanks, Kunal On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com wrote: #1 You need more information. a) Take a look at your .hprof file (memory heap from the OOM
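The jhat "Instance Counts" output quoted in this thread can be summarized programmatically when the list is long. A small sketch, assuming the "N instances of class X" line format jhat prints:

```python
import re

def top_instance_counts(jhat_text, n=10):
    """Rank heap occupants from jhat's 'Instance Counts' page.

    Expects lines like:
      '2088223 instances of class org.apache.cassandra.db.BufferCell'
    and returns the n largest as (count, class_name) pairs.
    """
    pairs = re.findall(r"(\d+) instances of class (\S+)", jhat_text)
    ranked = sorted(((int(c), cls) for c, cls in pairs), reverse=True)
    return ranked[:n]
```

Here the top entries (BufferCell and the composite cell-name classes) point at memtable/cell overhead rather than any single huge object, which fits the wide-partition diagnosis.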
Re: Cassandra OOM on joining existing ring
Thanks, Sebastian. Couple of questions (I'm really new to cassandra): 1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful. 2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. So, please bear with me as I would need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server? Thanks, Kunal On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com wrote: #1 You need more information. a) Take a look at your .hprof file (memory heap from the OOM) with an introspection tool like jhat or visualvm or java flight recorder and see what is using up your RAM. b) How big are your large rows (use nodetool cfstats on each node). If your data model is bad, you are going to have to re-design it no matter what. #2 As a possible workaround try using the G1GC allocator with the settings from c* 3.0 instead of CMS. I've seen lots of success with it lately (tl;dr G1GC is much simpler than CMS and almost as good as a finely tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not* set the newgen size; G1 sets it dynamically:

# min and max heap sizes should be set to the same value to avoid
# stop-the-world GC pauses during resize, and so that we can lock the
# heap in memory on startup to prevent any of it from being swapped out.
JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
# Per-thread stack size.
JVM_OPTS="$JVM_OPTS -Xss256k"
# Use the Hotspot garbage-first collector.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
# Have the JVM do less remembered set work during STW, instead
# preferring concurrent GC. Reduces p99.9 latency.
JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
# The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
# Machines with > 10 cores may need additional threads.
# Increase to >= full cores (do not count HT cores).
#JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
#JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"
# Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
# 200ms is the JVM default and lowest viable setting
# 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
# Do reference processing in parallel GC.
JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
# This may help eliminate STW.
# The default in Hotspot 8u40 is 40%.
#JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"
# For workloads that do large allocations, increasing the region
# size may make things more efficient. Otherwise, let the JVM
# set this automatically.
#JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=32m"
# Make sure all memory is faulted and zeroed on startup.
# This helps prevent soft faults in containers and makes
# transparent hugepage allocation more effective.
JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"
# Biased locking does not benefit Cassandra.
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
# Enable thread-local allocation blocks and allow the JVM to automatically
# resize them at runtime.
JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
# http://www.evanjones.ca/jvm-mmap-pause.html
JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"

All the best, Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: I upgraded my instance from 8GB to a 14GB one. Allocated 8GB to the jvm heap in cassandra-env.sh. And now, it crashes even faster with an OOM. Earlier, with 4GB heap, I could go up to ~90% replication completion (as reported by nodetool netstats); now, with 8GB heap, I cannot even get there. I've already restarted the cassandra service 4 times with 8GB heap. No clue what's going on.. :( Kunal On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com wrote: You, and only you, are responsible for knowing your
Re: Cassandra OOM on joining existing ring
From jhat output, top 10 entries for Instance Count for All Classes (excluding platform) shows: 2088223 instances of class org.apache.cassandra.db.BufferCell 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName 1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName 63 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo 503687 instances of class org.apache.cassandra.db.BufferDeletedCell 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier 101800 instances of class org.apache.cassandra.utils.concurrent.Ref 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State 90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey At the bottom of the page, it shows: Total of 8739510 instances occupying 193607512 bytes. JFYI. Kunal On 10 July 2015 at 23:49, Kunal Gangakhedkar kgangakhed...@gmail.com wrote: Thanks for quick reply. 1. I don't know what are the thresholds that I should look for. So, to save this back-and-forth, I'm attaching the cfstats output for the keyspace. There is one table - daily_challenges - which shows compacted partition max bytes as ~460M and another one - daily_guest_logins - which shows compacted partition max bytes as ~36M. Can that be a problem? 
Here is the CQL schema for the daily_challenges column family: CREATE TABLE app_10001.daily_challenges ( segment_type text, date timestamp, user_id int, sess_id text, data text, deleted boolean, PRIMARY KEY (segment_type, date, user_id, sess_id) ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted); 2. I don't know - how do I check? As I mentioned, I just installed the dsc21 update from datastax's debian repo (ver 2.1.7). Really appreciate your help. Thanks, Kunal On 10 July 2015 at 23:33, Sebastian Estevez sebastian.este...@datastax.com wrote: 1. You want to look at # of sstables in cfhistograms or in cfstats look at: Compacted partition maximum bytes Maximum live cells per slice 2) No, here's the env.sh from 3.0 which should work with some tweaks: https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh You'll at least have to modify the jamm version to what's in yours. 
I think it's 2.5

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
http://www.datastax.com/

On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:

Thanks, Sebastian.

Couple of questions (I'm really new to cassandra):

1. How do I interpret the output of 'nodetool cfstats' to figure out the issues? Any documentation pointer on that would be helpful.
2. I'm primarily a python/c developer - so, totally clueless about the JVM environment. Please bear with me, as I will need a lot of hand-holding. Should I just copy+paste the settings you gave and try to restart the failing cassandra server?

Thanks,
Kunal

On 10 July 2015 at 22:35, Sebastian Estevez sebastian.este...@datastax.com wrote:

#1 You need more information.

a) Take a look at your .hprof file (the memory heap dump from the OOM) with an introspection tool like jhat, VisualVM, or Java Flight Recorder, and see what is using up your RAM.

b) How big are your large rows? (use nodetool cfstats on each node). If your data
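For point (b), the two cfstats metrics named earlier can be checked mechanically across nodes. A sketch of that idea, assuming the 2.1-era `nodetool cfstats` labels ("Table:" and "Compacted partition maximum bytes:"); the 100 MB alert threshold is an illustrative rule of thumb, not an official limit:

```python
# Scan `nodetool cfstats` output and flag tables whose largest compacted
# partition exceeds a threshold. Label strings match 2.1-era output;
# older releases printed "Column Family:" instead of "Table:".
THRESHOLD_BYTES = 100 * 1024 * 1024  # assumed 100 MB rule of thumb

def flag_large_partitions(cfstats_text, threshold=THRESHOLD_BYTES):
    flagged, table = [], None
    for line in cfstats_text.splitlines():
        line = line.strip()
        if line.startswith("Table:"):  # e.g. "Table: daily_challenges"
            table = line.split(":", 1)[1].strip()
        elif line.startswith("Compacted partition maximum bytes:"):
            size = int(line.split(":", 1)[1])
            if size > threshold:
                flagged.append((table, size))
    return flagged

sample = """\
Table: daily_challenges
Compacted partition maximum bytes: 460000000
Table: daily_guest_logins
Compacted partition maximum bytes: 36000000
"""
print(flag_large_partitions(sample))  # → [('daily_challenges', 460000000)]
```

Run against the attached cfstats, this flags daily_challenges (~460 MB max partition) while daily_guest_logins (~36 MB) passes.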
Re: Cassandra OOM on joining existing ring
I upgraded my instance from an 8GB one to a 14GB one and allocated 8GB to the JVM heap in cassandra-env.sh. And now, it crashes even faster with an OOM.

Earlier, with a 4GB heap, I could get up to ~90% replication completion (as reported by nodetool netstats); now, with an 8GB heap, I cannot even get there. I've already restarted the cassandra service 4 times with the 8GB heap. No clue what's going on.. :(

Kunal

On 10 July 2015 at 17:45, Jack Krupansky jack.krupan...@gmail.com wrote:

You, and only you, are responsible for knowing your data and data model. If columns per row or rows per partition can be large, then an 8GB system is probably too small. But the real issue is that you need to keep your partition size from getting too large. Generally, an 8GB system is okay, but only for reasonably-sized partitions, like under 10MB.

-- Jack Krupansky

On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:

I'm new to cassandra. How do I find those out? - mainly, the partition params that you asked for. The others, I think I can figure out.

We don't have any large objects/blobs in the column values - it's all textual, date-time, numeric and uuid data.

We use cassandra primarily to store segmentation data - with segment type as the partition key. That is again divided into two separate column families, but they have a similar structure. Columns per row can be fairly large - each segment type is the row key, with the associated user ids and timestamps as column values.

Thanks,
Kunal

On 10 July 2015 at 16:36, Jack Krupansky jack.krupan...@gmail.com wrote:

What does your data and data model look like - partition size, rows per partition, number of columns per row, any large values/blobs in column values? You could run fine on an 8GB system, but only if your rows and partitions are reasonably small. Any large partitions could blow you away.

-- Jack Krupansky

On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:

Attaching the stack dump captured from the last OOM.
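Jack's "keep partitions under ~10MB" guideline can be applied to the segment-type model with back-of-envelope arithmetic: with segment_type as the partition key, every user in a segment lands in the same partition. A rough sketch (the per-cell overhead and payload sizes below are assumed ballpark figures for illustration, not measured values):

```python
# Rough partition-size estimate for a wide-row model where one
# segment_type partition holds one cell per (user_id, timestamp).
CELL_OVERHEAD = 50   # assumed bytes of clustering/cell bookkeeping per cell
VALUE_SIZE = 16 + 8  # assumed uuid-sized user id + 8-byte timestamp

def partition_bytes(cells_per_partition):
    return cells_per_partition * (CELL_OVERHEAD + VALUE_SIZE)

for users in (100_000, 1_000_000, 10_000_000):
    mb = partition_bytes(users) / (1024 * 1024)
    print(f"{users:>10} users -> ~{mb:.0f} MB partition")
```

Under these assumptions, a segment of ~100K users stays near the 10MB guideline, but a single popular segment with a few million users blows well past it, which would match both the ~460MB max partition in cfstats and the millions of cell objects in the heap dump.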
Kunal

On 10 July 2015 at 13:32, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:

Forgot to mention: the data size is not that big - it's barely 10GB in all.

Kunal

On 10 July 2015 at 13:29, Kunal Gangakhedkar kgangakhed...@gmail.com wrote:

Hi,

I have a 2-node setup on Azure (East US region) running Ubuntu Server 14.04 LTS. Both nodes have 8GB RAM.

One of the nodes (the seed node) died with an OOM - so I am trying to add a replacement node with the same configuration.

The problem is that this new node also keeps dying with an OOM - I've restarted the cassandra service 8-10 times, hoping that it would finish the replication. But it didn't help. The one node that is still up is happily chugging along.

All nodes have a similar configuration - with libjna installed. Cassandra is installed from DataStax's debian repo - pkg dsc21, version 2.1.7.

I started off with the default configuration - i.e. the default cassandra-env.sh - which calculates the heap size automatically (1/4 * RAM = 2GB). But that didn't help. So I then tried to increase the heap to 4GB manually and restarted. It still keeps crashing.

Any clue as to why it's happening?

Thanks,
Kunal
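The "1/4 * RAM = 2GB" figure comes from cassandra-env.sh's heap auto-sizing. A sketch of that logic as I understand it from the 2.1-era script (roughly max(min(1/2 RAM, 1GB), min(1/4 RAM, 8GB)) - worth double-checking against the copy shipped with your package):

```python
def auto_max_heap_mb(system_memory_mb):
    """Approximation of cassandra-env.sh 2.1's MAX_HEAP_SIZE auto-sizing:
    max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))."""
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)

# The 8 GB and 14 GB instances from this thread:
for ram in (8192, 14336):
    print(f"{ram} MB RAM -> {auto_max_heap_mb(ram)} MB heap")
# → 8192 MB RAM -> 2048 MB heap
# → 14336 MB RAM -> 3584 MB heap
```

So on the 8GB node the default is a 2GB heap, matching the email; note that overriding MAX_HEAP_SIZE by hand (as tried here with 4GB and later 8GB) only delays the OOM if a single oversized partition still has to fit through compaction or streaming.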